mistranslation: 0.887
semantic: 0.872
graphic: 0.835
other: 0.833
assembly: 0.790
instruction: 0.790
device: 0.783
boot: 0.774
socket: 0.759
KVM: 0.757
vnc: 0.750
network: 0.748

KVM Guest pauses after upgrade to Ubuntu 20.04

As outlined here: https://bugs.launchpad.net/qemu/+bug/1813165/comments/15  

After the upgrade, all KVM guests start in a paused state. Even after forcing them off via virsh and restarting them, the guests remain paused.

These Guests are not nested.

A lot of diagnostic information is outlined in the previous bug report linked above. The solution mentioned in that report had allegedly been integrated into the downstream updates.



Hi tstrike,
thanks for the report.
I have slightly modified the description and changed the bug tasks accordingly for you.

I first checked the related known fixes from the old case that is linked,
just in case we might be missing one in the Ubuntu 20.04 that you are using.


Kernel:
=> https://marc.info/?l=kvm&m=155085391830663&w=2
Tested and verified https://bugs.launchpad.net/qemu/+bug/1813165/comments/13
This got upstream and is in:
$ git describe --contains ad7dc69aeb231
v5.0-rc8~1^2~2
We clearly have that in Focal, which is on 5.4.
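
The comparison made above (fix first contained in v5.0-rc8, Focal running 5.4) can be sketched as a small check. This is only an illustration: the function names are hypothetical, and it compares mainline version numbers only, so it says nothing about distro backports.

```python
import re
from typing import Tuple


def base_version(describe: str) -> Tuple[int, int]:
    """Extract (major, minor) from `git describe --contains` output.

    Example input: 'v5.0-rc8~1^2~2' (the tag that first contains the commit).
    """
    match = re.match(r"v(\d+)\.(\d+)", describe)
    if match is None:
        raise ValueError(f"unrecognized describe output: {describe!r}")
    return int(match.group(1)), int(match.group(2))


def fix_in_mainline_base(describe: str, running: Tuple[int, int]) -> bool:
    """True if the mainline release containing the fix is not newer than `running`."""
    return base_version(describe) <= running
```

For the values quoted in this comment, `fix_in_mainline_base("v5.0-rc8~1^2~2", (5, 4))` evaluates to true.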

qemu
https://git.qemu.org/?p=qemu.git;a=commit;h=9c1f8f4493e8355d0e48f7d1eebdf86893ba082d
Other fixes related to the topic are in qemu 2.8

On seabios disabling of SMM
- https://bugzilla.redhat.com/show_bug.cgi?id=1378006
- https://bugzilla.redhat.com/show_bug.cgi?id=1464654#c21
The following is from >=1.12.0-1 (was enabled by default before)
There is a small (for old qemu) and large binary (new qemu):
build/bios.bin:
# A stripped-down version of bios, to fit in 128Kb, for qemu <= 1.7
	$(call build-bios,bios,QEMU=y ROM_SIZE=128 PVSCSI=n BOOTSPLASH=n XEN=n USB_OHCI=n USB_XHCI=n USB_UAS=n SDCARD=n TCGBIOS=n MPT_SCSI=n NVME=n USE_SMM=n VGAHOOKS=n)
build/bios-256k.bin:
	$(call build-bios,bios,QEMU=y ROM_SIZE=256)

Note: if we run out of options we could try setting USE_SMM=n here, but let's check other details first.

But as already explained on the linked bug 1813165:
"If you're seeing "KVM internal error. Suberror: 1" it can be multiple things, not necessarily the same bug."

Copied here from the other bug about the system setup that is in use:

L0 DistroRelease: Ubuntu 20.04 on Kernel Linux 5.4.0-14-generic x86_64
L1 3 guests Windows 10, Centos 8
No L2s
No guests are enabled for UEFI Boot

libvirt: 6.0.0-0ubuntu4
qemu 1:4.2-3ubuntu1

Issue triggers without nesting (ensured via modprobe kvm_intel nested=)

@tstrike - can you trigger the same issue with all your guests?
You list Windows and Centos guests; does it trigger with Centos as well, or only with the Windows guests?
Also, if you have a chance (just to be sure), does it trigger with an Ubuntu guest as well? This would keep people who retry from using a case that doesn't even trigger in your setup.

@tstrike - you seem to hit this while starting your guest through libvirt.
Could you please attach your guest XML so that we can try to recreate this case?
  $ virsh dumpxml <guestname>
That will help when trying to recreate your case.

@tstrike
It would also be helpful to get your qemu commandline as well as any further messages qemu might have reported.
You'll find that in the per guest log file at:
 $ cat /var/log/libvirt/qemu/<guestname>.log

If you could report all that here that should be useful for everyone tracking this bug. 

@tstrike: finally for the sake of apparmor denials or any other odd error that might be mentioned in there attaching the output of `dmesg` on your host might be useful as well.

Christian,

Thanks for getting my report in the proper syntax. I would be extremely happy to follow through on the tasks you laid out to me. Give me about 3 hours and I will update the report with the items requested.













Thank you @tstrike:

In your logs I see a bunch of qemu warnings right at the beginning:
2020-02-12T15:09:37.773025Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-02-12T15:09:37.773107Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-02-12T15:09:37.774800Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-02-12T15:09:37.774821Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-02-12T15:09:38.024821Z qemu-system-x86_64: warning: Unknown firmware file in legacy mode: etc/msr_feature_control

And then a crash like:
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88
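
The feature warnings at the start of such a log follow a regular pattern, so they can be collected into a de-duplicated list for comparison between hosts. The following is a minimal sketch; the function name is hypothetical and the regular expression only covers the two warning shapes that appear in this thread (MSR(...) and CPUID...).

```python
import re
from typing import List, Tuple

# Matches e.g. "MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]"
# and "CPUID.01H:ECX.sse4.1 [bit 19]".
_WARNING = re.compile(
    r"host doesn't support requested feature: "
    r"(?P<source>MSR\([0-9A-F]+H\)|CPUID\.[0-9A-F]+H:[A-Z]+)"
    r"\.(?P<feature>\S+) \[bit (?P<bit>\d+)\]"
)


def parse_feature_warnings(log_text: str) -> List[Tuple[str, str, int]]:
    """Return unique (source, feature, bit) tuples in order of first appearance."""
    seen = {}
    for match in _WARNING.finditer(log_text):
        key = (match["source"], match["feature"], int(match["bit"]))
        seen.setdefault(key, None)  # dicts keep insertion order
    return list(seen)
```

Feeding it the contents of /var/log/libvirt/qemu/<guestname>.log gives one entry per missing feature, even when the warning is repeated in the log.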

Reproduced via attempt to install KVM Guest F31 Server on Ubuntu 20.04 (bare metal) 

I tried with a guest XML matching yours (other than disk setup).
I didn't get those errors you reported even when using your config.


Notable differences to my default - your guest has:
- a rather old chip type (Penryn is a 2007 chip)
- a rather old machine type (uses xenial which matches ~pc-i440fx-2.5)
This is probably based on when the system was created.
But you also have the same issue on the Windows guest, which has the modern:
  <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
  <cpu mode='host-model' check='partial'/>
So this isn't a route we need to go down...


Note: I tried this on kernel 5.4.0-14-generic with some common not too old & not too new chips
- Intel(R) Xeon(R) CPU E5-2620
- AMD Opteron(tm) Processor 4226

Then I remembered that you followed the advice to disable nesting, and the vmx-* entries you see in the warnings could be related after all.

I ran this and restarted my guests:
# sudo rmmod kvm_intel
# sudo modprobe kvm_intel nested=0
or
# sudo rmmod kvm_amd
# sudo modprobe kvm_amd nested=0

Even then I didn't get the same warnings or crashes you got.
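
To confirm that the module reload above really left nesting off, the current value can be read back from sysfs. A minimal sketch, assuming the usual /sys/module/<module>/parameters/nested file; the helper names are hypothetical, and the value is reported as Y/N or 1/0 depending on the kernel.

```python
from pathlib import Path


def parse_nested(value: str) -> bool:
    """Interpret the content of the 'nested' module parameter file."""
    return value.strip() in ("Y", "y", "1")


def nested_enabled(module: str = "kvm_intel") -> bool:
    """Read the parameter for kvm_intel (or kvm_amd) on the running host."""
    path = Path("/sys/module") / module / "parameters" / "nested"
    return parse_nested(path.read_text())
```

On an AMD host the call would be `nested_enabled("kvm_amd")`.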

FYI: maybe related (similar symptom - which could be anything as we know, but still worth a link): https://bugzilla.redhat.com/show_bug.cgi?id=1718584

Thanks Boris for chiming in!
Maybe it is something in the guest (or the way virt-manager sets things up) after all - will install an F31 via virt-manager as well ...

I've got the same issue starting guest via virt-install even with serial console.

I fetched
https://download.fedoraproject.org/pub/fedora/linux/releases/31/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-31-1.9.iso

And installed it on Ubuntu 20.04 via virt-manager (keeping all things on its default).

- New
- Local Media
- select ISO (Autodetects F31)
- Forward, Forward, Forward, Finish

No warnings:
sudo cat /var/log/libvirt/qemu/fedora31.log  | grep warning
<empty>

And it just works, the installer is on the graphical UI and waiting for me.

@Boris - in your log I've seen that you also got the Penryn cpu which I find odd.
"-cpu Penryn,vme=on,vmx=on,x2apic=on,tsc-deadline=on,xsave=on,hypervisor=on,arat=on,tsc-adjust=on,arch-capabilities=on,skip-l1dfl-vmentry=on \"
Assuming you also only used default I wonder how it got to that, maybe the reason for that is the same reason that eventually triggers the error.
But virt-manager/libvirt would usually just do a best-fit (for me Haswell-noTSX-IBRS).


@Boris and @tstrike Could you both please report:
$ virsh capabilities
$ virsh domcapabilities
$ sudo qemu-system-x86_64 --enable-kvm --nographic --nodefaults -S -qmp-pretty stdio
{"execute":"qmp_capabilities"}
{"execute":"query-cpu-definitions"}
Note: the qemu command will seem to hang because you are then on the QMP prompt; just enter the two JSON commands above one by one. The output will include qemu's explanation of why a given cpu model is usable or not.
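
For anyone who prefers to post-process the QMP reply rather than read it by eye, here is a minimal sketch. It assumes that each entry of the query-cpu-definitions reply carries a "name" and, where applicable, an "unavailable-features" list; the function names are hypothetical.

```python
import json
from typing import Dict, List


def qmp_commands() -> List[str]:
    """The two commands from the comment above, serialized as JSON lines."""
    return [
        json.dumps({"execute": "qmp_capabilities"}),
        json.dumps({"execute": "query-cpu-definitions"}),
    ]


def unusable_models(reply_text: str) -> Dict[str, List[str]]:
    """Map each CPU model name to the features the host cannot provide for it.

    Models with an empty or missing "unavailable-features" list are skipped.
    """
    reply = json.loads(reply_text)
    result = {}
    for model in reply.get("return", []):
        missing = model.get("unavailable-features", [])
        if missing:
            result[model["name"]] = missing
    return result
```

Pasting the JSON printed by qemu after the second command into `unusable_models` yields a compact list of models that are not fully usable on the host.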





This particular command seems to hang on:

qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]

I tried to execute (thinking I was in a shell):

{"execute":"qmp_capabilities"}
{"execute":"query-cpu-definitions"}


I might have misinterpreted what you wanted me to do.

Thanks a lot. I will do it at my earliest convenience tonight. The Haswell i4770 is installed in a small server with 32 GB. My department's policy doesn't allow me to test any Ubuntu release on bare metal there. I could only test on a box with an outdated CPU, and that seems to be the core reason.

Done on Penryn's box

Thanks Boris!

@tstrike - is your system also "really an old penryn" or is it something newer?
Maybe share /proc/cpuinfo?

tstrike39@islandhealthcenter-media:~$ sudo cat  /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
stepping	: 10
microcode	: 0xa0b
cpu MHz		: 2416.548
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 5303.23
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
stepping	: 10
microcode	: 0xa0b
cpu MHz		: 2010.620
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 5303.23
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
stepping	: 10
microcode	: 0xa0b
cpu MHz		: 2419.534
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 5303.23
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
stepping	: 10
microcode	: 0xa0b
cpu MHz		: 1988.790
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 5303.23
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:



It detects host as Penryn as well for @tstrike.
Which is fine if it is a chip of around that era.
He reported to have an "Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz"
And for that chip the detection and chip used might be correct.
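
To compare hosts quickly, the relevant fields can be pulled out of /proc/cpuinfo. The sketch below uses hypothetical helper names; the note that family 6 / model 23 is the 45nm Core 2 (Penryn) generation is my own assumption and does not come from this thread.

```python
from typing import Dict, List


def parse_cpuinfo(text: str) -> List[Dict[str, str]]:
    """Split /proc/cpuinfo into one dict per logical processor."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():  # a blank line ends one processor block
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def summarize(cpu: Dict[str, str]) -> Dict[str, object]:
    """Keep only what matters for this bug: family, model, name, VMX flag."""
    return {
        "family": int(cpu["cpu family"]),
        "model": int(cpu["model"]),  # assumption: (6, 23) is Penryn-class
        "name": cpu["model name"],
        "vmx": "vmx" in cpu["flags"].split(),
    }
```

Running `summarize(parse_cpuinfo(open("/proc/cpuinfo").read())[0])` on an affected and an unaffected host gives two small records that are easy to paste into a report.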

So to summarize: all of my attempts to reproduce fail, but on Penryn-era chips 2/2 cases trigger the bug.

I wonder whether anyone who wants to reproduce this needs a system with such a chip to trigger it.

There should be plenty of people on CC as this is mirrored to qemu-devel due to the upstream qemu task. Is there a microarchitectural x86 specialist who knows if the chips of that generation have some special issues in regard to VMX that might explain what we see here?

It would be great if everyone could ask around for more systems with chips of that era.
Maybe we can further bisect which work and which will fail.

I tried launching a focal vm on a focal host, and the vm launched but is in a paused state.

Attached is its log.

This is on an old E660 intel core system.

/proc/cpuinfo


I also have these two apparmor denied messages in dmesg:
[ 1380.529549] audit: type=1400 audit(1584023445.093:139): apparmor="DENIED" operation="open" profile="libvirt-aa346a1d-8caa-4c55-bef9-c3acbe17bdac" name="/" pid=19712 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0
[ 1380.529856] audit: type=1400 audit(1584023445.093:140): apparmor="DENIED" operation="open" profile="libvirt-aa346a1d-8caa-4c55-bef9-c3acbe17bdac" name="/sys/bus/nd/devices/" pid=19712 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0
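
Denial lines like the two above are easier to compare once split into fields. A minimal sketch with a hypothetical function name; it only handles the key=value and key="value" pairs seen in these lines.

```python
import re
from typing import Dict

# key=value or key="quoted value"; tokens without '=' are ignored.
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_line(line: str) -> Dict[str, str]:
    """Turn one kernel audit line into a dict of its fields."""
    return {key: value.strip('"') for key, value in _FIELD.findall(line)}
```

For the second line above this returns, among other fields, the denied path under "name" and the libvirt profile under "profile".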


And one last bit of info, this system booted with mitigations=off.

virsh capabilities

virsh domcapabilities

AppArmor is completely disabled on my server.

After changing cpu to <cpu mode='host-model'/>:


I got this log (still in a paused state):
char device redirected to /dev/pts/3 (label charserial0)
2020-03-12T15:06:22.560159Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-vnmi-pending [bit 22]
2020-03-12T15:06:22.560708Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-secondary-ctls [bit 31]
2020-03-12T15:06:22.560971Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48BH).vmx-apicv-xapic [bit 0]
2020-03-12T15:06:22.561208Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48DH).vmx-vnmi [bit 5]
2020-03-12T15:06:22.561392Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(480H).vmx-ins-outs [bit 54]
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88


By using host-model Andreas also was able to get the same signature:

2020-03-12T15:06:22.560159Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-vnmi-pending [bit 22]
2020-03-12T15:06:22.560708Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-secondary-ctls [bit 31]
2020-03-12T15:06:22.560971Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48BH).vmx-apicv-xapic [bit 0]
2020-03-12T15:06:22.561208Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48DH).vmx-vnmi [bit 5]
2020-03-12T15:06:22.561392Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(480H).vmx-ins-outs [bit 54]
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88

So the warnings seem to depend a bit on which chip type we try to present to the guest.
We can ignore them for now.

What stays is the emulation error on this kind of chip.
I'll try to write up some tests to check different qemu and kernel levels to further corner what we are looking at.

Penryn's architecture confirmed


Seems to work fine on i4790 (Haswell) box.

Yeah @Boris - it really seems to be an issue bound to the Merom/Penryn processor generation.

I asked Andreas to check through some kernels and qemu versions so that we maybe eventually can consider bisecting something. But that will take a bit of time.

Of course everyone able to spend some time can consider checking a few kernels of [1] as well (probably the easiest test to begin with).

Still, if there is an x86 microarchitecture expert out there who says "ah penryn, I know we added/dropped ... ", please speak up :-)

[1]: https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D

The vmx things make me wonder about a fix Paolo did a while ago for enabling individual vmx features rather than vmx as a whole; but I can't remember if that was a kernel or qemu fix.

One thing I notice, which may be a red herring: the machine code at the faulting address in all of the errors is 'f3 0f 1e fb', which is the new 'endbr32' security instruction. But that's really a rep nop, which I thought old processors could handle anyway??
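
The observation above can be checked mechanically: in the "Code=" line of the dump, the byte in angle brackets marks the instruction that failed to emulate. The sketch below extracts the bytes starting there and compares them with the ENDBR32 encoding (f3 0f 1e fb) quoted above; the ENDBR64 constant (f3 0f 1e fa) is added from my own knowledge of the encoding, and the function name is hypothetical.

```python
ENDBR32 = bytes.fromhex("f30f1efb")
ENDBR64 = bytes.fromhex("f30f1efa")


def faulting_bytes(code_line: str, count: int = 4) -> bytes:
    """Return `count` bytes starting at the <xx> marker of a 'Code=' line."""
    tokens = code_line.split("=", 1)[1].split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    raw = [tok.strip("<>") for tok in tokens[start:start + count]]
    return bytes(int(byte, 16) for byte in raw)
```

Applied to any of the "Code=" lines in this thread, `faulting_bytes(line) == ENDBR32` holds.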

Andreas was so kind as to try kernels 4.4, 4.15 and 5.6; all fail (with qemu 4.2).
He then tried Eoan (qemu 4.0) and Focal (qemu 4.2).
4.0 worked and 4.2 failed.

We will set up a bisect run on Monday and hopefully find the offending change.

@David - I agree that the messages might be a red-herring, but to be sure was that fix between 4.0 and 4.2?

I think the one I was thinking of is 0723cc8a5558c94388db75ae1f4991314914edd3  which is in a 4.2.0 rc
and there was 2605188240f939fa9ae9353f53a0985620b34769  - but that's a different crash to what you have.

So hmm.


Thanks David!

While bisecting on upstream git with just "-cpu Penryn" we have seen that it always works there.
So it might be an interaction with some Ubuntu build/packaging/configure detail together with these old chips.

While we still can't be sure if the VMX warnings are a red-herring chances are that only "-cpu Penryn,vmx=on" will trigger the issue - Andreas will test and bisect with that once he is back online - we will see if that is any different.

I'll also build a Ubuntu'esque 4.2 with the Penryn changes of [1] reverted just to complete the interim picture of our testing. That is available for testing at [2]. Further, I added an Ubuntu build with rather crude reverts of almost all VMX-related 4.2 changes, available at [3].

[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=0723cc8a5558c94388db75ae1f4991314914edd3
[2]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1866870-qemu-penryn-crash
[3]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1866870-qemu-penryn-crash-fullreverts

No luck when testing [2]. Reports are attached

Log file for f31wks guest

Verification new packages to be installed



The package from the PPA failed the same way for me:

ubuntu@f1:~$ qemu-system-x86_64 --enable-kvm -cpu Penryn,vmx=on -m 512 --nodefaults --nographic
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.sse4.1 [bit 19]
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88
^Cqemu-system-x86_64: terminating on signal 2

ubuntu@f1:~$ qemu-system-x86_64 --help 2>&1|head -n 1
QEMU emulator version 4.2.0 (Debian 1:4.2-3ubuntu3~ppa1)
ubuntu@f1:~$ 


Also crashed with the packages from the other ppa:

ubuntu@f1:~$ qemu-system-x86_64 --help 2>&1|head -n 1
QEMU emulator version 4.2.0 (Debian 1:4.2-3ubuntu3~exp1)

ubuntu@f1:~$ qemu-system-x86_64 --enable-kvm -cpu Penryn,vmx=on -m 512 --nodefaults --nographic
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.sse4.1 [bit 19]
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88


ubuntu@f1:~$ apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1:4.2-3ubuntu3~exp1
  Candidate: 1:4.2-3ubuntu3~exp1
  Version table:
 *** 1:4.2-3ubuntu3~exp1 500
        500 http://ppa.launchpad.net/paelzer/bug-1866870-qemu-penryn-crash-fullreverts/ubuntu focal/main amd64 Packages
        100 /var/lib/dpkg/status
     1:4.2-3ubuntu2 500
        500 http://br.archive.ubuntu.com/ubuntu focal/main amd64 Packages


Ok, upstream tag v4.2.0 and these configure options reproduced the crash:

export LDFLAGS="-Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g  -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,--as-needed"
export CFLAGS="-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g"

Full configure output: https://paste.ubuntu.com/p/Tzq6pDWD9R/

$ ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -cpu Penryn,vmx=on -m 512 --nodefaults --nographic
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.sse4.1 [bit 19]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-vnmi-pending [bit 22]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48EH).vmx-secondary-ctls [bit 31]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48BH).vmx-apicv-xapic [bit 0]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48BH).vmx-wbinvd-exit [bit 6]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48DH).vmx-vnmi [bit 5]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
qemu-system-x86_64: warning: host doesn't support requested feature: MSR(480H).vmx-ins-outs [bit 54]
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=000086d4 EDX=00000000
ESI=00000000 EDI=00000000 EBP=000086d4 ESP=00006d7c
EIP=00007acf EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =f000 000f0000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     000f6200 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 <f3> 0f 1e fb 66 57 66 56 66 53 66 53 66 89 c7 67 66 89 14 24 66 89 ce 66 e8 15 f8 ff ff 88
^Cqemu-system-x86_64: terminating on signal 2


I've tested some more combinations and found that I can have v4.2 work on focal.
Eventually I realized that when I install and start the qemu from Ubuntu, not only that one but also the formerly working build of v4.2.0 from git starts to fail (without rebuilding).

A bit of package bisect later I found seabios to be related.
Focal is at 1.13.0-1
Eoan is at 1.12.0-1

After I knew that I verified and found it really only triggers on seabios 1.13.0.

With 1.13 I was also able to break the qemu v4.0.0 git build on eoan.
As well as the packaged qemu in Eoan.

So it seems we are actually looking for a problem of seabios (instead of qemu) with the Penryn chip.
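
The register dumps posted earlier in this bug are consistent with that: CR0 has the PE bit clear (real mode) and CS base plus EIP lands in the F-segment. A minimal sketch of that arithmetic follows; the function name is hypothetical, and treating 0xF0000-0xFFFFF as the legacy BIOS area is my assumption about the usual PC memory map.

```python
import re
from typing import Dict

LEGACY_BIOS = range(0xF0000, 0x100000)  # assumption: classic F-segment


def fault_location(dump: str) -> Dict[str, object]:
    """Compute where the guest was executing from a qemu register dump."""
    eip = int(re.search(r"EIP=([0-9a-f]{8})", dump).group(1), 16)
    # Segment lines read: selector, base, limit, flags.
    cs_base = int(re.search(r"CS =[0-9a-f]{4} ([0-9a-f]{8})", dump).group(1), 16)
    cr0 = int(re.search(r"CR0=([0-9a-f]{8})", dump).group(1), 16)
    linear = cs_base + eip
    return {
        "linear": linear,
        "protected_mode": bool(cr0 & 1),  # CR0.PE is bit 0
        "in_legacy_bios": linear in LEGACY_BIOS,
    }
```

With EIP=00007acf, a CS base of 000f0000 and CR0=00000010, the linear address works out to 0xf7acf with protected mode off.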

I'll look at their changelog and bisect that tomorrow as time permits

I wanted to make sure why different qemu configs make it trigger or not, and after finding seabios to be related the candidates were obvious.

Default config gets us:
BIOS directory    /usr/local/share/qemu

The long conf had:
--firmwarepath=/usr/share/qemu:/usr/share/seabios:/usr/lib/ipxe/qemu

Adding that to the short config which had most things disabled made it break as well.
Since it has far fewer moving parts with most other features disabled, I'll continue to use that.


With that confirmed I checked if I can just point to a bios to break it, and indeed adding 
  -bios /root/seabios_1.12.0-1/usr/share/seabios/bios.bin
  -bios /root/seabios_1.13.0-1/usr/share/seabios/bios.bin
respectively is a make or break change.
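
When many BIOS images need to be tried this way, the command lines can be generated rather than typed. The sketch below reuses the qemu options already shown in this bug and only adds -bios; the helper names are hypothetical, and actually launching qemu (with a timeout, since a crashed guest stays paused until interrupted) is left out.

```python
from typing import List

# Options taken from the reproducer used earlier in this bug.
BASE_ARGV = [
    "qemu-system-x86_64", "--enable-kvm", "-cpu", "Penryn,vmx=on",
    "-m", "512", "--nodefaults", "--nographic",
]


def argv_for_bios(bios_path: str) -> List[str]:
    """Build the qemu command line that boots one specific BIOS image."""
    return BASE_ARGV + ["-bios", bios_path]


def crashed(output: str) -> bool:
    """Classify captured qemu output by the error signature from this bug."""
    return "KVM internal error" in output
```

Each argv list can be handed to subprocess with stderr captured, and `crashed` applied to whatever was printed before the timeout.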



As a next step I reproduced the error with seabios rel-1.13.0 from https://review.coreboot.org/seabios.git.
It crashes as well.

But to make this puzzle even more interesting rel-1.12.0 from the same git crashes as well.
I wonder where this trip might end, from qemu to seabios to ... compiler?

Turns out 1.12 is a fairly old build, and the working version in Ubuntu was built in Disco, about a year ago.
=> https://launchpad.net/ubuntu/+source/seabios/1.12.0-1/+build/16284605

Therefore I built it in Eoan and even Disco.
As an overview:
Disco: gcc 4:8.3.0-1ubuntu3   binutils 2.32-7ubuntu4
Eoan:  gcc 4:9.2.1-3.1ubuntu1 binutils 2.33-2ubuntu1.2
Focal: gcc 4:9.2.1-3.1ubuntu1 binutils 2.34-4ubuntu1

I ended up with these binaries to test:
./git-built-in-eoan/rel-1.12/bios.bin Breaks
./git-built-in-eoan/rel-1.13/bios.bin Breaks
./git-built-in-disco/rel-1.13/bios.bin Works
./git-built-in-disco/rel-1.12/bios.bin Works
./git-built-in-focal/head/bios.bin Breaks
./git-built-in-focal/rel-1.12/bios.bin Breaks
./git-built-in-focal/rel-1.13/bios.bin Breaks
./packaging/disco-seabios_1.12.0-1/bios.bin Works
./packaging/focal-seabios_1.13.0-1/bios.bin Breaks

To summarize:
- qemu breaks on chips of the Penryn generation
- it only breaks if the seabios bios is executed
- does not really depend on seabios or qemu version
- but it depends on seabios build environment
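
The pattern behind that summary can be made explicit with a small sketch. It is only an illustration: the rows are the git-built binaries from the list above, and the helper name `outcomes_by` is made up for this example.

```python
from collections import defaultdict

# (build environment, seabios version, outcome), copied from the git-built
# binaries in the test list above.
RESULTS = [
    ("eoan", "rel-1.12", "Breaks"),
    ("eoan", "rel-1.13", "Breaks"),
    ("disco", "rel-1.13", "Works"),
    ("disco", "rel-1.12", "Works"),
    ("focal", "head", "Breaks"),
    ("focal", "rel-1.12", "Breaks"),
    ("focal", "rel-1.13", "Breaks"),
]


def outcomes_by(results, index):
    """Group the set of observed outcomes by the column at `index`."""
    grouped = defaultdict(set)
    for row in results:
        grouped[row[index]].add(row[2])
    return dict(grouped)


by_env = outcomes_by(RESULTS, 0)      # keyed by build environment
by_version = outcomes_by(RESULTS, 1)  # keyed by seabios version

print(by_env)
print(by_version)
```

Grouped by build environment, every environment has exactly one outcome. Grouped by version, rel-1.12 and rel-1.13 each show both outcomes, which is why the version cannot be the deciding factor.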

That's getting more fun :-)
You could look at whether seabios's config works out the same in the two environments, or whether something makes use of new build flags. Try looking at the gcc lines that are invoked in the good/bad cases and see if one is passing any options that the other doesn't.
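
A minimal sketch of such a comparison, assuming the two gcc invocations have been captured as plain strings from the build output. The function name `flag_diff` and both command lines are hypothetical, not taken from a real seabios build log.

```python
import shlex


def flag_diff(good_cmd, bad_cmd):
    """Return (only_in_good, only_in_bad) for two compiler command lines."""
    good = set(shlex.split(good_cmd)[1:])  # drop the compiler name itself
    bad = set(shlex.split(bad_cmd)[1:])
    return sorted(good - bad), sorted(bad - good)


# Hypothetical command lines for illustration only.
good = "gcc -m32 -Os -fno-pie -c src/a.c -o out/a.o"
bad = "gcc -m32 -Os -fno-pie -Wno-address-of-packed-member -c src/a.c -o out/a.o"

only_good, only_bad = flag_diff(good, bad)
print("only in good build:", only_good)
print("only in bad build: ", only_bad)
```

The set-based comparison ignores option order and repeated options, which is usually fine for a first look. Note that a changed default inside the compiler itself would not show up in such a diff, because nothing is passed on the command line for it.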


Starting from the Disco build env that I had, I upgraded the packages step by step:

Step #1 binutils:
Unpacking binutils-x86-64-linux-gnu (2.33-2ubuntu1.2) over (2.32-7ubuntu4) ...
Unpacking libbinutils:amd64 (2.33-2ubuntu1.2) over (2.32-7ubuntu4) ...
Unpacking binutils (2.33-2ubuntu1.2) over (2.32-7ubuntu4) ...
Unpacking binutils-common:amd64 (2.33-2ubuntu1.2) over (2.32-7ubuntu4) ...

=> Still working

Step #2 gcc:
Unpacking libubsan1:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libtsan0:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking gcc-9-base:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libstdc++6:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libquadmath0:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking liblsan0:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libitm1:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libgomp1:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libcc1-0:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libatomic1:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libasan5:amd64 (9.2.1-9ubuntu2) over (9.1.0-2ubuntu2~19.04) ...
Unpacking libgcc1:amd64 (1:9.2.1-9ubuntu2) over (1:9.1.0-2ubuntu2~19.04) ...
Unpacking libisl21:amd64 (0.21-2) ...
Unpacking cpp-9 (9.2.1-9ubuntu2) ...
Unpacking libgcc-9-dev:amd64 (9.2.1-9ubuntu2) ...
Unpacking gcc-9 (9.2.1-9ubuntu2) ...
Unpacking libstdc++-9-dev:amd64 (9.2.1-9ubuntu2) ...
Unpacking g++-9 (9.2.1-9ubuntu2) ...
Unpacking g++ (4:9.2.1-3.1ubuntu1) over (4:8.3.0-1ubuntu3) ...
Unpacking gcc (4:9.2.1-3.1ubuntu1) over (4:8.3.0-1ubuntu3) ...
Unpacking cpp (4:9.2.1-3.1ubuntu1) over (4:8.3.0-1ubuntu3) ...

=> now it is breaking

One thing that we have seen to cause breakage in other cases was the new default to enable:
  -fcf-protection

The code already carries quite a bunch of similar "no" rules:
COMMONCFLAGS += $(call cc-option,$(CC),-nopie,)
COMMONCFLAGS += $(call cc-option,$(CC),-fno-pie,)
COMMONCFLAGS += $(call cc-option,$(CC),-fno-stack-protector,)
COMMONCFLAGS += $(call cc-option,$(CC),-fno-stack-protector-all,)
COMMONCFLAGS += $(call cc-option,$(CC),-fstack-check=no,)
COMMONCFLAGS += $(call cc-option,$(CC),-Wno-address-of-packed-member,)

Let's add to that:
COMMONCFLAGS += $(call cc-option,$(CC),-fcf-protection=none,)

=> That made it work \o/ !
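
For checking by hand whether a given compiler accepts the new flag, here is a rough sketch in the spirit of the cc-option probes above. It is not the seabios Makefile logic itself; the helper name `cc_supports` is made up, and it assumes a POSIX system with /dev/null.

```python
import shutil
import subprocess


def cc_supports(flag, cc="gcc"):
    """Return True if `cc` compiles an empty C input with `flag` without error."""
    if shutil.which(cc) is None:
        return False  # no such compiler on PATH
    result = subprocess.run(
        [cc, flag, "-S", "-x", "c", "/dev/null", "-o", "/dev/null"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


print(cc_supports("-fcf-protection=none"))
```

A compiler that does not know the option exits with an error, so the probe returns False and the flag would simply be left out, which matches the "when available" wording of the fix.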

I *think* it's the cf-protection that's adding the endbr32 instructions that I spotted as being the failing instruction each time; but I don't understand why they would be CPU-type specific.
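
To double-check that observation against a register dump, a small sketch can decode the "Code=" line. It assumes the byte in angle brackets marks the current instruction pointer, and relies on the known encodings f3 0f 1e fb (endbr32) and f3 0f 1e fa (endbr64). The helper name `faulting_instruction` is made up; the sample is the start of the Code= line quoted earlier in this report.

```python
# Known encodings of the CET end-branch instructions.
ENDBR = {
    bytes.fromhex("f30f1efb"): "endbr32",
    bytes.fromhex("f30f1efa"): "endbr64",
}


def faulting_instruction(code_line):
    """Return 'endbr32'/'endbr64' if the marked byte starts one, else None."""
    tokens = code_line.split("=", 1)[1].split()
    # Index of the byte wrapped in <..>, assumed to be the current EIP.
    marked = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    data = bytes(int(t.strip("<>"), 16) for t in tokens)
    return ENDBR.get(data[marked:marked + 4])


sample = ("Code=b8 90 d9 00 00 66 e8 6b f7 ff ff 66 b8 0a 00 00 00 e9 61 f2 "
          "<f3> 0f 1e fb 66 57 66 56")
print(faulting_instruction(sample))  # endbr32
```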

Sent to seabios for their consideration:
=> https://<email address hidden>/thread/IXAWMA2HWW75LSR3NBBYQKWT3TI5WVVP/

I deleted the experimental PPAs that we had and created a new one with a new seabios:
=> https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3982

An MP is open to fix this in Focal:
=> https://code.launchpad.net/~paelzer/ubuntu/+source/seabios/+git/seabios/+merge/380881

This bug was fixed in the package seabios - 1.13.0-1ubuntu1

---------------
seabios (1.13.0-1ubuntu1) focal; urgency=medium

  * d/p/lp-1866870-build-use-fcf-protection-none-when-available.patch
    fix breakage on older chips due to fcf-protection (LP: #1866870)

 -- Christian Ehrhardt <email address hidden>  Thu, 19 Mar 2020 13:10:10 +0100

Can I just apt update && apt upgrade to get this fix or do I need to patch?

apt-get will be enough once it's published, and looks like it just was published.

Works for me. F31 KVM guest is installing on Q9550 box.

I can confirm that this bug has been fixed (zapped). Thank you all for your hard work and determination. A job well done indeed! As a former programmer I love you all's zeal for attacking this bug.

As a side note, knowing what you all go through, I always look things up, walk through at least level 1 stuff and provide logfiles. I hope what little I did helped you all resolve this.

Again my thanks, and I believe this is where we say, "Please close the bug marked SOLVED".

Thank you Boris and Tstrike for the report and your help.
It was a great bug to identify and fix before the release of 20.04. I appreciate you using (and thereby testing) it ahead of time!

Hello!
Unfortunately the bug has apparently reappeared. I have a Windows 10 running in a VM, which after my today's "apt upgrade" goes into pause mode after a few seconds of running time.

Tail output of my /var/log/libvirt/qemu/win10.log
char device redirected to /dev/pts/1 (label charserial0)
2020-05-05T08:53:23.733051Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-05-05T08:53:23.733122Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-05-05T08:53:23.736093Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-05-05T08:53:23.736110Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-05-05T08:54:04.912098Z qemu-system-x86_64: terminating on signal 15 from pid 1593 (/usr/sbin/libvirtd)
2020-05-05 08:54:05.112+0000: shutting down, reason=destroyed


Apt upgraded packages (from /var/log/apt/history.log):
Start-Date: 2020-05-05  09:32:02
Commandline: apt upgrade
Requested-By: andreas (1000)
Install: linux-image-5.4.0-29-generic:amd64 (5.4.0-29.33, automatic), linux-modules-extra-5.4.0-29-generic:amd64 (5.4.0-29.33, automatic), linux-headers-5.4.0-29-generic:amd64 (5.4.0-29.33, automatic), linux-modules-5.4.0-29-generic:amd64 (5.4.0-29.33, automatic), linux-headers-5.4.0-29:amd64 (5.4.0-29.33, automatic)
Upgrade: linux-headers-generic:amd64 (5.4.0.28.33, 5.4.0.29.34), linux-libc-dev:amd64 (5.4.0-28.32, 5.4.0-29.33), linux-image-generic:amd64 (5.4.0.28.33, 5.4.0.29.34), linux-generic:amd64 (5.4.0.28.33, 5.4.0.29.34)
End-Date: 2020-05-05  09:33:11


Kind regards,
   Andreas

Hi Andreas,
so the only upgrade you did to trigger this for you was to bump the kernel from 5.4.0-28.32 to 5.4.0-29.33 - nothing else? I have not (yet?) heard other similar reports, but it might be just too early?
At least on my system for now things still work with the new kernel like before.

I'd recommend filing a new bug, refer to this one as maybe being related and adding the following right away:
- kernel version (you have this here I know)
- qemu/libvirt/seabios/ovmf version (if you don't mind just attach `dpkg -l`)
- guest XML (if using libvirt) otherwise the qemu command-line
- add a cross check and report what happens with other guest configs (e.g. non-Windows guests,
  another BIOS since the former issue was tied to seabios, different guest CPU types)
- the full /var/log/apt/history.log
- a date when you last started the VM successfully (not just still-had-it-running, but started it) 
  and the date when it started to fail (probably yesterday then I guess)

Hi Christian.

Just filed bug: #1877052

Same issue here. I've upgraded my IBM Power ppc64le system to Ubuntu 20.04. Now I'm trying to create KVM VMs, and whatever I do, the VM is created but falls into "paused" mode before any installation step starts. When trying to resume it, I get:
"
Error unpausing domain: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required
"

Any workaround? Do I need to reinstall Ubuntu 18.04?

Fabrice: That's probably a different error given that this lp seems to be with x86 vmx flags.
Check your /var/log/libvirt/qemu/ on the host to see if there's a particular error shown in the destination qemu after migration.

David: Indeed! How stupid I am. I missed the root cause inside the QEMU log file. This was clear enough...
error: kvm run failed Device or resource busy
This is probably because your SMT is enabled.

So I switched SMT (Power Simultaneous Multi-Threading) off and now it's OK; VMs are running and installing.

It had been years since I last touched KVM on Power and I lost my reflexes.
So please forget my previous comment saying I had the same issue on the Power platform. The symptoms were similar, but it was not the same problem.