1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
|
register: 0.843
user-level: 0.819
graphic: 0.799
mistranslation: 0.796
peripherals: 0.796
performance: 0.793
debug: 0.791
network: 0.787
KVM: 0.786
hypervisor: 0.786
device: 0.781
virtual: 0.778
boot: 0.778
arm: 0.774
socket: 0.773
risc-v: 0.773
TCG: 0.772
assembly: 0.766
permissions: 0.764
architecture: 0.762
semantic: 0.761
x86: 0.759
VMM: 0.756
PID: 0.754
kernel: 0.738
ppc: 0.732
files: 0.722
vnc: 0.721
i386: 0.688
qemu 3.1/i386 crashes/guest hangs when MTTCG is enabled
When MTTCG is enabled, QEMU 3.1.0 sometimes crashes when running the following command line:
qemu-system-i386 -kernel /home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/bootstrap -append bootstrap -initrd "/home/jermar/work/software/l4/fiasco/.build-i386/fiasco -serial_esc,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/sigma0 ,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/moe rom/ahci.cfg,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/ned ,test_env.lua ,/home/jermar/Kernkonzept/software/l4/pkg/ahci-driver/examples/md5sum/ahci.cfg ,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/l4re ,/home/jermar/Kernkonzept/software/l4/pkg/ahci-driver/examples/md5sum/ahci.io ,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/io ,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/ahci-drv ,/home/jermar/Kernkonzept/software/l4/.build-i386/bin/x86_gen/l4f/ahci-md5-sync" -smp 4 -accel tcg,thread=multi -device ahci,id=ahci0 -drive if=none,file=/home/jermar/Kernkonzept/software/l4/.build-i386/pkg/ahci-driver/test/examples/test_ahci.img,format=raw,id=drive-sata0-0-0 -device ide-drive,bus=ahci0.0,drive=drive-sata0-0-0,id=sata0-0-0 -serial stdio -nographic -monitor none
The host is x86_64.
The stack at the time of the crash (core dump and debug binary linked below[1]):
Core was generated by `qemu-system-i386 -kernel /home/jermar/Kernkonzept/software/l4/.build-i386/bin/x'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 io_writex (env=env@entry=0x565355ca0140, iotlbentry=iotlbentry@entry=0x565355ca9120, mmu_idx=2, val=val@entry=0, addr=addr@entry=3938451632, retaddr=retaddr@entry=140487132809203, recheck=false, size=4)
at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/cputlb.c:791
791 if (mr->global_locking && !qemu_mutex_iothread_locked()) {
[Current thread is 1 (Thread 0x7fc5af7fe700 (LWP 3625719))]
Missing separate debuginfos, use: dnf debuginfo-install SDL2-2.0.9-1.fc29.x86_64 at-spi2-atk-2.30.0-1.fc29.x86_64 at-spi2-core-2.30.0-2.fc29.x86_64 atk-2.30.0-1.fc29.x86_64 bzip2-libs-1.0.6-28.fc29.x86_64 cairo4
(gdb) bt
#0 0x0000565354f5f365 in io_writex
(env=env@entry=0x565355ca0140, iotlbentry=iotlbentry@entry=0x565355ca9120, mmu_idx=2, val=val@entry=0, addr=addr@entry=3938451632, retaddr=retaddr@entry=140487132809203, recheck=false, size=4)
at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/cputlb.c:791
#1 0x0000565354f621b2 in io_writel (recheck=<optimized out>, retaddr=140487132809203, addr=3938451632, val=0, index=0, mmu_idx=2, env=0x565355ca0140)
at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/softmmu_template.h:310
#2 0x0000565354f621b2 in helper_le_stl_mmu (env=0x565355ca0140, addr=<optimized out>, val=0, oi=34, retaddr=140487132809203)
at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/softmmu_template.h:310
#3 0x00007fc5b5a587f3 in code_gen_buffer ()
#4 0x0000565354f75fd0 in cpu_tb_exec (itb=<optimized out>, cpu=0x7fc5b5a5aa40 <code_gen_buffer+12266006>) at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/cpu-exec.c:171
#5 0x0000565354f75fd0 in cpu_loop_exec_tb (tb_exit=<synthetic pointer>, last_tb=<synthetic pointer>, tb=<optimized out>, cpu=0x7fc5b5a5aa40 <code_gen_buffer+12266006>)
at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/cpu-exec.c:615
#6 0x0000565354f75fd0 in cpu_exec (cpu=cpu@entry=0x565355c97e90) at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/accel/tcg/cpu-exec.c:725
#7 0x0000565354f33b1f in tcg_cpu_exec (cpu=0x565355c97e90) at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/cpus.c:1429
#8 0x0000565354f35e83 in qemu_tcg_cpu_thread_fn (arg=0x565355c97e90) at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/cpus.c:1733
#9 0x0000565354f35e83 in qemu_tcg_cpu_thread_fn (arg=arg@entry=0x565355c97e90) at /home/jermar/software/HelenOS/helenos.git/contrib/qemu/qemu-3.1.0/cpus.c:1707
#10 0x00005653552ec5da in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:498
#11 0x00007fc5b858a58e in start_thread () at /lib64/libpthread.so.0
#12 0x00007fc5b84b96a3 in clone () at /lib64/libc.so.6
Another symptom that occurs more often than this crash is that the guest hangs while waiting for another CPU to complete a cross-CPU call. Disabling MTTCG makes both symptoms go away.
[1] Core file + debug binary: http://jermar.eu/ref/qemu-mttcg-core.tar.xz
As for the other outcome, when the guest hangs (instead of QEMU crashing), the guest CPUs that block forward progress are halted in an idle loop, have interrupts enabled and have a queued timer IRQ 248 and a pending software IPI IRQ 250. It appears another timer IRQ is currently being serviced (but the CPU is idling).
(qemu) cpu 1
(qemu) info registers
EAX=ff8b7000 EBX=ff8b7000 ECX=00000003 EDX=00000003
ESI=00000001 EDI=ff8b5240 EBP=ff8b7000 ESP=ff8b7fac
EIP=f0029707 EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA]
CS =0008 00000000 ffffffff 00cf9b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA]
FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA]
GS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA]
LDT=0000 00000000 00000000 00008200 DPL=0 LDT
TR =0068 efbfe280 00003d80 00008900 DPL=0 TSS32-avl
GDT= ffbd8400 00000077
IDT= eacfe000 000007ff
CR0=8001003b CR2=00000000 CR3=03fde000 CR4=00000690
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
(qemu) info lapic
dumping local APIC state for CPU 1
LVT0 0x0001003f active-hi edge masked Fixed (vec 63)
LVT1 0x0001003f active-hi edge masked Fixed (vec 63)
LVTPC 0x000100ff active-hi edge masked Fixed (vec 255)
LVTERR 0x000000fb active-hi edge Fixed (vec 251)
LVTTHMR 0x000100ff active-hi edge masked Fixed (vec 255)
LVTT 0x000200f8 active-hi edge periodic Fixed (vec 248)
Timer DCR=0xb (divide by 1) initial_count = 997376
SPIV 0x00000107 APIC enabled, focus=off, spurious vec 7
ICR 0x00000000 physical edge de-assert no-shorthand
ICR2 0x00000000 cpu 0 (APIC ID)
ESR 0x00000000
ISR 248
IRR 248 250
APR 0x00 TPR 0x00 DFR 0x0f LDR 0x00 PPR 0xf0
(gdb) set $eip=f0029707
(gdb) set $esp=ff8b7fac
(gdb) bt
#0 0xf0029707 in Proc::halt () at /home/jermar/Kernkonzept/software/l4/fiasco/src/drivers/ia32/processor-ia32.cpp:47
#1 0xf00193b8 in Kernel_thread::idle_op (this=this@entry=0xffb66da4) at /home/jermar/Kernkonzept/software/l4/fiasco/src/kern/kernel_thread.cpp:134
#2 0xf001bc11 in call_ap_bootstrap (this=0xffb66da4, resume=0xf001bc11) at /home/jermar/Kernkonzept/software/l4/fiasco/src/kern/app_cpu_thread.cpp:111
#3 0x00000001 in ?? ()
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]
|