qemu hangs when guest is using linux kernel 4.16+
I have been using qemu on a daily basis for 5+ years to do btrfs development and testing, and it always worked perfectly until I upgraded the Linux kernel of the guests to 4.16. With 4.16+ kernels, when running all the fstests (previously called xfstests), the qemu process hangs (console unresponsive, can't ping or ssh into the guest anymore, etc.) and stays in state Sl+ according to 'ps'.
This happens on two different physical machines, one running openSUSE Tumbleweed (which I can't access at the moment to check the kernel version) and another running Xubuntu (tried kernels 4.15.0-32-generic and vanilla 4.18.0). Using any kernel from 4.16 to 4.19-rc5 in the guests (they use different Debian versions) makes qemu hang while running the fstests suite (after about 30 to 40 minutes, at either test generic/299 or test generic/451).
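For anyone trying to reproduce this, the process state mentioned above can be read with a small helper like the following. This is only a sketch: the function name `proc_state` is made up for illustration, and it assumes a 'ps' that accepts `-o stat=` and `-p` (as procps does).

```shell
#!/bin/bash
# Hypothetical helper: print the ps state field (e.g. "Sl+") of a given PID.
proc_state() {
    local pid="$1"
    # "stat=" selects the state column and suppresses the header line.
    ps -o stat= -p "$pid" | tr -d ' '
}
```

It could be called as `proc_state "$(pidof qemu-system-x86_64)"` while the guest is unresponsive, assuming a single qemu instance is running.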
I tried different qemu versions, 2.11.2, 2.12.1 and 3.0.0, and it happens with all of them (all built from the sources available at https://www.qemu.org/download/#source).
I built 3.0.0 with debug enabled, using the following parameters for 'configure':
--prefix=/home/fdmanana/qemu-3.0.0 --enable-tools --enable-linux-aio --enable-kvm --enable-vnc --enable-vnc-png --enable-debug --extra-cflags="-O0 -g3 -fno-omit-frame-pointer"
And captured a coredump of the qemu process, available at:
https://www.dropbox.com/s/d1tlsimahykwhla/core_dump_debug.tar.xz?dl=0
The stack traces of all threads, for a quick look, are at:
https://friendpaste.com/zqkz2pD0WgSdeSKITHPDf
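For reference, per-thread stack traces like these can be extracted from a core file with gdb in batch mode. The snippet below is a minimal sketch, not necessarily the exact commands used for the traces linked above; the function name and the default output file name are illustrative.

```shell
#!/bin/bash
# Hypothetical helper: dump backtraces of all threads from a core file.
# Arguments: <binary> <core file> [output file]
dump_core_backtraces() {
    if [ "$#" -lt 2 ]; then
        echo "usage: dump_core_backtraces <binary> <core> [output]" >&2
        return 2
    fi
    local binary="$1" core="$2" out="${3:-backtraces.txt}"
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found" >&2
        return 127
    fi
    # -batch exits after running the commands given with -ex.
    gdb "$binary" "$core" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

With the build described above, the binary would presumably be the qemu-system-x86_64 installed under the configured prefix; the core file path depends on how the dump was captured.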
qemu is being invoked with the following script:
#!/bin/bash -x
sudo modprobe tun
sudo modprobe kvm
sudo modprobe kvm-intel
sudo tunctl -t tap5 -u fdmanana
sudo ifconfig tap5 up
sudo brctl addif br0 tap5
sudo umount /mnt/tmp5
sudo mkdir -p /mnt/tmp5
sudo mount -t tmpfs -o size=14G tmpfs /mnt/tmp5
for ((i = 2; i <= 7; i++)); do
    sudo qemu-img create -f qcow2 /mnt/tmp5/disk$i 13G
done
sudo chown fdmanana /mnt/tmp5/disk*
qemu-system-x86_64 -m 4G \
-device virtio-scsi-pci \
-boot c \
\
-drive if=none,file=debian5.qcow2,cache=none,aio=native,cache.direct=on,format=qcow2,id=drive0,discard=on \
-device scsi-hd,drive=drive0,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk2,cache=writeback,format=qcow2,id=drive1,discard=on \
-device scsi-hd,drive=drive1,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk3,cache=writeback,format=qcow2,id=drive2,discard=on \
-device scsi-hd,drive=drive2,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk4,cache=writeback,format=qcow2,id=drive3,discard=on \
-device scsi-hd,drive=drive3,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk5,cache=writeback,format=qcow2,id=drive4,discard=on \
-device scsi-hd,drive=drive4,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk6,cache=writeback,format=qcow2,id=drive5,discard=on \
-device scsi-hd,drive=drive5,bus=scsi.0 \
\
-drive if=none,file=/mnt/tmp5/disk7,cache=writeback,format=qcow2,id=drive6,discard=on \
-device scsi-hd,drive=drive6,bus=scsi.0 \
\
-drive if=none,file=disk8,cache=writeback,aio=native,cache.direct=on,format=qcow2,id=drive7,discard=on \
-device scsi-hd,drive=drive7,bus=scsi.0 \
\
-drive if=none,file=disk9,cache=writeback,aio=native,cache.direct=on,format=qcow2,id=drive8,discard=on \
-device scsi-hd,drive=drive8,bus=scsi.0 \
\
-drive if=none,file=disk10,cache=writeback,aio=native,cache.direct=on,format=qcow2,id=drive9,discard=on \
-device scsi-hd,drive=drive9,bus=scsi.0 \
\
-net nic,macaddr=52:54:00:12:34:fa -net tap,ifname=tap5,script=no,downscript=no \
-rtc base=localtime -enable-kvm -machine accel=kvm -smp 4 -cpu host \
-k pt -serial tcp:127.0.0.1:9997 -display vnc=:5
Is there anything else I can provide to help debug this?
Thanks.