device: 0.096 KVM: 0.087 boot: 0.082 PID: 0.077 debug: 0.077 semantic: 0.071 socket: 0.070 other: 0.068 graphic: 0.067 network: 0.067 vnc: 0.063 permissions: 0.062 performance: 0.060 files: 0.054
debug: 0.850 PID: 0.040 files: 0.031 other: 0.016 performance: 0.013 KVM: 0.008 network: 0.008 device: 0.007 semantic: 0.006 graphic: 0.006 socket: 0.006 boot: 0.004 vnc: 0.004 permissions: 0.003

Race condition during shutdown

I ran into a bug when I started several VMs in parallel using libvirt. The VMs use only a kernel and an initrd (which contains a minimal OS). The guest OS itself runs 'poweroff -f' as soon as the login prompt shows up. So the expectation is that the VMs start, the shutdown is initiated, and the QEMU processes then exit. Instead, some of the QEMU processes get stuck in ppoll().

A bisect showed that the first bad commit was 0f12264e7a41458179ad10276a7c33c72024861a ("block: Allow graph changes in bdrv_drain_all_begin/end sections"). I've also tried the current master (13b7b188501d419a7d63c016e00065bcc693b7d4), since the problem might be related to commit a1405acddeb0af6625dd9c30e8277b08e0885bd3 ("aio: Do aio_notify_accept only during blocking aio_poll"), but the bug is still there. I've reproduced the bug on x86_64 and on s390x.
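For anyone trying to reproduce this without libvirt, the check "start several processes in parallel and see which ones never exit" can be sketched roughly as below. This is only an illustration, not the setup used in this report: the helper name `find_stuck` and the timeout are made up here, and it assumes the QEMU processes are launched directly as children of the script (under libvirt they are children of libvirtd, so this would not apply as-is).

```python
import subprocess
import time


def find_stuck(commands, timeout_s):
    """Start every command in parallel and return the indices of those
    that are still running once timeout_s seconds have passed.

    Hypothetical helper for illustration; stuck processes are killed
    before returning so that nothing is left behind.
    """
    # Launch everything first so the processes really run concurrently.
    procs = [
        subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        for cmd in commands
    ]

    # One shared deadline for all processes, not a timeout per process.
    deadline = time.monotonic() + timeout_s
    stuck = []
    for index, proc in enumerate(procs):
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # Still alive after the deadline: count it as hung.
            stuck.append(index)
            proc.kill()
            proc.wait()
    return stuck
```

Each entry in `commands` would be one argument list, for example a qemu-system-x86_64 invocation with the kernel, initrd and kernel command line from the domain definition. A non-empty result then corresponds to the hang described above, assuming the timeout is comfortably longer than a normal boot plus poweroff.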
The backtrace of a hanging QEMU process:

(gdb) bt
#0  0x00007f5d0e251b36 in ppoll () from target:/lib64/libc.so.6
#1  0x0000560191052014 in qemu_poll_ns (fds=0x560193b23d60, nfds=5, timeout=55774838936000) at /home/user/git/qemu/util/qemu-timer.c:334
#2  0x00005601910531fa in os_host_main_loop_wait (timeout=55774838936000) at /home/user/git/qemu/util/main-loop.c:233
#3  0x0000560191053119 in main_loop_wait (nonblocking=0) at /home/user/git/qemu/util/main-loop.c:497
#4  0x0000560190baf454 in main_loop () at /home/user/git/qemu/vl.c:1866
#5  0x0000560190baa552 in main (argc=71, argv=0x7ffde10e41c8, envp=0x7ffde10e4408) at /home/user/git/qemu/vl.c:4644

The domain definition used is:

test
716800
2
8
hvm
/var/lib/libvirt/images/vmlinuz-4.14.13-200.fc26.x86_64
/var/lib/libvirt/images/test-image-qemux86_64+modules-4.14.13-200.fc26.x86_64.cpio.gz
console=hvc0 STARTUP=shutdown.sh
destroy
restart
preserve
/usr/local/qemu/master/bin/qemu-system-x86_64

Did you find the cause of the bug and fix it?

It was fixed with commit 4cf077b59fc73eec29f8b7 (see patch series https://lists.gnu.org/archive/html/qemu-block/2018-09/msg00504.html).

OK, so marking this bug as "fixed" according to Marc's comment.