QEMU 5.0: Guest VM hangs/freezes when unplugging USB device

Setup:
Host: Debian/SID, Kernel 5.6, QEMU 5.0
Guest: Windows 10 VM with PCI and USB device passthrough.

Problem:
Guest VM suddenly hangs when the USB device is pulled out of the Host.

Observations:
- Issue appears to be related to QEMU 5.0.
  - It started after an upgrade to QEMU 5.0.
  - Downgrading only QEMU on multiple systems fixes the issue.
- Issue is very reproducible.
  - Most of the time within a few attempts of pulling/reconnecting the device.
  - Issue happens with multiple devices (I tried standard HID devices, a webcam and an x-ray sensor).
- Guest just hangs.
  - Display output remains on the last frame shown.
  - Ping to the Guest immediately stops working.
  - Logs in the Guest stop logging immediately.
- Host is fine and thinks the Guest is fine.
  - Guest continues to show as running in "virsh list".
  - No suspicious entries in the QEMU logs.
  - No suspicious entries in Host syslog/messages.
  - Host can kill the guest with "virsh destroy" and respawn it fine.
- Issue seems widespread.
  - Multiple similar reports from Proxmox users after the upgrade to Proxmox 6.2 (the first version that uses QEMU 5.0), for both Windows and Linux guests:
    https://forum.proxmox.com/threads/vm-freezes-when-disconnecting-usb-keyboard-and-mouse.70287/
    https://forum.proxmox.com/threads/usb-drive-crashes-vm.70214/
    https://forum.proxmox.com/threads/latest-proxmox-usb-disconnects-freeze-kvm.70398/
    https://forum.proxmox.com/threads/vm-with-gpu-passthrough-freezes-when-turning-off-monitor-after-proxmox-6-2-upgrade.69821/
    https://forum.proxmox.com/threads/vm-with-gpu-passthrough-freezes-when-turning-off-monitor-after-proxmox-6-2-upgrade.69824/

I'd be more than happy to provide any debug info that might be helpful.
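For reference, a minimal sketch of the kind of USB passthrough configuration involved (the guest is actually managed through libvirt; machine type, memory, bus and port numbers below are placeholders for illustration, not the exact values from my setup):

  # xHCI controller in the guest plus a host USB device routed to it
  qemu-system-x86_64 \
      -machine q35,accel=kvm -m 8G -smp 4 \
      -device qemu-xhci,id=xhci \
      -device usb-host,bus=xhci.0,hostbus=1,hostport=8 \
      [...]

Unplugging (or unbinding) the device behind that usb-host entry is what triggers the hang.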
"echo '1-8' > /sys/bus/usb/drivers/usb/unbind") - qemu trace logs stops at exactly the same time when VM freezes - Last few lines of the qemu trace: 1592303@1596123157.254134:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010 1592320@1596123158.822309:usb_xhci_oper_read off 0x0004, ret 0x00000008 1592320@1596123158.822397:usb_xhci_port_read port 1, off 0x0000, ret 0x0e0002a0 1592320@1596123158.822459:usb_xhci_port_read port 2, off 0x0000, ret 0x0e0002a0 1592320@1596123158.822513:usb_xhci_port_read port 3, off 0x0000, ret 0x0e0002a0 1592320@1596123158.822565:usb_xhci_port_read port 4, off 0x0000, ret 0x0e0002a0 1592303@1596123159.858372:virtqueue_alloc_element elem 0x56193c8c8990 size 56 in_num 2 out_num 0 1592303@1596123159.858435:virtqueue_pop vq 0x7fcd5c48a010 elem 0x56193c8c8990 in_num 2 out_num 0 1592303@1596123159.858482:virtqueue_fill vq 0x7fcd5c48a010 elem 0x56193c8c8990 len 72 idx 0 1592303@1596123159.858533:virtqueue_flush vq 0x7fcd5c48a010 count 1 1592303@1596123159.858565:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010 1592303@1596123160.272641:virtqueue_alloc_element elem 0x56193c8c8990 size 56 in_num 2 out_num 0 1592303@1596123160.272702:virtqueue_pop vq 0x7fcd5c48a010 elem 0x56193c8c8990 in_num 2 out_num 0 1592303@1596123160.272751:virtqueue_fill vq 0x7fcd5c48a010 elem 0x56193c8c8990 len 104 idx 0 1592303@1596123160.272802:virtqueue_flush vq 0x7fcd5c48a010 count 1 1592303@1596123160.272833:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010 1592303@1596123160.845694:lockcnt_unlock_attempt lockcnt 0x56193bea6514 unlock 5->4 1592303@1596123160.846821:lockcnt_unlock_success lockcnt 0x56193bea6514 unlock 5->4 succeeded 1592303@1596123160.847923:usb_host_req_complete dev 1:4, packet 0x7fcb84000ea8, status 0, length 0 1592303@1596123160.849369:usb_packet_state_change bus 0, port 1, ep 2, packet 0x7fcb84000ea8, state async -> complete 1592303@1596123160.851157:usb_xhci_xfer_success 0x7fcb84000ea0: len 0 1592303@1596123160.851214:usb_xhci_queue_event v 3, idx 5, ER_TRANSFER, CC_SHORT_PACKET, p 0xffffac0c62444ae3, s 0x0d000000, c 0x02058005 1592303@1596123160.851285:usb_xhci_irq_msix nr 3 1592303@1596123160.851331:usb_xhci_ep_kick slotid 2, epid 5, streamid 0 1592303@1596123160.851374:usb_host_req_data dev 1:4, packet 0x56193cce8da8, in 1, ep 2, size 4 1592303@1596123160.851434:usb_host_req_complete dev 1:4, packet 0x56193cce8da8, status -1, length 0 1592303@1596123160.851485:usb_packet_state_change bus 0, port 1, ep 2, packet 0x56193cce8da8, state queued -> complete 1592303@1596123160.851541:usb_xhci_xfer_error 0x56193cce8da0: ret -1 1592303@1596123160.851577:usb_xhci_queue_event v 3, idx 6, ER_TRANSFER, CC_USB_TRANSACTION_ERROR, p 0x00000001c18a4e20, s 0x04000004, c 0x02058001 1592303@1596123160.851647:usb_xhci_ep_state slotid 2, epid 5, running -> halted 1592303@1596123160.852700:usb_xhci_ep_kick slotid 2, epid 5, streamid 0 1592303@1596123160.852744:usb_host_req_complete dev 1:4, packet 0x7fcb84000b98, status 0, length 0 1592303@1596123160.852788:usb_packet_state_change bus 0, port 1, ep 1, packet 0x7fcb84000b98, state async -> complete 1592303@1596123160.852845:usb_xhci_xfer_success 0x7fcb84000b90: len 0 1592303@1596123160.852879:usb_xhci_queue_event v 3, idx 7, ER_TRANSFER, CC_SHORT_PACKET, p 0xffffac0c6229aae3, s 0x0d000000, c 0x02038005 1592303@1596123160.852945:usb_xhci_ep_kick slotid 2, epid 3, streamid 0 1592303@1596123160.852977:usb_host_req_data dev 1:4, packet 0x56193c9da348, in 1, ep 1, size 8 1592303@1596123160.853031:usb_host_req_complete dev 1:4, packet 
1592303@1596123160.853080:usb_packet_state_change bus 0, port 1, ep 1, packet 0x56193c9da348, state queued -> complete
1592303@1596123160.853136:usb_xhci_xfer_error 0x56193c9da340: ret -1
1592303@1596123160.853170:usb_xhci_queue_event v 3, idx 8, ER_TRANSFER, CC_USB_TRANSACTION_ERROR, p 0x00000001c18a4c20, s 0x04000008, c 0x02038001
1592303@1596123160.853240:usb_xhci_ep_state slotid 2, epid 3, running -> halted
1592303@1596123160.853280:usb_xhci_ep_kick slotid 2, epid 3, streamid 0
1592303@1596123160.853316:lockcnt_unlock_attempt lockcnt 0x56193bea6514 unlock 1->4
1592303@1596123160.853352:lockcnt_unlock_success lockcnt 0x56193bea6514 unlock 1->4 succeeded
1592303@1596123160.853564:usb_host_close dev 1:4
libusb: error [udev_hotplug_event] ignoring udev action unbind

Link to the bug on the Proxmox side: https://bugzilla.proxmox.com/show_bug.cgi?id=2781
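A rough sketch of how such a trace can be captured and the unplug simulated; the trace event patterns are just the groups visible above (the run itself had all trace events enabled), and "1-8" is the example bus-port ID from my runs, so adjust to the device actually passed through:

  # enable the relevant trace groups when starting QEMU
  qemu-system-x86_64 [...] -trace "usb_*" -trace "virtqueue_*" -trace "lockcnt_*"

  # find the bus-port ID of the passed-through device
  lsusb -t

  # simulate the unplug from the host side
  echo '1-8' > /sys/bus/usb/drivers/usb/unbind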
I get the same backtrace in gdb every time we reproduce the hang:

(gdb) thread apply all bt

Thread 9 (Thread 0x7fd1415ff700 (LWP 3202)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x7fd1415fe6c0, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324978bb2 in ?? () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#3  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 8 (Thread 0x7fd1437fe700 (LWP 3171)):
#0  0x00007fd323d16d87 in ioctl () at ../sysdeps/unix/syscall-template.S:120
#1  0x000055a5daef74f7 in kvm_vcpu_ioctl ()
#2  0x000055a5daef7631 in kvm_cpu_exec ()
#3  0x000055a5daedaede in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 7 (Thread 0x7fd143fff700 (LWP 3170)):
#0  0x00007fd323d16d87 in ioctl () at ../sysdeps/unix/syscall-template.S:120
#1  0x000055a5daef74f7 in kvm_vcpu_ioctl ()
#2  0x000055a5daef7631 in kvm_cpu_exec ()
#3  0x000055a5daedaede in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7fd150dfd700 (LWP 3169)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7fd1515fe700 (LWP 3168)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7fd151dff700 (LWP 3167)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fd320d97700 (LWP 3166)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x7fd318003180, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324a097ee in ?? () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fd324a09b53 in g_main_loop_run () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x000055a5db016c71 in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fd3224de700 (LWP 3156)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x000055a5db3226fa in qemu_event_wait ()
#2  0x000055a5db33466a in ?? ()
#3  0x000055a5db32194b in ?? ()
#4  0x00007fd323defea7 in start_thread (arg=) at pthread_create.c:477
#5  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fd3224dff40 (LWP 3148)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x55a5dca30150, nfds=3, timeout=3) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324971f4d in ?? () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007fd32497316c in libusb_handle_events_timeout_completed () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#3  0x000055a5db18edc7 in ?? ()
#4  0x000055a5db18efab in ?? ()
#5  0x000055a5db31abf7 in aio_bh_poll ()
#6  0x000055a5db31e3fe in aio_dispatch ()
#7  0x000055a5db31aace in ?? ()
#8  0x00007fd324a095fd in g_main_context_dispatch () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#9  0x000055a5db31d638 in main_loop_wait ()
#10 0x000055a5dafad309 in qemu_main_loop ()
#11 0x000055a5dae9125e in main ()
(gdb)
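For anyone wanting to capture the same kind of backtrace from a hung guest, a rough sketch (assuming a single QEMU process on the host; with several VMs running, substitute the right PID for the pidof lookup):

  # attach gdb to the hung QEMU process and dump all thread backtraces
  gdb -p "$(pidof qemu-system-x86_64)" \
      -ex 'set pagination off' \
      -ex 'thread apply all bt' \
      -ex detach -ex quit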
The QEMU project is currently moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU, then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket for this problem in our new tracker here:

   https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire automatically after 60 days). Please mention the URL of this bug ticket on Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get one, but still would like to keep this ticket open, then please switch the state back to "New" within the next 60 days (otherwise it will get closed as "Expired"). We will then eventually migrate the ticket automatically to the new system (but you won't be the reporter of the bug in the new system and thus won't get notified on changes anymore).

Thank you and sorry for the inconvenience.

Issue does not occur in the latest version of QEMU anymore.