permissions: 0.893
user-level: 0.885
performance: 0.881
ppc: 0.880
virtual: 0.877
peripherals: 0.873
hypervisor: 0.871
vnc: 0.867
register: 0.859
TCG: 0.852
mistranslation: 0.841
device: 0.839
x86: 0.839
KVM: 0.838
VMM: 0.837
architecture: 0.833
risc-v: 0.832
graphic: 0.831
debug: 0.828
semantic: 0.824
arm: 0.816
network: 0.816
i386: 0.802
assembly: 0.794
files: 0.785
socket: 0.763
kernel: 0.753
PID: 0.747
boot: 0.704

QEMU 5.0: Guest VM hangs/freezes when unplugging a USB device

Setup:

Host: Debian/SID, Kernel 5.6, QEMU 5.0
Guest: Windows 10 VM with PCI and USB device passthrough.
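
For context, the passthrough is of the following general shape on the QEMU command line (a minimal sketch only; the vendor/product IDs and the PCI address are placeholders, and the real command line is generated by libvirt):

    # Sketch: xHCI controller plus USB and PCI passthrough; disks, network
    # and display are omitted.
    qemu-system-x86_64 \
        -machine q35,accel=kvm -m 8G \
        -device qemu-xhci,id=xhci \
        -device usb-host,bus=xhci.0,vendorid=0x046d,productid=0xc52b \
        -device vfio-pci,host=0000:01:00.0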

Problem: The Guest VM suddenly hangs when a USB device is pulled out of the Host.

Observations:
 - Issue appears to be related to QEMU 5.0
   - It started after an upgrade to QEMU 5.0.
   - Downgrading only QEMU on multiple systems fixes the issue.

 - Issue is very reproducible.
   - It usually triggers within a few attempts of pulling/reconnecting the device.
   - Issue happens with multiple devices (I tried standard HID devices, a webcam and an x-ray sensor).

 - Guest just hangs.
   - Display output remains on last frame shown.
   - Ping to Guest immediately stops working.
   - Logs in the Guest stop logging immediately.

 - Host is fine and thinks the Guest is fine (the checks used are sketched after the forum links below).
   - Guest continues to show as running in "virsh list".
   - No suspicious entries in the QEMU logs.
   - No suspicious entries in Host syslogs/messages.
   - Host can kill the guest with "virsh destroy" and respawn it fine.

 - Issue seems widespread.
   - Multiple similar reports from Proxmox users after upgrading to Proxmox 6.2 (the first version that uses QEMU 5.0), for both Windows and Linux guests:

https://forum.proxmox.com/threads/vm-freezes-when-disconnecting-usb-keyboard-and-mouse.70287/
https://forum.proxmox.com/threads/usb-drive-crashes-vm.70214/
https://forum.proxmox.com/threads/latest-proxmox-usb-disconnects-freeze-kvm.70398/
https://forum.proxmox.com/threads/vm-with-gpu-passthrough-freezes-when-turning-off-monitor-after-proxmox-6-2-upgrade.69821/
https://forum.proxmox.com/threads/vm-with-gpu-passthrough-freezes-when-turning-off-monitor-after-proxmox-6-2-upgrade.69824/
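
For reference, the host-side checks behind the "Host is fine" points above were roughly the following (a sketch; <domain> and <guest-ip> are placeholders):

    # Guest is still listed as running even though it is frozen:
    virsh list --all

    # Ping to the guest stops immediately:
    ping -c 3 <guest-ip>

    # Nothing suspicious in the QEMU log or the host logs:
    tail -n 100 /var/log/libvirt/qemu/<domain>.log
    journalctl -k --since "10 minutes ago"

    # The host can still tear the guest down and start it again:
    virsh destroy <domain>
    virsh start <domain>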

I'd be more than happy to provide any debug information that might be helpful.

Judging by the reports on the Proxmox forums, this is still very much being hit by multiple users, with no known workaround.

I was able to run QEMU 5.0.13 (Debian) with all traces turned on and capture the following:

- Behavior is reproducible by unbinding the USB device on the host (e.g. "echo '1-8' > /sys/bus/usb/drivers/usb/unbind"); a slightly expanded version of this is sketched after the trace below.
- The QEMU trace log stops at exactly the moment the VM freezes.
- Last few lines of the QEMU trace:

1592303@1596123157.254134:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010
1592320@1596123158.822309:usb_xhci_oper_read off 0x0004, ret 0x00000008
1592320@1596123158.822397:usb_xhci_port_read port 1, off 0x0000, ret 0x0e0002a0
1592320@1596123158.822459:usb_xhci_port_read port 2, off 0x0000, ret 0x0e0002a0
1592320@1596123158.822513:usb_xhci_port_read port 3, off 0x0000, ret 0x0e0002a0
1592320@1596123158.822565:usb_xhci_port_read port 4, off 0x0000, ret 0x0e0002a0
1592303@1596123159.858372:virtqueue_alloc_element elem 0x56193c8c8990 size 56 in_num 2 out_num 0
1592303@1596123159.858435:virtqueue_pop vq 0x7fcd5c48a010 elem 0x56193c8c8990 in_num 2 out_num 0
1592303@1596123159.858482:virtqueue_fill vq 0x7fcd5c48a010 elem 0x56193c8c8990 len 72 idx 0
1592303@1596123159.858533:virtqueue_flush vq 0x7fcd5c48a010 count 1
1592303@1596123159.858565:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010
1592303@1596123160.272641:virtqueue_alloc_element elem 0x56193c8c8990 size 56 in_num 2 out_num 0
1592303@1596123160.272702:virtqueue_pop vq 0x7fcd5c48a010 elem 0x56193c8c8990 in_num 2 out_num 0
1592303@1596123160.272751:virtqueue_fill vq 0x7fcd5c48a010 elem 0x56193c8c8990 len 104 idx 0
1592303@1596123160.272802:virtqueue_flush vq 0x7fcd5c48a010 count 1
1592303@1596123160.272833:virtio_notify vdev 0x56193d04d820 vq 0x7fcd5c48a010
1592303@1596123160.845694:lockcnt_unlock_attempt lockcnt 0x56193bea6514 unlock 5->4
1592303@1596123160.846821:lockcnt_unlock_success lockcnt 0x56193bea6514 unlock 5->4 succeeded
1592303@1596123160.847923:usb_host_req_complete dev 1:4, packet 0x7fcb84000ea8, status 0, length 0
1592303@1596123160.849369:usb_packet_state_change bus 0, port 1, ep 2, packet 0x7fcb84000ea8, state async -> complete
1592303@1596123160.851157:usb_xhci_xfer_success 0x7fcb84000ea0: len 0
1592303@1596123160.851214:usb_xhci_queue_event v 3, idx 5, ER_TRANSFER, CC_SHORT_PACKET, p 0xffffac0c62444ae3, s 0x0d000000, c 0x02058005
1592303@1596123160.851285:usb_xhci_irq_msix nr 3
1592303@1596123160.851331:usb_xhci_ep_kick slotid 2, epid 5, streamid 0
1592303@1596123160.851374:usb_host_req_data dev 1:4, packet 0x56193cce8da8, in 1, ep 2, size 4
1592303@1596123160.851434:usb_host_req_complete dev 1:4, packet 0x56193cce8da8, status -1, length 0
1592303@1596123160.851485:usb_packet_state_change bus 0, port 1, ep 2, packet 0x56193cce8da8, state queued -> complete
1592303@1596123160.851541:usb_xhci_xfer_error 0x56193cce8da0: ret -1
1592303@1596123160.851577:usb_xhci_queue_event v 3, idx 6, ER_TRANSFER, CC_USB_TRANSACTION_ERROR, p 0x00000001c18a4e20, s 0x04000004, c 0x02058001
1592303@1596123160.851647:usb_xhci_ep_state slotid 2, epid 5, running -> halted
1592303@1596123160.852700:usb_xhci_ep_kick slotid 2, epid 5, streamid 0
1592303@1596123160.852744:usb_host_req_complete dev 1:4, packet 0x7fcb84000b98, status 0, length 0
1592303@1596123160.852788:usb_packet_state_change bus 0, port 1, ep 1, packet 0x7fcb84000b98, state async -> complete
1592303@1596123160.852845:usb_xhci_xfer_success 0x7fcb84000b90: len 0
1592303@1596123160.852879:usb_xhci_queue_event v 3, idx 7, ER_TRANSFER, CC_SHORT_PACKET, p 0xffffac0c6229aae3, s 0x0d000000, c 0x02038005
1592303@1596123160.852945:usb_xhci_ep_kick slotid 2, epid 3, streamid 0
1592303@1596123160.852977:usb_host_req_data dev 1:4, packet 0x56193c9da348, in 1, ep 1, size 8
1592303@1596123160.853031:usb_host_req_complete dev 1:4, packet 0x56193c9da348, status -1, length 0
1592303@1596123160.853080:usb_packet_state_change bus 0, port 1, ep 1, packet 0x56193c9da348, state queued -> complete
1592303@1596123160.853136:usb_xhci_xfer_error 0x56193c9da340: ret -1
1592303@1596123160.853170:usb_xhci_queue_event v 3, idx 8, ER_TRANSFER, CC_USB_TRANSACTION_ERROR, p 0x00000001c18a4c20, s 0x04000008, c 0x02038001
1592303@1596123160.853240:usb_xhci_ep_state slotid 2, epid 3, running -> halted
1592303@1596123160.853280:usb_xhci_ep_kick slotid 2, epid 3, streamid 0
1592303@1596123160.853316:lockcnt_unlock_attempt lockcnt 0x56193bea6514 unlock 1->4
1592303@1596123160.853352:lockcnt_unlock_success lockcnt 0x56193bea6514 unlock 1->4 succeeded
1592303@1596123160.853564:usb_host_close dev 1:4
libusb: error [udev_hotplug_event] ignoring udev action unbind
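
To reproduce without physically pulling the device, the sequence below is roughly what was used (a sketch; "1-8" is only an example bus-port ID, and the -trace patterns/redirection assume a QEMU built with the "log" trace backend):

    # Start QEMU with USB and virtio trace events enabled:
    qemu-system-x86_64 ... -trace "usb_*" -trace "virtqueue_*" 2> qemu-trace.log

    # Find the bus-port ID of the passed-through device on the host:
    lsusb -t
    ls /sys/bus/usb/devices/

    # Simulate unplugging the device by unbinding it (needs root):
    echo '1-8' > /sys/bus/usb/drivers/usb/unbind

    # And "replug" it by rebinding:
    echo '1-8' > /sys/bus/usb/drivers/usb/bind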


Link to bug on the proxmox side:

https://bugzilla.proxmox.com/show_bug.cgi?id=2781

I get the same backtrace in gdb every time we reproduce the hang:

(gdb) thread apply all bt

Thread 9 (Thread 0x7fd1415ff700 (LWP 3202)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x7fd1415fe6c0, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324978bb2 in ?? () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#3  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 8 (Thread 0x7fd1437fe700 (LWP 3171)):
#0  0x00007fd323d16d87 in ioctl () at ../sysdeps/unix/syscall-template.S:120
#1  0x000055a5daef74f7 in kvm_vcpu_ioctl ()
#2  0x000055a5daef7631 in kvm_cpu_exec ()
#3  0x000055a5daedaede in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 7 (Thread 0x7fd143fff700 (LWP 3170)):
#0  0x00007fd323d16d87 in ioctl () at ../sysdeps/unix/syscall-template.S:120
#1  0x000055a5daef74f7 in kvm_vcpu_ioctl ()
#2  0x000055a5daef7631 in kvm_cpu_exec ()
#3  0x000055a5daedaede in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7fd150dfd700 (LWP 3169)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7fd1515fe700 (LWP 3168)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7fd151dff700 (LWP 3167)):
#0  __lll_lock_wait (futex=futex@entry=0x55a5db80a540, private=0) at lowlevellock.c:52
#1  0x00007fd323df2843 in __GI___pthread_mutex_lock (mutex=0x55a5db80a540) at ../nptl/pthread_mutex_lock.c:80
#2  0x000055a5db321b43 in qemu_mutex_lock_impl ()
#3  0x000055a5daedac8e in qemu_mutex_lock_iothread_impl ()
#4  0x000055a5dae92ac9 in ?? ()
#5  0x000055a5dae97de7 in flatview_read_continue ()
#6  0x000055a5dae98023 in ?? ()
#7  0x000055a5dae9813b in address_space_read_full ()
#8  0x000055a5daef78cf in kvm_cpu_exec ()
#9  0x000055a5daedaede in ?? ()
#10 0x000055a5db32194b in ?? ()
#11 0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fd320d97700 (LWP 3166)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x7fd318003180, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324a097ee in ?? () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fd324a09b53 in g_main_loop_run () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x000055a5db016c71 in ?? ()
#4  0x000055a5db32194b in ?? ()
#5  0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fd3224de700 (LWP 3156)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x000055a5db3226fa in qemu_event_wait ()
#2  0x000055a5db33466a in ?? ()
#3  0x000055a5db32194b in ?? ()
#4  0x00007fd323defea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#5  0x00007fd323d1feaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fd3224dff40 (LWP 3148)):
#0  0x00007fd323d154bf in __GI___poll (fds=0x55a5dca30150, nfds=3, timeout=3) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd324971f4d in ?? () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007fd32497316c in libusb_handle_events_timeout_completed () from target:/lib/x86_64-linux-gnu/libusb-1.0.so.0
#3  0x000055a5db18edc7 in ?? ()
#4  0x000055a5db18efab in ?? ()
#5  0x000055a5db31abf7 in aio_bh_poll ()
#6  0x000055a5db31e3fe in aio_dispatch ()
#7  0x000055a5db31aace in ?? ()
#8  0x00007fd324a095fd in g_main_context_dispatch () from target:/lib/x86_64-linux-gnu/libglib-2.0.so.0
#9  0x000055a5db31d638 in main_loop_wait ()
#10 0x000055a5dafad309 in qemu_main_loop ()
#11 0x000055a5dae9125e in main ()
(gdb)
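
For reference, a backtrace like the one above can be captured non-interactively along the following lines (a sketch; the process match is only an example, and installing the QEMU debug symbols helps resolve the "??" frames):

    # Attach gdb to the frozen QEMU instance and dump all thread backtraces:
    gdb -p "$(pgrep -f qemu-system-x86_64 | head -n 1)" \
        -batch -ex "set pagination off" -ex "thread apply all bt"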

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting older bugs to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket
on Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket open, then please switch
the state back to "New" within the next 60 days (otherwise it will get
closed as "Expired"). We will then eventually migrate the ticket
automatically to the new system (but you won't be the reporter of the
bug in the new system and thus won't get notified of changes anymore).

Thank you and sorry for the inconvenience.


The issue does not occur in the latest version of QEMU anymore.