other: 0.664 graphic: 0.601 performance: 0.498 boot: 0.497 device: 0.488 semantic: 0.481 network: 0.475 KVM: 0.469 vnc: 0.463 debug: 0.457 files: 0.423 permissions: 0.420 socket: 0.380 PID: 0.367 Assertion failed with USB pass through with XHCI controller Starting qemu 2.8.0 with XHCI controller and host device passed through results in an assertion failure: qemu-system-x86_64: hw/usb/core.c:623: usb_packet_cleanup: Assertion `!usb_packet_is_inflight(p)' failed. Can be reproduced with the following command (passing through a Lenovo keyboard): qemu-system-x86_64 -usb -device nec-usb-xhci,id=usb -device usb-host,vendorid=0x04b3,productid=0x3025,id=hostdev0,bus=usb.0,port=1 If nec-usb-xhci is changed to usb-ehci, qemu tries to boot without assertion failures. Can be reproduced with the latest master (commit dbe2b65) and v2.8.0. Bisected the issue to following commit: first bad commit: [94b037f2a451b3dc855f9f2c346e5049a361bd55] xhci: use linked list for transfers Backtrace from commit dbe2b65: #0 0x00007f2eb4657227 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55 resultvar = 0 pid = 3453 selftid = 3453 #1 0x00007f2eb465867a in __GI_abort () at abort.c:89 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4}, sa_mask = {__val = {140734740550528, 93876690035339, 140734740550624, 48833659808, 0, 0, 0, 21474836480, 140734740550792, 139838573009553, 140734740550560, 139838573043008, 139838573024160, 93876666665872, 139838702616576, 139838573024160}}, sa_flags = 1528954938, sa_restorer = 0x55615b2202c0 <__PRETTY_FUNCTION__.38612>} sigs = {__val = {32, 0 }} #2 0x00007f2eb46502cd in __assert_fail_base (fmt=0x7f2eb47893a0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x55615b22003a "!usb_packet_is_inflight(p)", file=file@entry=0x55615b21fdf0 "hw/usb/core.c", line=line@entry=619, function=function@entry=0x55615b2202c0 <__PRETTY_FUNCTION__.38612> "usb_packet_cleanup") at assert.c:92 str = 0x55615cfdf510 "" total = 4096 #3 0x00007f2eb4650382 in __GI___assert_fail (assertion=0x55615b22003a "!usb_packet_is_inflight(p)", file=0x55615b21fdf0 "hw/usb/core.c", line=619, function=0x55615b2202c0 <__PRETTY_FUNCTION__.38612> "usb_packet_cleanup") at assert.c:101 No locals. #4 0x000055615afc385e in usb_packet_cleanup () No symbol table info available. #5 0x000055615afda555 in xhci_ep_free_xfer () No symbol table info available. #6 0x000055615afdc156 in xhci_kick_epctx () No symbol table info available. #7 0x000055615afda099 in xhci_ep_kick_timer () No symbol table info available. #8 0x000055615b08ceee in timerlist_run_timers () No symbol table info available. #9 0x000055615b08cf36 in qemu_clock_run_timers () No symbol table info available. #10 0x000055615b08d2df in qemu_clock_run_all_timers () No symbol table info available. #11 0x000055615b08be40 in main_loop_wait () No symbol table info available. #12 0x000055615ae3870f in main_loop () No symbol table info available. #13 0x000055615ae4027b in main () This behaviour was introduced by commit: 94b037f2a451b3dc855f9f2c346e5049a361bd55 xhci: use linked list for transfers However, QEMU does not crash yet, but linux' xhci_hcd reports errors like "ERROR Transfer event TRB DMA...". The following commit 5612564ea9cf5b9636438a1b58ae9a2ab6ca16ae xhci: drop XHCITransfer->xhci finally makes QEMU crash on the assertion check. I tried to dig into the code, but I'm not an expert in usb stuff so I don't understand it. usb_packet_is_inflight checks if USBPacket.state is USB_PACKET_QUEUED or USB_PACKET_ASYNC. I suppose that somewhere in the code changed by 94b037f2 finished usb transfers do not have the packet state changed. Hi, > qemu-system-x86_64: hw/usb/core.c:623: usb_packet_cleanup: Assertion > `!usb_packet_is_inflight(p)' failed. We are trying to free a in-flight transfer. Hmm. > Bisected the issue to following commit: > first bad commit: [94b037f2a451b3dc855f9f2c346e5049a361bd55] xhci: use linked list for transfers Ok. > #5 0x000055615afda555 in xhci_ep_free_xfer () > No symbol table info available. > #6 0x000055615afdc156 in xhci_kick_epctx () > No symbol table info available. Can you rebuild with debug into and try again? There are multiple xhci_ep_free_xfer() callsites in xhci_kick_epctx() and it would be useful to know which one is it. thanks, Gerd Hi, using qemu commit f634151b02ce5c80605383894f1f63f2c12e0033 configured with --python=/usr/bin/python2 --target-list=x86_64-softmmu --audio-drv-list="oss alsa sdl pa" --enable-debug running with -m 1024 -drive if=pflash,file=ovmf-arch.bin,format=raw -drive file=arch.raw,format=raw,if=virtio -device nec-usb-xhci,id=xhci -device usb-host,bus=xhci.0,vendorid=0x046d,productid=0x0a4d I get: (gdb) bt full #0 0x00007fffdccb304f in raise () at /usr/lib/libc.so.6 #1 0x00007fffdccb447a in abort () at /usr/lib/libc.so.6 #2 0x00007fffdccabea7 in __assert_fail_base () at /usr/lib/libc.so.6 #3 0x00007fffdccabf52 in () at /usr/lib/libc.so.6 #4 0x0000555555a8ab6e in usb_packet_cleanup (p=0x7fff5c125c18) at hw/usb/core.c:619 __PRETTY_FUNCTION__ = "usb_packet_cleanup" #5 0x0000555555aa8d97 in xhci_ep_free_xfer (xfer=0x7fff5c125c10) at hw/usb/hcd-xhci.c:1465 #6 0x0000555555aaa9a8 in xhci_kick_epctx (epctx=0x7fff5c0205d0, streamid=0) at hw/usb/hcd-xhci.c:2201 xfer = 0x7fff5c125c10 xhci = 0x7fff76538010 stctx = 0x7fffffffd960 xfer = 0x2ffffd920 ring = 0x5555562aeed0 ep = 0x0 mfindex = 113160 length = 1434054766 i = 32767 __PRETTY_FUNCTION__ = "xhci_kick_epctx" #7 0x0000555555aa88d9 in xhci_ep_kick_timer (opaque=0x7fff5c0205d0) at hw/usb/hcd-xhci.c:1363 epctx = 0x7fff5c0205d0 #8 0x0000555555b6217f in timerlist_run_timers (timer_list=0x5555567a25a0) at qemu-timer.c:540 ts = 0x7fff5cb8f210 current_time = 31951197002 progress = false cb = 0x555555aa88b4 opaque = 0x7fff5c0205d0 #9 0x0000555555b621cb in qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at qemu-timer.c:551 #10 0x0000555555b62564 in qemu_clock_run_all_timers () at qemu-timer.c:665 progress = false type = QEMU_CLOCK_VIRTUAL #11 0x0000555555b610be in main_loop_wait (nonblocking=0) at main-loop.c:516 ret = 0 timeout = 1000 timeout_ns = 289955 #12 0x00005555558f0b97 in main_loop () at vl.c:1966 nonblocking = false last_io = 0 #13 0x00005555558f847c in main (argc=11, argv=0x7fffffffde18, envp=0x7fffffffde78) at vl.c:4685 i = 0 snapshot = 0 linux_boot = 0 initrd_filename = 0x0 kernel_filename = 0x0 kernel_cmdline = 0x555555c8ecb6 "" boot_order = 0x555555c7c5b5 "cad" boot_once = 0x0 ds = 0x555557777de0 cyls = 0 heads = 0 secs = 0 translation = 0 hda_opts = 0x0 opts = 0x0 machine_opts = 0x5555567a2480 icount_opts = 0x0 olist = 0x0 optind = 11 optarg = 0x7fffffffe296 "usb-host,bus=xhci.0,vendorid=0x046d,productid=0x0a4d" loadvm = 0x0 machine_class = 0x55555676bef0 cpu_model = 0x0 vga_model = 0x555555c7cece "std" qtest_chrdev = 0x0 qtest_log = 0x0 pid_file = 0x0 incoming = 0x0 defconfig = true userconfig = true nographic = false display_type = DT_GTK display_remote = 0 log_mask = 0x0 log_file = 0x0 trace_file = 0x0 maxram_size = 1073741824 ram_slots = 0 vmstate_dump_file = 0x0 main_loop_err = 0x0 err = 0x0 list_data_dirs = false __func__ = "main" Hope this helps. I examined xhci_kick_epctx (frame 6) and looked into xfer and xfer->packet, maybe this helps: (gdb) bt #0 0x00007fffdccb304f in raise () at /usr/lib/libc.so.6 #1 0x00007fffdccb447a in abort () at /usr/lib/libc.so.6 #2 0x00007fffdccabea7 in __assert_fail_base () at /usr/lib/libc.so.6 #3 0x00007fffdccabf52 in () at /usr/lib/libc.so.6 #4 0x0000555555a8ab6e in usb_packet_cleanup (p=0x7fff5c3bcd88) at hw/usb/core.c:619 #5 0x0000555555aa8d97 in xhci_ep_free_xfer (xfer=0x7fff5c3bcd80) at hw/usb/hcd-xhci.c:1465 #6 0x0000555555aaa9a8 in xhci_kick_epctx (epctx=0x7fff5c745290, streamid=0) at hw/usb/hcd-xhci.c:2201 #7 0x0000555555aa88d9 in xhci_ep_kick_timer (opaque=0x7fff5c745290) at hw/usb/hcd-xhci.c:1363 #8 0x0000555555b6217f in timerlist_run_timers (timer_list=0x5555567a25a0) at qemu-timer.c:540 #9 0x0000555555b621cb in qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at qemu-timer.c:551 #10 0x0000555555b62564 in qemu_clock_run_all_timers () at qemu-timer.c:665 #11 0x0000555555b610be in main_loop_wait (nonblocking=0) at main-loop.c:516 #12 0x00005555558f0b97 in main_loop () at vl.c:1966 #13 0x00005555558f847c in main (argc=11, argv=0x7fffffffde18, envp=0x7fffffffde78) at vl.c:4685 (gdb) f 6 #6 0x0000555555aaa9a8 in xhci_kick_epctx (epctx=0x7fff5c745290, streamid=0) at hw/usb/hcd-xhci.c:2201 2201 xhci_ep_free_xfer(epctx->retry); (gdb) info local xfer = 0x7fff5c3bcd80 xhci = 0x7fff76538010 stctx = 0x7fffffffd960 xfer = 0x2ffffd920 ring = 0x5555562aeed0 ep = 0x0 mfindex = 126425 length = 1434054766 i = 32767 __PRETTY_FUNCTION__ = "xhci_kick_epctx" (gdb) print xfer $1 = (XHCITransfer *) 0x7fff5c3bcd80 (gdb) print *xfer $2 = {epctx = 0x7fff5c745290, packet = {pid = 105, id = 1028964352, ep = 0x555558342660, stream = 0, iov = {iov = 0x7fff5c138960, niov = 1, nalloc = 1, size = 5}, parameter = 0, short_not_ok = false, int_req = true, status = -6, actual_length = 0, state = USB_PACKET_ASYNC, combined = 0x0, queue = {tqe_next = 0x0, tqe_prev = 0x555558342678}, combined_entry = {tqe_next = 0x0, tqe_prev = 0x0}}, sgl = {sg = 0x5555586401e0, nsg = 1, nalloc = 1, size = 5, dev = 0x7fff76538010, as = 0x7fff76538220}, running_async = true, running_retry = false, complete = false, int_req = true, iso_pkts = 0, streamid = 0, in_xfer = true, iso_xfer = false, timed_xfer = false, trb_count = 1, trbs = 0x7fff5c025690, status = CC_INVALID, pkts = 0, pktsize = 0, cur_pkt = 0, mfindex_kick = 126424, next = {tqe_next = 0x0, tqe_prev = 0x0}} (gdb) print xfer->packet $3 = {pid = 105, id = 1028964352, ep = 0x555558342660, stream = 0, iov = {iov = 0x7fff5c138960, niov = 1, nalloc = 1, size = 5}, parameter = 0, short_not_ok = false, int_req = true, status = -6, actual_length = 0, state = USB_PACKET_ASYNC, combined = 0x0, queue = {tqe_next = 0x0, tqe_prev = 0x555558342678}, combined_entry = {tqe_next = 0x0, tqe_prev = 0x0}} Hi, > #6 0x0000555555aaa9a8 in xhci_kick_epctx (epctx=0x7fff5c0205d0, streamid=0) at hw/usb/hcd-xhci.c:2201 Ok, suspected already it will be there. thanks, Gerd On Mi, 2017-01-11 at 16:35 +0000, Fabian Lesniak wrote: > I examined xhci_kick_epctx (frame 6) and looked into xfer and > xfer->packet, maybe this helps: > (gdb) print *xfer > $2 = {epctx = 0x7fff5c745290, packet = {pid = 105, id = 1028964352, ep > = 0x555558342660, stream = 0, iov = {iov = 0x7fff5c138960, niov = 1, > nalloc = 1, size = 5}, parameter = 0, short_not_ok = false, int_req = > true, status = -6, actual_length = 0, state = USB_PACKET_ASYNC, ^^^^^^^^^^^^^^^^^^^^^^^^ Yep, packed still being processed at that point. Most callsites check already, one was missed. Cc: