diff options
Diffstat (limited to 'classification_output/01/other/88225572')
| -rw-r--r-- | classification_output/01/other/88225572 | 2900 |
1 files changed, 0 insertions, 2900 deletions
diff --git a/classification_output/01/other/88225572 b/classification_output/01/other/88225572 deleted file mode 100644 index b3cb5b26..00000000 --- a/classification_output/01/other/88225572 +++ /dev/null @@ -1,2900 +0,0 @@ -other: 0.987 -semantic: 0.976 -instruction: 0.974 -mistranslation: 0.942 - -[BUG qemu 4.0] segfault when unplugging virtio-blk-pci device - -Hi, - -I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -think it's because io completion hits use-after-free when device is -already gone. Is this a known bug that has been fixed? (I went through -the git log but didn't find anything obvious). - -gdb backtrace is: - -Core was generated by `/usr/local/libexec/qemu-kvm -name -sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -Program terminated with signal 11, Segmentation fault. -#0 object_get_class (obj=obj@entry=0x0) at -/usr/src/debug/qemu-4.0/qom/object.c:903 -903 return obj->class; -(gdb) bt -#0 object_get_class (obj=obj@entry=0x0) at -/usr/src/debug/qemu-4.0/qom/object.c:903 -#1 Â 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -Â Â vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -#2 Â 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -Â Â opaque=0x558a2f2fd420, ret=0) -Â Â at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -#3 Â 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -Â Â at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -#4 Â 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -Â Â i1=<optimized out>) at /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -#5 Â 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -#6 Â 0x00007fff9ed75780 in ?? () -#7 Â 0x0000000000000000 in ?? () - -It seems like qemu was completing a discard/write_zero request, but -parent BusState was already freed & set to NULL. - -Do we need to drain all pending request before unrealizing virtio-blk -device? Like the following patch proposed? -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -If more info is needed, please let me know. - -Thanks, -Eryu - -On Tue, 31 Dec 2019 18:34:34 +0800 -Eryu Guan <address@hidden> wrote: - -> -Hi, -> -> -I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -think it's because io completion hits use-after-free when device is -> -already gone. Is this a known bug that has been fixed? (I went through -> -the git log but didn't find anything obvious). -> -> -gdb backtrace is: -> -> -Core was generated by `/usr/local/libexec/qemu-kvm -name -> -sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -Program terminated with signal 11, Segmentation fault. -> -#0 object_get_class (obj=obj@entry=0x0) at -> -/usr/src/debug/qemu-4.0/qom/object.c:903 -> -903 return obj->class; -> -(gdb) bt -> -#0 object_get_class (obj=obj@entry=0x0) at -> -/usr/src/debug/qemu-4.0/qom/object.c:903 -> -#1 Â 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -Â Â vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -#2 Â 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -Â Â opaque=0x558a2f2fd420, ret=0) -> -Â Â at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -#3 Â 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -Â Â at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -#4 Â 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -Â Â i1=<optimized out>) at -> -/usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -#5 Â 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -#6 Â 0x00007fff9ed75780 in ?? () -> -#7 Â 0x0000000000000000 in ?? () -> -> -It seems like qemu was completing a discard/write_zero request, but -> -parent BusState was already freed & set to NULL. -> -> -Do we need to drain all pending request before unrealizing virtio-blk -> -device? Like the following patch proposed? -> -> -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> -If more info is needed, please let me know. -may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> -Thanks, -> -Eryu -> - -On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -On Tue, 31 Dec 2019 18:34:34 +0800 -> -Eryu Guan <address@hidden> wrote: -> -> -> Hi, -> -> -> -> I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> think it's because io completion hits use-after-free when device is -> -> already gone. Is this a known bug that has been fixed? (I went through -> -> the git log but didn't find anything obvious). -> -> -> -> gdb backtrace is: -> -> -> -> Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> Program terminated with signal 11, Segmentation fault. -> -> #0 object_get_class (obj=obj@entry=0x0) at -> -> /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> 903 return obj->class; -> -> (gdb) bt -> -> #0 object_get_class (obj=obj@entry=0x0) at -> -> /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> #1 Â 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> Â Â vector=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> #2 Â 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> Â Â opaque=0x558a2f2fd420, ret=0) -> -> Â Â at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> #3 Â 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> Â Â at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> #4 Â 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> Â Â i1=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> #5 Â 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> #6 Â 0x00007fff9ed75780 in ?? () -> -> #7 Â 0x0000000000000000 in ?? () -> -> -> -> It seems like qemu was completing a discard/write_zero request, but -> -> parent BusState was already freed & set to NULL. -> -> -> -> Do we need to drain all pending request before unrealizing virtio-blk -> -> device? Like the following patch proposed? -> -> -> -> -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> -> -> If more info is needed, please let me know. -> -> -may be this will help: -https://patchwork.kernel.org/patch/11213047/ -Yeah, this looks promising! I'll try it out (though it's a one-time -crash for me). Thanks! - -Eryu - -On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> On Tue, 31 Dec 2019 18:34:34 +0800 -> -> Eryu Guan <address@hidden> wrote: -> -> -> -> > Hi, -> -> > -> -> > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > think it's because io completion hits use-after-free when device is -> -> > already gone. Is this a known bug that has been fixed? (I went through -> -> > the git log but didn't find anything obvious). -> -> > -> -> > gdb backtrace is: -> -> > -> -> > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > Program terminated with signal 11, Segmentation fault. -> -> > #0 object_get_class (obj=obj@entry=0x0) at -> -> > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > 903 return obj->class; -> -> > (gdb) bt -> -> > #0 object_get_class (obj=obj@entry=0x0) at -> -> > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > #1 Â 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> > Â Â vector=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > #2 Â 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> > Â Â opaque=0x558a2f2fd420, ret=0) -> -> > Â Â at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > #3 Â 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > Â Â at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > #4 Â 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> > Â Â i1=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > #5 Â 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > #6 Â 0x00007fff9ed75780 in ?? () -> -> > #7 Â 0x0000000000000000 in ?? () -> -> > -> -> > It seems like qemu was completing a discard/write_zero request, but -> -> > parent BusState was already freed & set to NULL. -> -> > -> -> > Do we need to drain all pending request before unrealizing virtio-blk -> -> > device? Like the following patch proposed? -> -> > -> -> > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > -> -> > If more info is needed, please let me know. -> -> -> -> may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> -Yeah, this looks promising! I'll try it out (though it's a one-time -> -crash for me). Thanks! -After applying this patch, I don't see the original segfaut and -backtrace, but I see this crash - -[Thread debugging using libthread_db enabled] -Using host libthread_db library "/lib64/libthread_db.so.1". -Core was generated by `/usr/local/libexec/qemu-kvm -name -sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -Program terminated with signal 11, Segmentation fault. -#0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -addr=0, val=<optimized out>, size=<optimized out>) at -/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -1324 VirtIOPCIProxy *proxy = -VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -Missing separate debuginfos, use: debuginfo-install -glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -(gdb) bt -#0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -addr=0, val=<optimized out>, size=<optimized out>) at -/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -#1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>, -addr=<optimized out>, value=<optimized out>, size=<optimized out>, -shift=<optimized out>, mask=<optimized out>, attrs=...) at -/usr/src/debug/qemu-4.0/memory.c:502 -#2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, access_size_min=<optimized -out>, access_size_max=<optimized out>, access_fn=0x561216835ac0 -<memory_region_write_accessor>, mr=0x56121846d340, attrs=...) - at /usr/src/debug/qemu-4.0/memory.c:568 -#3 0x0000561216837c66 in memory_region_dispatch_write -(mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -#4 0x00005612167e036f in flatview_write_continue (fv=fv@entry=0x56121852edd0, -addr=addr@entry=841813602304, attrs=..., buf=buf@entry=0x7fce7dd97028 <Address -0x7fce7dd97028 out of bounds>, len=len@entry=2, addr1=<optimized out>, -l=<optimized out>, mr=0x56121846d340) - at /usr/src/debug/qemu-4.0/exec.c:3279 -#5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, addr=841813602304, -attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, len=2) at -/usr/src/debug/qemu-4.0/exec.c:3318 -#6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at -/usr/src/debug/qemu-4.0/exec.c:3408 -#7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, addr=<optimized -out>, attrs=..., attrs@entry=..., buf=buf@entry=0x7fce7dd97028 <Address -0x7fce7dd97028 out of bounds>, len=<optimized out>, is_write=<optimized out>) -at /usr/src/debug/qemu-4.0/exec.c:3419 -#8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at -/usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -#9 0x000056121682255e in qemu_kvm_cpu_thread_fn (arg=arg@entry=0x56121849aa00) -at /usr/src/debug/qemu-4.0/cpus.c:1281 -#10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -/usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -#11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -#12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 - -And I searched and found -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -bug. - -But I can still hit the bug even after applying the commit. Do I miss -anything? - -Thanks, -Eryu -> -Eryu - -On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> -On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > Eryu Guan <address@hidden> wrote: -> -> > -> -> > > Hi, -> -> > > -> -> > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > > think it's because io completion hits use-after-free when device is -> -> > > already gone. Is this a known bug that has been fixed? (I went through -> -> > > the git log but didn't find anything obvious). -> -> > > -> -> > > gdb backtrace is: -> -> > > -> -> > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > Program terminated with signal 11, Segmentation fault. -> -> > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > 903 return obj->class; -> -> > > (gdb) bt -> -> > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > #1 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> > > vector=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > #2 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> > > opaque=0x558a2f2fd420, ret=0) -> -> > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> > > i1=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > #6 0x00007fff9ed75780 in ?? () -> -> > > #7 0x0000000000000000 in ?? () -> -> > > -> -> > > It seems like qemu was completing a discard/write_zero request, but -> -> > > parent BusState was already freed & set to NULL. -> -> > > -> -> > > Do we need to drain all pending request before unrealizing virtio-blk -> -> > > device? Like the following patch proposed? -> -> > > -> -> > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > -> -> > > If more info is needed, please let me know. -> -> > -> -> > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> -> -> Yeah, this looks promising! I'll try it out (though it's a one-time -> -> crash for me). Thanks! -> -> -After applying this patch, I don't see the original segfaut and -> -backtrace, but I see this crash -> -> -[Thread debugging using libthread_db enabled] -> -Using host libthread_db library "/lib64/libthread_db.so.1". -> -Core was generated by `/usr/local/libexec/qemu-kvm -name -> -sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -Program terminated with signal 11, Segmentation fault. -> -#0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -addr=0, val=<optimized out>, size=<optimized out>) at -> -/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -1324 VirtIOPCIProxy *proxy = -> -VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -Missing separate debuginfos, use: debuginfo-install -> -glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -(gdb) bt -> -#0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -addr=0, val=<optimized out>, size=<optimized out>) at -> -/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -#1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>, -> -addr=<optimized out>, value=<optimized out>, size=<optimized out>, -> -shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -/usr/src/debug/qemu-4.0/memory.c:502 -> -#2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -access_size_min=<optimized out>, access_size_max=<optimized out>, -> -access_fn=0x561216835ac0 <memory_region_write_accessor>, mr=0x56121846d340, -> -attrs=...) -> -at /usr/src/debug/qemu-4.0/memory.c:568 -> -#3 0x0000561216837c66 in memory_region_dispatch_write -> -(mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -#4 0x00005612167e036f in flatview_write_continue -> -(fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -len=len@entry=2, addr1=<optimized out>, l=<optimized out>, mr=0x56121846d340) -> -at /usr/src/debug/qemu-4.0/exec.c:3279 -> -#5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 out -> -of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318 -> -#6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at -> -/usr/src/debug/qemu-4.0/exec.c:3408 -> -#7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -addr=<optimized out>, attrs=..., attrs@entry=..., -> -buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -len=<optimized out>, is_write=<optimized out>) at -> -/usr/src/debug/qemu-4.0/exec.c:3419 -> -#8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at -> -/usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -#9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -(arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -#10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -/usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -#11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -#12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> -And I searched and found -> -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -bug. -> -> -But I can still hit the bug even after applying the commit. Do I miss -> -anything? -Hi Eryu, -This backtrace seems to be caused by this bug (there were two bugs in -1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -Although the solution hasn't been tested on virtio-blk yet, you may -want to apply this patch: -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -Let me know if this works. - -Best regards, Julia Suvorova. - -On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> -> -> On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > Eryu Guan <address@hidden> wrote: -> -> > > -> -> > > > Hi, -> -> > > > -> -> > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > > > think it's because io completion hits use-after-free when device is -> -> > > > already gone. Is this a known bug that has been fixed? (I went through -> -> > > > the git log but didn't find anything obvious). -> -> > > > -> -> > > > gdb backtrace is: -> -> > > > -> -> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > Program terminated with signal 11, Segmentation fault. -> -> > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > 903 return obj->class; -> -> > > > (gdb) bt -> -> > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > #1 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> > > > vector=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > #2 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> > > > i1=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > #7 0x0000000000000000 in ?? () -> -> > > > -> -> > > > It seems like qemu was completing a discard/write_zero request, but -> -> > > > parent BusState was already freed & set to NULL. -> -> > > > -> -> > > > Do we need to drain all pending request before unrealizing virtio-blk -> -> > > > device? Like the following patch proposed? -> -> > > > -> -> > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > -> -> > > > If more info is needed, please let me know. -> -> > > -> -> > > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> > -> -> > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > crash for me). Thanks! -> -> -> -> After applying this patch, I don't see the original segfaut and -> -> backtrace, but I see this crash -> -> -> -> [Thread debugging using libthread_db enabled] -> -> Using host libthread_db library "/lib64/libthread_db.so.1". -> -> Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> Program terminated with signal 11, Segmentation fault. -> -> #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> addr=0, val=<optimized out>, size=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> 1324 VirtIOPCIProxy *proxy = -> -> VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> Missing separate debuginfos, use: debuginfo-install -> -> glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> (gdb) bt -> -> #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> addr=0, val=<optimized out>, size=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>, -> -> addr=<optimized out>, value=<optimized out>, size=<optimized out>, -> -> shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> /usr/src/debug/qemu-4.0/memory.c:502 -> -> #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -> value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -> access_size_min=<optimized out>, access_size_max=<optimized out>, -> -> access_fn=0x561216835ac0 <memory_region_write_accessor>, mr=0x56121846d340, -> -> attrs=...) -> -> at /usr/src/debug/qemu-4.0/memory.c:568 -> -> #3 0x0000561216837c66 in memory_region_dispatch_write -> -> (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> #4 0x00005612167e036f in flatview_write_continue -> -> (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> mr=0x56121846d340) -> -> at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 -> -> out of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318 -> -> #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) -> -> at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> addr=<optimized out>, attrs=..., attrs@entry=..., -> -> buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> len=<optimized out>, is_write=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/exec.c:3419 -> -> #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at -> -> /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> -> -> And I searched and found -> -> -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -> backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> bug. -> -> -> -> But I can still hit the bug even after applying the commit. Do I miss -> -> anything? -> -> -Hi Eryu, -> -This backtrace seems to be caused by this bug (there were two bugs in -> -1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -Although the solution hasn't been tested on virtio-blk yet, you may -> -want to apply this patch: -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -Let me know if this works. -Will try it out, thanks a lot! - -Eryu - -On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> -> -> On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > Eryu Guan <address@hidden> wrote: -> -> > > -> -> > > > Hi, -> -> > > > -> -> > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > > > think it's because io completion hits use-after-free when device is -> -> > > > already gone. Is this a known bug that has been fixed? (I went through -> -> > > > the git log but didn't find anything obvious). -> -> > > > -> -> > > > gdb backtrace is: -> -> > > > -> -> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > Program terminated with signal 11, Segmentation fault. -> -> > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > 903 return obj->class; -> -> > > > (gdb) bt -> -> > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > #1 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> > > > vector=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > #2 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> > > > i1=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > #7 0x0000000000000000 in ?? () -> -> > > > -> -> > > > It seems like qemu was completing a discard/write_zero request, but -> -> > > > parent BusState was already freed & set to NULL. -> -> > > > -> -> > > > Do we need to drain all pending request before unrealizing virtio-blk -> -> > > > device? Like the following patch proposed? -> -> > > > -> -> > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > -> -> > > > If more info is needed, please let me know. -> -> > > -> -> > > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> > -> -> > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > crash for me). Thanks! -> -> -> -> After applying this patch, I don't see the original segfaut and -> -> backtrace, but I see this crash -> -> -> -> [Thread debugging using libthread_db enabled] -> -> Using host libthread_db library "/lib64/libthread_db.so.1". -> -> Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> Program terminated with signal 11, Segmentation fault. -> -> #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> addr=0, val=<optimized out>, size=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> 1324 VirtIOPCIProxy *proxy = -> -> VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> Missing separate debuginfos, use: debuginfo-install -> -> glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> (gdb) bt -> -> #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> addr=0, val=<optimized out>, size=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>, -> -> addr=<optimized out>, value=<optimized out>, size=<optimized out>, -> -> shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> /usr/src/debug/qemu-4.0/memory.c:502 -> -> #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -> value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -> access_size_min=<optimized out>, access_size_max=<optimized out>, -> -> access_fn=0x561216835ac0 <memory_region_write_accessor>, mr=0x56121846d340, -> -> attrs=...) -> -> at /usr/src/debug/qemu-4.0/memory.c:568 -> -> #3 0x0000561216837c66 in memory_region_dispatch_write -> -> (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> #4 0x00005612167e036f in flatview_write_continue -> -> (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> mr=0x56121846d340) -> -> at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 -> -> out of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318 -> -> #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) -> -> at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> addr=<optimized out>, attrs=..., attrs@entry=..., -> -> buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> len=<optimized out>, is_write=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/exec.c:3419 -> -> #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at -> -> /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> -> -> And I searched and found -> -> -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -> backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> bug. -> -> -> -> But I can still hit the bug even after applying the commit. Do I miss -> -> anything? -> -> -Hi Eryu, -> -This backtrace seems to be caused by this bug (there were two bugs in -> -1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -Although the solution hasn't been tested on virtio-blk yet, you may -> -want to apply this patch: -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -Let me know if this works. -Unfortunately, I still see the same segfault & backtrace after applying -commit 421afd2fe8dd ("virtio: reset region cache when on queue -deletion") - -Anything I can help to debug? - -Thanks, -Eryu - -On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > -> -> > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > Eryu Guan <address@hidden> wrote: -> -> > > > -> -> > > > > Hi, -> -> > > > > -> -> > > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, -> -> > > > > I -> -> > > > > think it's because io completion hits use-after-free when device is -> -> > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > through -> -> > > > > the git log but didn't find anything obvious). -> -> > > > > -> -> > > > > gdb backtrace is: -> -> > > > > -> -> > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > 903 return obj->class; -> -> > > > > (gdb) bt -> -> > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > #1 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> > > > > vector=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > #2 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> > > > > i1=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > #7 0x0000000000000000 in ?? () -> -> > > > > -> -> > > > > It seems like qemu was completing a discard/write_zero request, but -> -> > > > > parent BusState was already freed & set to NULL. -> -> > > > > -> -> > > > > Do we need to drain all pending request before unrealizing -> -> > > > > virtio-blk -> -> > > > > device? Like the following patch proposed? -> -> > > > > -> -> > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > -> -> > > > > If more info is needed, please let me know. -> -> > > > -> -> > > > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> > > -> -> > > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > > crash for me). Thanks! -> -> > -> -> > After applying this patch, I don't see the original segfaut and -> -> > backtrace, but I see this crash -> -> > -> -> > [Thread debugging using libthread_db enabled] -> -> > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > Program terminated with signal 11, Segmentation fault. -> -> > #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> > addr=0, val=<optimized out>, size=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > 1324 VirtIOPCIProxy *proxy = -> -> > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > Missing separate debuginfos, use: debuginfo-install -> -> > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> > (gdb) bt -> -> > #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, -> -> > addr=0, val=<optimized out>, size=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized -> -> > out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, -> -> > shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> > /usr/src/debug/qemu-4.0/memory.c:502 -> -> > #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -> > value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -> > access_size_min=<optimized out>, access_size_max=<optimized out>, -> -> > access_fn=0x561216835ac0 <memory_region_write_accessor>, -> -> > mr=0x56121846d340, attrs=...) -> -> > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > #4 0x00005612167e036f in flatview_write_continue -> -> > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> > mr=0x56121846d340) -> -> > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 -> -> > out of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized -> -> > out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > len=<optimized out>, is_write=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at -> -> > /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > -> -> > And I searched and found -> -> > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -> > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> > bug. -> -> > -> -> > But I can still hit the bug even after applying the commit. Do I miss -> -> > anything? -> -> -> -> Hi Eryu, -> -> This backtrace seems to be caused by this bug (there were two bugs in -> -> 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> Although the solution hasn't been tested on virtio-blk yet, you may -> -> want to apply this patch: -> -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> Let me know if this works. -> -> -Unfortunately, I still see the same segfault & backtrace after applying -> -commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -deletion") -> -> -Anything I can help to debug? -Please post the QEMU command-line and the QMP commands use to remove the -device. - -The backtrace shows a vcpu thread submitting a request. The device -seems to be partially destroyed. That's surprising because the monitor -and the vcpu thread should use the QEMU global mutex to avoid race -conditions. Maybe seeing the QMP commands will make it clearer... - -Stefan -signature.asc -Description: -PGP signature - -On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > -> -> > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > -> -> > > > > > Hi, -> -> > > > > > -> -> > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > sandbox, I -> -> > > > > > think it's because io completion hits use-after-free when device -> -> > > > > > is -> -> > > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > > through -> -> > > > > > the git log but didn't find anything obvious). -> -> > > > > > -> -> > > > > > gdb backtrace is: -> -> > > > > > -> -> > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > 903 return obj->class; -> -> > > > > > (gdb) bt -> -> > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > vector=<optimized out>) at -> -> > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > out>, -> -> > > > > > i1=<optimized out>) at -> -> > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > -> -> > > > > > It seems like qemu was completing a discard/write_zero request, -> -> > > > > > but -> -> > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > -> -> > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > virtio-blk -> -> > > > > > device? Like the following patch proposed? -> -> > > > > > -> -> > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > -> -> > > > > > If more info is needed, please let me know. -> -> > > > > -> -> > > > > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> > > > -> -> > > > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > > > crash for me). Thanks! -> -> > > -> -> > > After applying this patch, I don't see the original segfaut and -> -> > > backtrace, but I see this crash -> -> > > -> -> > > [Thread debugging using libthread_db enabled] -> -> > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > Program terminated with signal 11, Segmentation fault. -> -> > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > 1324 VirtIOPCIProxy *proxy = -> -> > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > Missing separate debuginfos, use: debuginfo-install -> -> > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> > > (gdb) bt -> -> > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized -> -> > > out>, addr=<optimized out>, value=<optimized out>, size=<optimized -> -> > > out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> > > /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -> > > value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -> > > access_size_min=<optimized out>, access_size_max=<optimized out>, -> -> > > access_fn=0x561216835ac0 <memory_region_write_accessor>, -> -> > > mr=0x56121846d340, attrs=...) -> -> > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > #4 0x00005612167e036f in flatview_write_continue -> -> > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> > > mr=0x56121846d340) -> -> > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > len=<optimized out>, is_write=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) -> -> > > at /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > -> -> > > And I searched and found -> -> > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -> > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> > > bug. -> -> > > -> -> > > But I can still hit the bug even after applying the commit. Do I miss -> -> > > anything? -> -> > -> -> > Hi Eryu, -> -> > This backtrace seems to be caused by this bug (there were two bugs in -> -> > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > want to apply this patch: -> -> > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > Let me know if this works. -> -> -> -> Unfortunately, I still see the same segfault & backtrace after applying -> -> commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> deletion") -> -> -> -> Anything I can help to debug? -> -> -Please post the QEMU command-line and the QMP commands use to remove the -> -device. -It's a normal kata instance using virtio-fs as rootfs. - -/usr/local/libexec/qemu-kvm -name -sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ - -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ - -cpu host -qmp -unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait - \ - -qmp -unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait - \ - -m 2048M,slots=10,maxmem=773893M -device -pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ - -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -virtconsole,chardev=charconsole0,id=console0 \ - -chardev -socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait - \ - -device -virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 \ - -chardev -socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait - \ - -device nvdimm,id=nv0,memdev=mem0 -object -memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 - \ - -object rng-random,id=rng0,filename=/dev/urandom -device -virtio-rng,rng=rng0,romfile= \ - -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ - -chardev -socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait - \ - -chardev -socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock - \ - -device -vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M --netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ - -device -driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= - \ - -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults --nographic -daemonize \ - -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on --numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ - -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 -i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 -console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 -root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro -rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=96 -agent.use_vsock=false init=/usr/lib/systemd/systemd -systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service -systemd.mask=systemd-networkd.socket \ - -pidfile -/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -\ - -smp 1,cores=1,threads=1,sockets=96,maxcpus=96 - -QMP command to delete device (the device id is just an example, not the -one caused the crash): - -"{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" - -which has been hot plugged by: -"{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -"{\"return\": {}}" -"{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -"{\"return\": {}}" - -> -> -The backtrace shows a vcpu thread submitting a request. The device -> -seems to be partially destroyed. That's surprising because the monitor -> -and the vcpu thread should use the QEMU global mutex to avoid race -> -conditions. Maybe seeing the QMP commands will make it clearer... -> -> -Stefan -Thanks! - -Eryu - -On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote: -> -On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -> On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > > -> -> > > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > > -> -> > > > > > > Hi, -> -> > > > > > > -> -> > > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > > sandbox, I -> -> > > > > > > think it's because io completion hits use-after-free when -> -> > > > > > > device is -> -> > > > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > > > through -> -> > > > > > > the git log but didn't find anything obvious). -> -> > > > > > > -> -> > > > > > > gdb backtrace is: -> -> > > > > > > -> -> > > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > 903 return obj->class; -> -> > > > > > > (gdb) bt -> -> > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > > vector=<optimized out>) at -> -> > > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > > out>, -> -> > > > > > > i1=<optimized out>) at -> -> > > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > > -> -> > > > > > > It seems like qemu was completing a discard/write_zero request, -> -> > > > > > > but -> -> > > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > > -> -> > > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > > virtio-blk -> -> > > > > > > device? Like the following patch proposed? -> -> > > > > > > -> -> > > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > > -> -> > > > > > > If more info is needed, please let me know. -> -> > > > > > -> -> > > > > > may be this will help: -> -> > > > > > -https://patchwork.kernel.org/patch/11213047/ -> -> > > > > -> -> > > > > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > > > > crash for me). Thanks! -> -> > > > -> -> > > > After applying this patch, I don't see the original segfaut and -> -> > > > backtrace, but I see this crash -> -> > > > -> -> > > > [Thread debugging using libthread_db enabled] -> -> > > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > > Program terminated with signal 11, Segmentation fault. -> -> > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > 1324 VirtIOPCIProxy *proxy = -> -> > > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > > Missing separate debuginfos, use: debuginfo-install -> -> > > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > > libstdc++-4.8.5-28.alios7.1.x86_64 -> -> > > > numactl-libs-2.0.9-5.1.alios7.x86_64 pixman-0.32.6-3.1.alios7.x86_64 -> -> > > > zlib-1.2.7-16.2.alios7.x86_64 -> -> > > > (gdb) bt -> -> > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized -> -> > > > out>, addr=<optimized out>, value=<optimized out>, size=<optimized -> -> > > > out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> > > > /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > > #2 0x0000561216833c5d in access_with_adjusted_size -> -> > > > (addr=addr@entry=0, value=value@entry=0x7fcdeab1b8a8, -> -> > > > size=size@entry=2, access_size_min=<optimized out>, -> -> > > > access_size_max=<optimized out>, access_fn=0x561216835ac0 -> -> > > > <memory_region_write_accessor>, mr=0x56121846d340, attrs=...) -> -> > > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > > #4 0x00005612167e036f in flatview_write_continue -> -> > > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > > len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> > > > mr=0x56121846d340) -> -> > > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > > len=<optimized out>, is_write=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) -> -> > > > at /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> > > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > > -> -> > > > And I searched and found -> -> > > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the -> -> > > > same -> -> > > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > > > blk_drain() to virtio_blk_device_unrealize()") is to fix this -> -> > > > particular -> -> > > > bug. -> -> > > > -> -> > > > But I can still hit the bug even after applying the commit. Do I miss -> -> > > > anything? -> -> > > -> -> > > Hi Eryu, -> -> > > This backtrace seems to be caused by this bug (there were two bugs in -> -> > > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > > want to apply this patch: -> -> > > -> -> > > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > > Let me know if this works. -> -> > -> -> > Unfortunately, I still see the same segfault & backtrace after applying -> -> > commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> > deletion") -> -> > -> -> > Anything I can help to debug? -> -> -> -> Please post the QEMU command-line and the QMP commands use to remove the -> -> device. -> -> -It's a normal kata instance using virtio-fs as rootfs. -> -> -/usr/local/libexec/qemu-kvm -name -> -sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ -> --uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -> -q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ -> --cpu host -qmp -> -unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -\ -> --qmp -> -unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -\ -> --m 2048M,slots=10,maxmem=773893M -device -> -pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ -> --device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -> -virtconsole,chardev=charconsole0,id=console0 \ -> --chardev -> -socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait -> -\ -> --device -> -virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 \ -> --chardev -> -socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait -> -\ -> --device nvdimm,id=nv0,memdev=mem0 -object -> -memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 -> -\ -> --object rng-random,id=rng0,filename=/dev/urandom -device -> -virtio-rng,rng=rng0,romfile= \ -> --device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ -> --chardev -> -socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait -> -\ -> --chardev -> -socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock -> -\ -> --device -> -vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M -> --netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ -> --device -> -driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= -> -\ -> --global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -> --nodefaults -nographic -daemonize \ -> --object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -> --numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ -> --append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 -> -i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k -> -console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 -> -pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro -> -ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=96 -> -agent.use_vsock=false init=/usr/lib/systemd/systemd -> -systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service -> -systemd.mask=systemd-networkd.socket \ -> --pidfile -> -/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -> -\ -> --smp 1,cores=1,threads=1,sockets=96,maxcpus=96 -> -> -QMP command to delete device (the device id is just an example, not the -> -one caused the crash): -> -> -"{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" -> -> -which has been hot plugged by: -> -"{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -> -"{\"return\": {}}" -> -"{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -> -"{\"return\": {}}" -Thanks. I wasn't able to reproduce this crash with qemu.git/master. - -One thing that is strange about the latest backtrace you posted: QEMU is -dispatching the memory access instead of using the ioeventfd code that -that virtio-blk-pci normally takes when a virtqueue is notified. I -guess this means ioeventfd has already been disabled due to the hot -unplug. - -Could you try with machine type "i440fx" instead of "q35"? I wonder if -pci-bridge/shpc is part of the problem. - -Stefan -signature.asc -Description: -PGP signature - -On Tue, Jan 14, 2020 at 04:16:24PM +0000, Stefan Hajnoczi wrote: -> -On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote: -> -> On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -> > On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> > > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > > > -> -> > > > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > > > -> -> > > > > > > > Hi, -> -> > > > > > > > -> -> > > > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > > > sandbox, I -> -> > > > > > > > think it's because io completion hits use-after-free when -> -> > > > > > > > device is -> -> > > > > > > > already gone. Is this a known bug that has been fixed? (I -> -> > > > > > > > went through -> -> > > > > > > > the git log but didn't find anything obvious). -> -> > > > > > > > -> -> > > > > > > > gdb backtrace is: -> -> > > > > > > > -> -> > > > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > > 903 return obj->class; -> -> > > > > > > > (gdb) bt -> -> > > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > > > vector=<optimized out>) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > > > #3 0x0000558a2c261c7e in blk_aio_complete -> -> > > > > > > > (acb=0x558a2eed7420) -> -> > > > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > > > out>, -> -> > > > > > > > i1=<optimized out>) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > > > -> -> > > > > > > > It seems like qemu was completing a discard/write_zero -> -> > > > > > > > request, but -> -> > > > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > > > -> -> > > > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > > > virtio-blk -> -> > > > > > > > device? Like the following patch proposed? -> -> > > > > > > > -> -> > > > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > > > -> -> > > > > > > > If more info is needed, please let me know. -> -> > > > > > > -> -> > > > > > > may be this will help: -> -> > > > > > > -https://patchwork.kernel.org/patch/11213047/ -> -> > > > > > -> -> > > > > > Yeah, this looks promising! I'll try it out (though it's a -> -> > > > > > one-time -> -> > > > > > crash for me). Thanks! -> -> > > > > -> -> > > > > After applying this patch, I don't see the original segfaut and -> -> > > > > backtrace, but I see this crash -> -> > > > > -> -> > > > > [Thread debugging using libthread_db enabled] -> -> > > > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, -> -> > > > > size=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > > 1324 VirtIOPCIProxy *proxy = -> -> > > > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > > > Missing separate debuginfos, use: debuginfo-install -> -> > > > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > > > libstdc++-4.8.5-28.alios7.1.x86_64 -> -> > > > > numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> > > > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> > > > > (gdb) bt -> -> > > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, -> -> > > > > size=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > > #1 0x0000561216835b22 in memory_region_write_accessor -> -> > > > > (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, -> -> > > > > size=<optimized out>, shift=<optimized out>, mask=<optimized out>, -> -> > > > > attrs=...) at /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > > > #2 0x0000561216833c5d in access_with_adjusted_size -> -> > > > > (addr=addr@entry=0, value=value@entry=0x7fcdeab1b8a8, -> -> > > > > size=size@entry=2, access_size_min=<optimized out>, -> -> > > > > access_size_max=<optimized out>, access_fn=0x561216835ac0 -> -> > > > > <memory_region_write_accessor>, mr=0x56121846d340, attrs=...) -> -> > > > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > > > #4 0x00005612167e036f in flatview_write_continue -> -> > > > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, -> -> > > > > attrs=..., buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out -> -> > > > > of bounds>, len=len@entry=2, addr1=<optimized out>, l=<optimized -> -> > > > > out>, mr=0x56121846d340) -> -> > > > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > > > addr=<optimized out>, attrs=..., buf=<optimized out>, -> -> > > > > len=<optimized out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of -> -> > > > > bounds>, len=<optimized out>, is_write=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > > > #8 0x0000561216849da1 in kvm_cpu_exec -> -> > > > > (cpu=cpu@entry=0x56121849aa00) at -> -> > > > > /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > > > (arg=arg@entry=0x56121849aa00) at -> -> > > > > /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) -> -> > > > > at /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > > > #11 0x00007fce7bef6e25 in start_thread () from -> -> > > > > /lib64/libpthread.so.0 -> -> > > > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > > > -> -> > > > > And I searched and found -> -> > > > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the -> -> > > > > same -> -> > > > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: -> -> > > > > Add -> -> > > > > blk_drain() to virtio_blk_device_unrealize()") is to fix this -> -> > > > > particular -> -> > > > > bug. -> -> > > > > -> -> > > > > But I can still hit the bug even after applying the commit. Do I -> -> > > > > miss -> -> > > > > anything? -> -> > > > -> -> > > > Hi Eryu, -> -> > > > This backtrace seems to be caused by this bug (there were two bugs in -> -> > > > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > > > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > > > want to apply this patch: -> -> > > > -> -> > > > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > > > Let me know if this works. -> -> > > -> -> > > Unfortunately, I still see the same segfault & backtrace after applying -> -> > > commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> > > deletion") -> -> > > -> -> > > Anything I can help to debug? -> -> > -> -> > Please post the QEMU command-line and the QMP commands use to remove the -> -> > device. -> -> -> -> It's a normal kata instance using virtio-fs as rootfs. -> -> -> -> /usr/local/libexec/qemu-kvm -name -> -> sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ -> -> -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -> -> q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ -> -> -cpu host -qmp -> -> unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -> \ -> -> -qmp -> -> unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -> \ -> -> -m 2048M,slots=10,maxmem=773893M -device -> -> pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ -> -> -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -> -> virtconsole,chardev=charconsole0,id=console0 \ -> -> -chardev -> -> socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait -> -> \ -> -> -device -> -> virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 -> -> \ -> -> -chardev -> -> socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait -> -> \ -> -> -device nvdimm,id=nv0,memdev=mem0 -object -> -> memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 -> -> \ -> -> -object rng-random,id=rng0,filename=/dev/urandom -device -> -> virtio-rng,rng=rng0,romfile= \ -> -> -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ -> -> -chardev -> -> socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait -> -> \ -> -> -chardev -> -> socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock -> -> \ -> -> -device -> -> vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M -> -> -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ -> -> -device -> -> driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= -> -> \ -> -> -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -> -> -nodefaults -nographic -daemonize \ -> -> -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -> -> -numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ -> -> -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 -> -> i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp -> -> reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests -> -> net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 -> -> rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 quiet -> -> systemd.show_status=false panic=1 nr_cpus=96 agent.use_vsock=false -> -> init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target -> -> systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket \ -> -> -pidfile -> -> /run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -> -> \ -> -> -smp 1,cores=1,threads=1,sockets=96,maxcpus=96 -> -> -> -> QMP command to delete device (the device id is just an example, not the -> -> one caused the crash): -> -> -> -> "{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" -> -> -> -> which has been hot plugged by: -> -> "{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -> -> "{\"return\": {}}" -> -> "{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -> -> "{\"return\": {}}" -> -> -Thanks. I wasn't able to reproduce this crash with qemu.git/master. -> -> -One thing that is strange about the latest backtrace you posted: QEMU is -> -dispatching the memory access instead of using the ioeventfd code that -> -that virtio-blk-pci normally takes when a virtqueue is notified. I -> -guess this means ioeventfd has already been disabled due to the hot -> -unplug. -> -> -Could you try with machine type "i440fx" instead of "q35"? I wonder if -> -pci-bridge/shpc is part of the problem. -Sure, will try it. But it may take some time, as the test bed is busy -with other testing tasks. I'll report back once I got the results. - -Thanks, -Eryu - |