| author | Christian Krinitsin <mail@krinitsin.com> | 2025-06-03 12:04:13 +0000 |
|---|---|---|
| committer | Christian Krinitsin <mail@krinitsin.com> | 2025-06-03 12:04:13 +0000 |
| commit | 256709d2eb3fd80d768a99964be5caa61effa2a0 (patch) | |
| tree | 05b2352fba70923126836a64b6a0de43902e976a /results/classifier/02 | |
| parent | 2ab14fa96a6c5484b5e4ba8337551bb8dcc79cc5 (diff) | |
| download | qemu-analysis-256709d2eb3fd80d768a99964be5caa61effa2a0.tar.gz qemu-analysis-256709d2eb3fd80d768a99964be5caa61effa2a0.zip | |
add new classifier result
Diffstat (limited to 'results/classifier/02')
89 files changed, 0 insertions, 59542 deletions
diff --git a/results/classifier/02/boot/42226390 b/results/classifier/02/boot/42226390
deleted file mode 100644
index 516d9a4b3..000000000

boot: 0.943
instruction: 0.925
semantic: 0.924
other: 0.894
mistranslation: 0.826

[BUG] AArch64 boot hang with -icount and -smp >1 (iothread locking issue?)

Hello,

I am encountering one or more bugs when using -icount and -smp >1 that I am attempting to sort out. My current theory is that it is an iothread locking issue.

I am using a command line like the following, where $kernel is a recent upstream AArch64 Linux kernel Image (I can provide a binary if that would be helpful - let me know how is best to post):

    qemu-system-aarch64 \
        -M virt -cpu cortex-a57 -m 1G \
        -nographic \
        -smp 2 \
        -icount 0 \
        -kernel $kernel

For any/all of the symptoms described below, they seem to disappear when I either remove `-icount 0` or change smp to `-smp 1`. In other words, it is the combination of `-smp >1` and `-icount` which triggers what I'm seeing.

I am seeing two different (but seemingly related) behaviors. The first (and what I originally started debugging) shows up as a boot hang. When booting using the above command after Peter's "icount: Take iothread lock when running QEMU timers" patch [1], the kernel boots for a while and then hangs after:

> ...snip...
> [    0.010764] Serial: AMBA PL011 UART driver
> [    0.016334] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 13, base_baud = 0) is a PL011 rev1
> [    0.016907] printk: console [ttyAMA0] enabled
> [    0.017624] KASLR enabled
> [    0.031986] HugeTLB: registered 16.0 GiB page size, pre-allocated 0 pages
> [    0.031986] HugeTLB: 16320 KiB vmemmap can be freed for a 16.0 GiB page
> [    0.031986] HugeTLB: registered 512 MiB page size, pre-allocated 0 pages
> [    0.031986] HugeTLB: 448 KiB vmemmap can be freed for a 512 MiB page
> [    0.031986] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
> [    0.031986] HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page

When it hangs here, I drop into QEMU's console, attach to the gdbserver, and it always reports that it is at address 0xffff800008dc42e8 (as shown below from an objdump of the vmlinux). I note this is in the middle of messing with timer system registers - which makes me suspect we're attempting to take the iothread lock when it's already held:

> ffff800008dc42b8 <arch_timer_set_next_event_virt>:
> ffff800008dc42b8:  d503201f  nop
> ffff800008dc42bc:  d503201f  nop
> ffff800008dc42c0:  d503233f  paciasp
> ffff800008dc42c4:  d53be321  mrs   x1, cntv_ctl_el0
> ffff800008dc42c8:  32000021  orr   w1, w1, #0x1
> ffff800008dc42cc:  d5033fdf  isb
> ffff800008dc42d0:  d53be042  mrs   x2, cntvct_el0
> ffff800008dc42d4:  ca020043  eor   x3, x2, x2
> ffff800008dc42d8:  8b2363e3  add   x3, sp, x3
> ffff800008dc42dc:  f940007f  ldr   xzr, [x3]
> ffff800008dc42e0:  8b020000  add   x0, x0, x2
> ffff800008dc42e4:  d51be340  msr   cntv_cval_el0, x0
> * ffff800008dc42e8:  927ef820  and   x0, x1, #0xfffffffffffffffd
> ffff800008dc42ec:  d51be320  msr   cntv_ctl_el0, x0
> ffff800008dc42f0:  d5033fdf  isb
> ffff800008dc42f4:  52800000  mov   w0, #0x0   // #0
> ffff800008dc42f8:  d50323bf  autiasp
> ffff800008dc42fc:  d65f03c0  ret

The second behavior is that prior to Peter's "icount: Take iothread lock when running QEMU timers" patch [1], I observe the following message (same command as above):

> ERROR:../accel/tcg/tcg-accel-ops.c:79:tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked())
> Aborted (core dumped)

This is the same behavior described in GitLab issue 1130 [0] and addressed by [1]. I bisected the appearance of this assertion, and found it was introduced by Pavel's "replay: rewrite async event handling" commit [2]. Commits prior to that one boot successfully (neither assertions nor hangs) with `-icount 0 -smp 2`.

I've looked over these two commits ([1], [2]), but it is not obvious to me how/why they might be interacting to produce the boot hangs I'm seeing and I welcome any help investigating further.

Thanks!

-Aaron Lindsay

[0] https://gitlab.com/qemu-project/qemu/-/issues/1130
[1] https://gitlab.com/qemu-project/qemu/-/commit/c7f26ded6d5065e4116f630f6a490b55f6c5f58e
[2] https://gitlab.com/qemu-project/qemu/-/commit/60618e2d77691e44bb78e23b2b0cf07b5c405e56

On Fri, 21 Oct 2022 at 16:48, Aaron Lindsay <aaron@os.amperecomputing.com> wrote:
>
> Hello,
>
> I am encountering one or more bugs when using -icount and -smp >1 that I am
> attempting to sort out. My current theory is that it is an iothread locking
> issue.

Weird coincidence, that is a bug that's been in the tree for months but was only reported to me earlier this week. Try reverting commit a82fd5a4ec24d923ff1e -- that should fix it.
https://lore.kernel.org/qemu-devel/CAFEAcA_i8x00hD-4XX18ySLNbCB6ds1-DSazVb4yDnF8skjd9A@mail.gmail.com/ has the explanation.

thanks
-- PMM

On Oct 21 17:00, Peter Maydell wrote:
> On Fri, 21 Oct 2022 at 16:48, Aaron Lindsay
> <aaron@os.amperecomputing.com> wrote:
> >
> > Hello,
> >
> > I am encountering one or more bugs when using -icount and -smp >1 that I am
> > attempting to sort out. My current theory is that it is an iothread locking
> > issue.
>
> Weird coincidence, that is a bug that's been in the tree for months
> but was only reported to me earlier this week. Try reverting
> commit a82fd5a4ec24d923ff1e -- that should fix it.

I can confirm that reverting a82fd5a4ec24d923ff1e fixes it for me. Thanks for the help and fast response!

-Aaron

diff --git a/results/classifier/02/boot/51610399 b/results/classifier/02/boot/51610399
deleted file mode 100644
index d63d68255..000000000

boot: 0.986
instruction: 0.985
other: 0.985
semantic: 0.984
mistranslation: 0.983

[BUG][powerpc] KVM Guest Boot Failure - Hangs at "Booting Linux via __start()"

Bug Description:
Encountering a boot failure when launching a KVM guest with qemu-system-ppc64. The guest hangs at boot, and the QEMU monitor crashes.

Reproduction Steps:
# qemu-system-ppc64 --version
QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f)
Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers

# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine pseries,accel=kvm \
    -m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2 \
    -device scsi-hd,drive=drive0,bus=scsi.0 \
    -netdev bridge,id=net0,br=virbr0 \
    -device virtio-net-pci,netdev=net0 \
    -serial pty \
    -device virtio-balloon-pci \
    -cpu host
QEMU 9.2.50 monitor - type 'help' for more information
char device redirected to /dev/pts/2 (label serial0)
(qemu)
(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: IRQ_XIVE capability must be present for KVM
Falling back to kernel-irqchip=off
** Qemu Hang

(In another ssh session)
# screen /dev/pts/2
Preparing to boot Linux version 6.10.4-200.fc40.ppc64le
(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801 (Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11 15:20:17 UTC 2024
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000008200000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000800000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000800000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000008210000 -> 0x0000000008210bd0
Device tree struct 0x0000000008220000 -> 0x0000000008230000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000440000 ...
** Guest Console Hang

Git Bisect:
Performing git bisect points to the following patch:
# git bisect bad
e8291ec16da80566c121c68d9112be458954d90b is the first bad commit
commit e8291ec16da80566c121c68d9112be458954d90b (HEAD)
Author: Nicholas Piggin <npiggin@gmail.com>
Date:   Thu Dec 19 13:40:31 2024 +1000

    target/ppc: fix timebase register reset state

    (H)DEC and PURR get reset before icount does, which causes them to be
    skewed and not match the init state. This can cause replay to not
    match the recorded trace exactly. For DEC and HDEC this is usually not
    noticable since they tend to get programmed before affecting the
    target machine. PURR has been observed to cause replay bugs when
    running Linux.

    Fix this by resetting using a time of 0.

    Message-ID: <20241219034035.1826173-2-npiggin@gmail.com>
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

 hw/ppc/ppc.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Reverting the patch helps boot the guest.
Thanks,
Misbah Anjum N

Thanks for the report.

Tricky problem. A secondary CPU is hanging before it is started by the primary via rtas call.

That secondary keeps calling kvm_cpu_exec(), which keeps exiting out early with EXCP_HLT because kvm_arch_process_async_events() returns true because that cpu has ->halted=1. That just goes around the run loop because there is an interrupt pending (DEC).

So it never runs. It also never releases the BQL, and another CPU, the primary which is actually supposed to be running, is stuck in spapr_set_all_lpcrs() in run_on_cpu() waiting for the BQL.

This patch just exposes the bug I think, by causing the interrupt, although I'm not quite sure why it's okay previously (-ve decrementer values should be causing a timer exception too). The timer exception should not be taken as an interrupt by those secondary CPUs, and it doesn't because it is masked, until set_all_lpcrs sets an LPCR value that enables powersave wakeup on decrementer interrupt.

The start_powered_off state just sets ->halted, which makes it look like a powersaving state. Logically I think it's not the same thing as far as spapr goes. I don't know why start_powered_off only sets ->halted, and not ->stop/stopped as well.

Not sure how best to solve it cleanly. I'll send a revert if I can't get something working soon.

Thanks,
Nick

On Tue Mar 18, 2025 at 7:09 AM AEST, misanjum wrote:
> Bug Description:
> Encountering a boot failure when launching a KVM guest with
> qemu-system-ppc64. The guest hangs at boot, and the QEMU monitor
> crashes.
> [...]
> Reverting the patch helps boot the guest.
> Thanks,
> Misbah Anjum N

diff --git a/results/classifier/02/boot/60339453 b/results/classifier/02/boot/60339453
deleted file mode 100644
index f42e1fe05..000000000

boot: 0.782
other: 0.776
instruction: 0.713
mistranslation: 0.699
semantic: 0.662

[BUG] scsi: vmw_pvscsi: Boot hangs during scsi under qemu, post commit e662502b3a78

Hi,

Commit e662502b3a78 ("scsi: vmw_pvscsi: Set correct residual data length"), and its backports to stable trees, makes the kernel hang during boot when run as a VM under qemu with the following parameters:

    -drive file=$DISKFILE,if=none,id=sda
    -device pvscsi
    -device scsi-hd,bus=scsi.0,drive=sda

Diving deeper, commit e662502b3a78

@@ -585,7 +585,13 @@ static void pvscsi_complete_request(struct pvscsi_adapter *adapter,
     case BTSTAT_SUCCESS:
+        /*
+         * Commands like INQUIRY may transfer less data than
+         * requested by the initiator via bufflen. Set residual
+         * count to make upper layer aware of the actual amount
+         * of data returned.
+         */
+        scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen);

assumes 'e->dataLen' is properly armed with the actual number of bytes transferred; alas qemu's hw/scsi/vmw_pvscsi.c never arms the 'dataLen' field of the completion descriptor (kept zero).

As a result, the residual count is set as the *entire* 'scsi_bufflen' of a good transfer, which makes upper scsi layers repeatedly ignore this valid transfer.

Not properly arming 'dataLen' seems an oversight in qemu, which needs to be fixed.

However, since kernels with commit e662502b3a78 (and backports) now fail to boot under qemu's "-device pvscsi", a suggested workaround is to set the residual count *only* if 'e->dataLen' is armed, e.g.:

@@ -588,7 +588,8 @@ static void pvscsi_complete_request(struct pvscsi_adapter *adapter,
          * count to make upper layer aware of the actual amount
          * of data returned.
          */
-        scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen);
+        if (e->dataLen)
+            scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen);

in order to make kernels boot on old qemu binaries.

Best,
Shmulik

diff --git a/results/classifier/02/boot/67821138 b/results/classifier/02/boot/67821138
deleted file mode 100644
index f8b22ef20..000000000

boot: 0.881
other: 0.853
semantic: 0.843
instruction: 0.821
mistranslation: 0.768

[BUG, RFC] Base node is in RW after making external snapshot

Hi everyone,

When making an external snapshot, we end up in a situation when 2 block graph nodes related to the same image file (format and storage nodes) have different RO flags set on them.

E.g.

# ls -la /proc/PID/fd
lrwx------ 1 root qemu 64 Apr 24 20:14 12 -> /path/to/harddisk.hdd

# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}' --pretty | egrep '"node-name"|"ro"'
    "ro": false,
    "node-name": "libvirt-1-format",
    "ro": false,
    "node-name": "libvirt-1-storage",

# virsh snapshot-create-as VM --name snap --disk-only
Domain snapshot snap created

# ls -la /proc/PID/fd
lr-x------ 1 root qemu 64 Apr 24 20:14 134 -> /path/to/harddisk.hdd
lrwx------ 1 root qemu 64 Apr 24 20:14 135 -> /path/to/harddisk.snap

# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}' --pretty | egrep '"node-name"|"ro"'
    "ro": false,
    "node-name": "libvirt-2-format",
    "ro": false,
    "node-name": "libvirt-2-storage",
    "ro": true,
    "node-name": "libvirt-1-format",
    "ro": false,      <--------------
    "node-name": "libvirt-1-storage",

File descriptor has been reopened in RO, but "libvirt-1-storage" node still has RW permissions set.

I'm wondering if this is a bug or if it is intended? Looks like a bug to me, although I see that some iotests (e.g. 273) expect 2 nodes related to the same image file to have different RO flags.
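As an illustration only (plain Python, not QEMU code; node names and structure are simplified assumptions), the asymmetry can be sketched with a toy reopen queue that, like the behavior analyzed later in this thread, skips children that did not inherit their options from the parent:

```python
# Toy model of the reopen behavior discussed in this thread (not QEMU code).
# Node names mirror the report; the queueing rule is a simplification.

class Node:
    def __init__(self, name, inherits_from=None):
        self.name = name
        self.ro = False                  # starts read-write
        self.inherits_from = inherits_from
        self.children = []

def reopen_queue(node, queue):
    """Queue a node for reopen; children are queued only if they
    inherited their options from this parent (cf. inherits_from)."""
    queue.append(node)
    for child in node.children:
        if child.inherits_from is node:  # skipped when inherits_from is None
            reopen_queue(child, queue)
    return queue

def reopen_set_read_only(node):
    for n in reopen_queue(node, []):
        n.ro = True

fmt = Node("libvirt-1-format")
storage = Node("libvirt-1-storage")      # inherits_from left as None
fmt.children.append(storage)

reopen_set_read_only(fmt)
print(fmt.ro, storage.ro)                # the child keeps ro=False
```

In this toy model only the format node flips to read-only, mirroring the query-named-block-nodes output above.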
bdrv_reopen_set_read_only()
bdrv_reopen()
bdrv_reopen_queue()
bdrv_reopen_queue_child()
bdrv_reopen_multiple()
bdrv_list_refresh_perms()
bdrv_topological_dfs()
bdrv_do_refresh_perms()
bdrv_reopen_commit()

In the stack above bdrv_reopen_set_read_only() is only being called for the parent (libvirt-1-format) node. There're 2 lists: BDSs from refresh_list are used by bdrv_drv_set_perm, and this leads to actual reopen with RO of the file descriptor. And then there's the reopen queue bs_queue -- BDSs from this queue get their parameters updated. While refresh_list ends up having the whole subtree (including children; this is done in bdrv_topological_dfs()), bs_queue only has the parent. And that is because the storage (child) node's bs->inherits_from == NULL, so bdrv_reopen_queue_child() never adds it to the queue. Could it be the source of this bug?

Anyway, would greatly appreciate a clarification.

Andrey

On 4/24/24 21:00, Andrey Drobyshev wrote:
> Hi everyone,
>
> When making an external snapshot, we end up in a situation when 2 block
> graph nodes related to the same image file (format and storage nodes)
> have different RO flags set on them.
> [...]
> Anyway, would greatly appreciate a clarification.
>
> Andrey

Friendly ping. Could somebody confirm that it is a bug indeed?

diff --git a/results/classifier/02/instruction/11357571 b/results/classifier/02/instruction/11357571
deleted file mode 100644
index d281760db..000000000

instruction: 0.758
semantic: 0.694
other: 0.687
boot: 0.571
mistranslation: 0.516

[Qemu-devel] [BUG] VNC: client won't send FramebufferUpdateRequest if job in flight is aborted

Hi Gerd, Daniel.

We noticed that if VncSharePolicy was configured with VNC_SHARE_POLICY_FORCE_SHARED mode and multiple vnc clients opened vnc connections, some clients could go blank screen at high probability. This problem can be reproduced when we regularly reboot suse12sp3 in graphic mode, both with the RealVNC and the noVNC client.

Then we dug into it and found out that some clients go blank screen because they don't send FramebufferUpdateRequest any more. One step further, we noticed that each time the job in flight is aborted, one client goes blank screen.

The bug is triggered in the following procedure:
Guest reboot => graphic mode switch => graphic_hw_update => vga_update_display => vga_draw_graphic (full_update = 1) => dpy_gfx_replace_surface => vnc_dpy_switch => vnc_abort_display_jobs (client may have job in flight) => job removed from the queue

If one client has a vnc job in flight, *vnc_abort_display_jobs* will wait until its job is abandoned. This behavior is done in vnc_worker_thread_loop when the 'if (job->vs->ioc == NULL || job->vs->abort == true)' branch is taken.
As we can see, *vnc_abort_display_jobs* is intended to do some optimization to avoid unnecessary client updates. But if a client sends FramebufferUpdateRequest for some graphic area and its FramebufferUpdate response job is abandoned, the client may wait for the response and never send a new FramebufferUpdateRequest, which may cause the client to go blank screen forever.

So I am wondering whether we should drop the *vnc_abort_display_jobs* optimization or do some trick here to push the client to send a new FramebufferUpdateRequest. Do you have any idea?

diff --git a/results/classifier/02/instruction/11933524 b/results/classifier/02/instruction/11933524
deleted file mode 100644
index 2be63572b..000000000

instruction: 0.775
other: 0.771
boot: 0.743
mistranslation: 0.719
semantic: 0.673

[BUG] hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios

Early-boot e820 records will be inserted by the bios/efi/early boot software and be reported to the kernel via insert_resource. Later, when CXL drivers iterate through the regions again, they will insert another resource and make the RESERVED memory area a child.

This RESERVED memory area causes the memory region to become unusable, and as a result attempting to create memory regions with

    `cxl create-region ...`

will fail due to the RESERVED area intersecting with the CXL window.
During boot the following traceback is observed:

0xffffffff81101650 in insert_resource_expand_to_fit ()
0xffffffff83d964c5 in e820__reserve_resources_late ()
0xffffffff83e03210 in pcibios_resource_survey ()
0xffffffff83e04f4a in pcibios_init ()

Which produces a call to reserve the CFMWS area:

(gdb) p *new
$54 = {start = 0x290000000, end = 0x2cfffffff, name = "Reserved",
  flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0, child = 0x0}

Later the kernel parses ACPI tables and reserves the exact same area as the CXL Fixed Memory Window. The use of `insert_resource_conflict` retains the RESERVED region and makes it a child of the new region.

0xffffffff811016a4 in insert_resource_conflict ()
  insert_resource ()
0xffffffff81a81389 in cxl_parse_cfmws ()
0xffffffff818c4a81 in call_handler ()
  acpi_parse_entries_array ()

(gdb) p/x *new
$59 = {start = 0x290000000, end = 0x2cfffffff, name = "CXL Window 0",
  flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0, child = 0x0}

This produces the following output in /proc/iomem:

590000000-68fffffff : CXL Window 0
  590000000-68fffffff : Reserved

This reserved area causes `get_free_mem_region()` to fail due to a check against `__region_intersects()`. Due to this reserved area, the intersect check will only ever return REGION_INTERSECTS, which causes `cxl create-region` to always fail.
Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 hw/i386/pc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 566accf7e6..5bf5465a21 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms,
         hwaddr cxl_size = MiB;

         cxl_base = pc_get_cxl_range_start(pcms);
-        e820_add_entry(cxl_base, cxl_size, E820_RESERVED);
         memory_region_init(mr, OBJECT(machine), "cxl_host_reg", cxl_size);
         memory_region_add_subregion(system_memory, cxl_base, mr);
         cxl_resv_end = cxl_base + cxl_size;
@@ -1077,7 +1076,6 @@ void pc_memory_init(PCMachineState *pcms,
                 memory_region_init_io(&fw->mr, OBJECT(machine), &cfmws_ops, fw,
                                       "cxl-fixed-memory-region", fw->size);
                 memory_region_add_subregion(system_memory, fw->base, &fw->mr);
-                e820_add_entry(fw->base, fw->size, E820_RESERVED);
                 cxl_fmw_base += fw->size;
                 cxl_resv_end = cxl_fmw_base;
         }
--
2.37.3

> Early-boot e820 records will be inserted by the bios/efi/early boot
> software and be reported to the kernel via insert_resource.
> [...]
> @@ -1077,7 +1076,6 @@ void pc_memory_init(PCMachineState *pcms,
>                  memory_region_init_io(&fw->mr, OBJECT(machine), &cfmws_ops, fw,
>                                        "cxl-fixed-memory-region", fw->size);
>                  memory_region_add_subregion(system_memory, fw->base, &fw->mr);

Or will this be subregion of cxl_base?

Thanks,
Pankaj

> -                e820_add_entry(fw->base, fw->size, E820_RESERVED);
>                  cxl_fmw_base += fw->size;
>                  cxl_resv_end = cxl_fmw_base;
>          }
>
> Or will this be subregion of cxl_base?
>
> Thanks,
> Pankaj

The memory region backing this memory area still has to be initialized and added in the QEMU system, but it will now be initialized for use by linux after PCI/ACPI setup occurs and the CXL driver discovers it via CDAT.

It's also still possible to assign this area a static memory region at boot by setting up the SRATs in the ACPI tables, but that patch is not upstream yet.
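For illustration only, a toy model (plain Python, not kernel code; addresses and sizes are made up) of why a Reserved child spanning the whole CXL window makes the free-region search fail: any candidate range inside the window also overlaps the child, so no range is ever considered free.

```python
# Toy model (not kernel code) of the get_free_mem_region() failure
# described in this thread. Addresses are illustrative.

def region_intersects(start, end, resources):
    """Return True if [start, end) overlaps any (s, e) resource."""
    return any(s < end and start < e for s, e in resources)

# The CXL window and the "Reserved" child covering it, as in /proc/iomem.
window = (0x590000000, 0x690000000)
reserved_children = [(0x590000000, 0x690000000)]

def get_free_mem_region(size):
    """Scan the window for a free range of `size` bytes, a greatly
    simplified stand-in for the kernel helper of the same name."""
    start, end = window
    addr = start
    while addr + size <= end:
        if not region_intersects(addr, addr + size, reserved_children):
            return addr
        addr += size
    return None  # every candidate hits the Reserved child

print(get_free_mem_region(0x10000000))  # None -> create-region fails
```

Dropping the Reserved child from the list (as removing the e820 entries effectively does) would let the first candidate range succeed.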
On Tue, Oct 18, 2022 at 5:14 AM Gregory Price <gourry.memverge@gmail.com> wrote:
>
> Early-boot e820 records will be inserted by the bios/efi/early boot
> software and be reported to the kernel via insert_resource. Later, when
> CXL drivers iterate through the regions again, they will insert another
> resource and make the RESERVED memory area a child.

I have already sent a patch:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg882012.html
When the patch is applied, there would not be any reserved entries even with passing E820_RESERVED. So this patch needs to be evaluated in the light of the above patch I sent. Once you apply my patch, does the issue still exist?

> [...]

This patch does not resolve the issue; reserved entries are still created.

[    0.000000] BIOS-e820: [mem 0x0000000280000000-0x00000002800fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000290000000-0x000000029fffffff] reserved

# cat /proc/iomem
290000000-29fffffff : CXL Window 0
  290000000-29fffffff : Reserved

# cxl create-region -m -d decoder0.0 -w 1 -g 256 mem0
cxl region: create_region: region0: set_size failed: Numerical result out of range
cxl region: cmd_create_region: created 0 regions

On Tue, Oct 18, 2022 at 2:05 AM Ani Sinha <ani@anisinha.ca> wrote:
> On Tue, Oct 18, 2022 at 5:14 AM Gregory Price <gourry.memverge@gmail.com> wrote:
> >
> > Early-boot e820 records will be inserted by the bios/efi/early boot
> > software and be reported to the kernel via insert_resource. Later, when
> > CXL drivers iterate through the regions again, they will insert another
> > resource and make the RESERVED memory area a child.
>
> I have already sent a patch:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg882012.html
> When the patch is applied, there would not be any reserved entries
> even with passing E820_RESERVED.
> So this patch needs to be evaluated in the light of the above patch I
> sent. Once you apply my patch, does the issue still exist?
>
> > This RESERVED memory area causes the memory region to become unusable,
> > and as a result attempting to create memory regions with
> >
> >   `cxl create-region ...`
> >
> > Will fail due to the RESERVED area intersecting with the CXL window.
-> -> -> During boot the following traceback is observed: -> -> 0xffffffff81101650 in insert_resource_expand_to_fit () -> 0xffffffff83d964c5 in e820__reserve_resources_late () -> 0xffffffff83e03210 in pcibios_resource_survey () -> 0xffffffff83e04f4a in pcibios_init () -> -> Which produces a call to reserve the CFMWS area: -> -> (gdb) p *new -> $54 = {start = 0x290000000, end = 0x2cfffffff, name = "Reserved", ->    flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0, ->    child = 0x0} -> -> Later the Kernel parses ACPI tables and reserves the exact same area as -> the CXL Fixed Memory Window. The use of `insert_resource_conflict` -> retains the RESERVED region and makes it a child of the new region. -> -> 0xffffffff811016a4 in insert_resource_conflict () ->            insert_resource () -> 0xffffffff81a81389 in cxl_parse_cfmws () -> 0xffffffff818c4a81 in call_handler () ->            acpi_parse_entries_array () -> -> (gdb) p/x *new -> $59 = {start = 0x290000000, end = 0x2cfffffff, name = "CXL Window 0", ->    flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0, ->    child = 0x0} -> -> This produces the following output in /proc/iomem: -> -> 590000000-68fffffff : CXL Window 0 ->  590000000-68fffffff : Reserved -> -> This reserved area causes `get_free_mem_region()` to fail due to a check -> against `__region_intersects()`. Due to this reserved area, the -> intersect check will only ever return REGION_INTERSECTS, which causes -> `cxl create-region` to always fail. 
-> -> Signed-off-by: Gregory Price < -gregory.price@memverge.com -> -> --- -> hw/i386/pc.c | 2 -- -> 1 file changed, 2 deletions(-) -> -> diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> index 566accf7e6..5bf5465a21 100644 -> --- a/hw/i386/pc.c -> +++ b/hw/i386/pc.c -> @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, ->     hwaddr cxl_size = MiB; -> ->     cxl_base = pc_get_cxl_range_start(pcms); -> -    e820_add_entry(cxl_base, cxl_size, E820_RESERVED); ->     memory_region_init(mr, OBJECT(machine), "cxl_host_reg", cxl_size); ->     memory_region_add_subregion(system_memory, cxl_base, mr); ->     cxl_resv_end = cxl_base + cxl_size; -> @@ -1077,7 +1076,6 @@ void pc_memory_init(PCMachineState *pcms, ->         memory_region_init_io(&fw->mr, OBJECT(machine), &cfmws_ops, fw, ->                    "cxl-fixed-memory-region", fw->size); ->         memory_region_add_subregion(system_memory, fw->base, &fw->mr); -> -        e820_add_entry(fw->base, fw->size, E820_RESERVED); ->         cxl_fmw_base += fw->size; ->         cxl_resv_end = cxl_fmw_base; ->       } -> -- -> 2.37.3 -> - -+Gerd Hoffmann - -On Tue, Oct 18, 2022 at 8:16 PM Gregory Price <gourry.memverge@gmail.com> wrote: -> -> -This patch does not resolve the issue, reserved entries are still created. 
-> -> -[ 0.000000] BIOS-e820: [mem 0x0000000280000000-0x00000002800fffff] reserved -> -[ 0.000000] BIOS-e820: [mem 0x0000000290000000-0x000000029fffffff] reserved -> -> -# cat /proc/iomem -> -290000000-29fffffff : CXL Window 0 -> -290000000-29fffffff : Reserved -> -> -# cxl create-region -m -d decoder0.0 -w 1 -g 256 mem0 -> -cxl region: create_region: region0: set_size failed: Numerical result out of -> -range -> -cxl region: cmd_create_region: created 0 regions -> -> -On Tue, Oct 18, 2022 at 2:05 AM Ani Sinha <ani@anisinha.ca> wrote: -> -> -> -> On Tue, Oct 18, 2022 at 5:14 AM Gregory Price <gourry.memverge@gmail.com> -> -> wrote: -> -> > -> -> > Early-boot e820 records will be inserted by the bios/efi/early boot -> -> > software and be reported to the kernel via insert_resource. Later, when -> -> > CXL drivers iterate through the regions again, they will insert another -> -> > resource and make the RESERVED memory area a child. -> -> -> -> I have already sent a patch -> -> -https://www.mail-archive.com/qemu-devel@nongnu.org/msg882012.html -. -> -> When the patch is applied, there would not be any reserved entries -> -> even with passing E820_RESERVED . -> -> So this patch needs to be evaluated in the light of the above patch I -> -> sent. Once you apply my patch, does the issue still exist? -> -> -> -> > -> -> > This RESERVED memory area causes the memory region to become unusable, -> -> > and as a result attempting to create memory regions with -> -> > -> -> > `cxl create-region ...` -> -> > -> -> > Will fail due to the RESERVED area intersecting with the CXL window. 
-> -> > -> -> > -> -> > During boot the following traceback is observed: -> -> > -> -> > 0xffffffff81101650 in insert_resource_expand_to_fit () -> -> > 0xffffffff83d964c5 in e820__reserve_resources_late () -> -> > 0xffffffff83e03210 in pcibios_resource_survey () -> -> > 0xffffffff83e04f4a in pcibios_init () -> -> > -> -> > Which produces a call to reserve the CFMWS area: -> -> > -> -> > (gdb) p *new -> -> > $54 = {start = 0x290000000, end = 0x2cfffffff, name = "Reserved", -> -> > flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0, -> -> > child = 0x0} -> -> > -> -> > Later the Kernel parses ACPI tables and reserves the exact same area as -> -> > the CXL Fixed Memory Window. The use of `insert_resource_conflict` -> -> > retains the RESERVED region and makes it a child of the new region. -> -> > -> -> > 0xffffffff811016a4 in insert_resource_conflict () -> -> > insert_resource () -> -> > 0xffffffff81a81389 in cxl_parse_cfmws () -> -> > 0xffffffff818c4a81 in call_handler () -> -> > acpi_parse_entries_array () -> -> > -> -> > (gdb) p/x *new -> -> > $59 = {start = 0x290000000, end = 0x2cfffffff, name = "CXL Window 0", -> -> > flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0, -> -> > child = 0x0} -> -> > -> -> > This produces the following output in /proc/iomem: -> -> > -> -> > 590000000-68fffffff : CXL Window 0 -> -> > 590000000-68fffffff : Reserved -> -> > -> -> > This reserved area causes `get_free_mem_region()` to fail due to a check -> -> > against `__region_intersects()`. Due to this reserved area, the -> -> > intersect check will only ever return REGION_INTERSECTS, which causes -> -> > `cxl create-region` to always fail. 
-> -> > -> -> > Signed-off-by: Gregory Price <gregory.price@memverge.com> -> -> > --- -> -> > hw/i386/pc.c | 2 -- -> -> > 1 file changed, 2 deletions(-) -> -> > -> -> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> -> > index 566accf7e6..5bf5465a21 100644 -> -> > --- a/hw/i386/pc.c -> -> > +++ b/hw/i386/pc.c -> -> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, -> -> > hwaddr cxl_size = MiB; -> -> > -> -> > cxl_base = pc_get_cxl_range_start(pcms); -> -> > - e820_add_entry(cxl_base, cxl_size, E820_RESERVED); -> -> > memory_region_init(mr, OBJECT(machine), "cxl_host_reg", cxl_size); -> -> > memory_region_add_subregion(system_memory, cxl_base, mr); -> -> > cxl_resv_end = cxl_base + cxl_size; -> -> > @@ -1077,7 +1076,6 @@ void pc_memory_init(PCMachineState *pcms, -> -> > memory_region_init_io(&fw->mr, OBJECT(machine), -> -> > &cfmws_ops, fw, -> -> > "cxl-fixed-memory-region", -> -> > fw->size); -> -> > memory_region_add_subregion(system_memory, fw->base, -> -> > &fw->mr); -> -> > - e820_add_entry(fw->base, fw->size, E820_RESERVED); -> -> > cxl_fmw_base += fw->size; -> -> > cxl_resv_end = cxl_fmw_base; -> -> > } -> -> > -- -> -> > 2.37.3 -> -> > - -> ->> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> ->> > index 566accf7e6..5bf5465a21 100644 -> ->> > --- a/hw/i386/pc.c -> ->> > +++ b/hw/i386/pc.c -> ->> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, -> ->> > hwaddr cxl_size = MiB; -> ->> > -> ->> > cxl_base = pc_get_cxl_range_start(pcms); -> ->> > - e820_add_entry(cxl_base, cxl_size, E820_RESERVED); -Just dropping it doesn't look like a good plan to me. - -You can try set etc/reserved-memory-end fw_cfg file instead. Firmware -(both seabios and ovmf) read it and will make sure the 64bit pci mmio -window is placed above that address, i.e. this effectively reserves -address space. Right now used by memory hotplug code, but should work -for cxl too I think (disclaimer: don't know much about cxl ...). 
- -take care & HTH, - Gerd - -On Tue, 8 Nov 2022 12:21:11 +0100 -Gerd Hoffmann <kraxel@redhat.com> wrote: - -> -> >> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> -> >> > index 566accf7e6..5bf5465a21 100644 -> -> >> > --- a/hw/i386/pc.c -> -> >> > +++ b/hw/i386/pc.c -> -> >> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, -> -> >> > hwaddr cxl_size = MiB; -> -> >> > -> -> >> > cxl_base = pc_get_cxl_range_start(pcms); -> -> >> > - e820_add_entry(cxl_base, cxl_size, E820_RESERVED); -> -> -Just dropping it doesn't look like a good plan to me. -> -> -You can try set etc/reserved-memory-end fw_cfg file instead. Firmware -> -(both seabios and ovmf) read it and will make sure the 64bit pci mmio -> -window is placed above that address, i.e. this effectively reserves -> -address space. Right now used by memory hotplug code, but should work -> -for cxl too I think (disclaimer: don't know much about cxl ...). -As far as I know CXL impl. in QEMU isn't using etc/reserved-memory-end -at all, it' has its own mapping. - -Regardless of that, reserved E820 entries look wrong, and looking at -commit message OS is right to bailout on them (expected according -to ACPI spec). -Also spec says - -" -E820 Assumptions and Limitations - [...] - The platform boot firmware does not return a range description for the memory -mapping of - PCI devices, ISA Option ROMs, and ISA Plug and Play cards because the OS has -mechanisms - available to detect them. -" - -so dropping reserved entries looks reasonable from ACPI spec point of view. -(disclaimer: don't know much about cxl ... 
either) -> -> -take care & HTH, -> -Gerd -> - -On Fri, Nov 11, 2022 at 11:51:23AM +0100, Igor Mammedov wrote: -> -On Tue, 8 Nov 2022 12:21:11 +0100 -> -Gerd Hoffmann <kraxel@redhat.com> wrote: -> -> -> > >> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> -> > >> > index 566accf7e6..5bf5465a21 100644 -> -> > >> > --- a/hw/i386/pc.c -> -> > >> > +++ b/hw/i386/pc.c -> -> > >> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, -> -> > >> > hwaddr cxl_size = MiB; -> -> > >> > -> -> > >> > cxl_base = pc_get_cxl_range_start(pcms); -> -> > >> > - e820_add_entry(cxl_base, cxl_size, E820_RESERVED); -> -> -> -> Just dropping it doesn't look like a good plan to me. -> -> -> -> You can try set etc/reserved-memory-end fw_cfg file instead. Firmware -> -> (both seabios and ovmf) read it and will make sure the 64bit pci mmio -> -> window is placed above that address, i.e. this effectively reserves -> -> address space. Right now used by memory hotplug code, but should work -> -> for cxl too I think (disclaimer: don't know much about cxl ...). -> -> -As far as I know CXL impl. in QEMU isn't using etc/reserved-memory-end -> -at all, it' has its own mapping. -This should be changed. cxl should make sure the highest address used -is stored in etc/reserved-memory-end to avoid the firmware mapping pci -resources there. - -> -so dropping reserved entries looks reasonable from ACPI spec point of view. -Yep, I don't want dispute that. - -I suspect the reason for these entries to exist in the first place is to -inform the firmware that it should not place stuff there, and if we -remove that to conform with the spec we need some alternative way for -that ... 
- -take care, - Gerd - -On Fri, 11 Nov 2022 12:40:59 +0100 -Gerd Hoffmann <kraxel@redhat.com> wrote: - -> -On Fri, Nov 11, 2022 at 11:51:23AM +0100, Igor Mammedov wrote: -> -> On Tue, 8 Nov 2022 12:21:11 +0100 -> -> Gerd Hoffmann <kraxel@redhat.com> wrote: -> -> -> -> > > >> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c -> -> > > >> > index 566accf7e6..5bf5465a21 100644 -> -> > > >> > --- a/hw/i386/pc.c -> -> > > >> > +++ b/hw/i386/pc.c -> -> > > >> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms, -> -> > > >> > hwaddr cxl_size = MiB; -> -> > > >> > -> -> > > >> > cxl_base = pc_get_cxl_range_start(pcms); -> -> > > >> > - e820_add_entry(cxl_base, cxl_size, E820_RESERVED); -> -> > -> -> > Just dropping it doesn't look like a good plan to me. -> -> > -> -> > You can try set etc/reserved-memory-end fw_cfg file instead. Firmware -> -> > (both seabios and ovmf) read it and will make sure the 64bit pci mmio -> -> > window is placed above that address, i.e. this effectively reserves -> -> > address space. Right now used by memory hotplug code, but should work -> -> > for cxl too I think (disclaimer: don't know much about cxl ...). -> -> -> -> As far as I know CXL impl. in QEMU isn't using etc/reserved-memory-end -> -> at all, it' has its own mapping. -> -> -This should be changed. cxl should make sure the highest address used -> -is stored in etc/reserved-memory-end to avoid the firmware mapping pci -> -resources there. -if (pcmc->has_reserved_memory && machine->device_memory->base) { - -[...] - - if (pcms->cxl_devices_state.is_enabled) { - - res_mem_end = cxl_resv_end; - -that should be handled by this line - - } - - *val = cpu_to_le64(ROUND_UP(res_mem_end, 1 * GiB)); - - fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, sizeof(*val)); - - } - -so SeaBIOS shouldn't intrude into CXL address space -(I assume EDK2 behave similarly here) - -> -> so dropping reserved entries looks reasonable from ACPI spec point of view. 
-> -> -> -> -Yep, I don't want dispute that. -> -> -I suspect the reason for these entries to exist in the first place is to -> -inform the firmware that it should not place stuff there, and if we -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -just to educate me, can you point out what SeaBIOS code does with reservations. - -> -remove that to conform with the spec we need some alternative way for -> -that ... -with etc/reserved-memory-end set as above, -is E820_RESERVED really needed here? - -(my understanding was that E820_RESERVED weren't accounted for when -initializing PCI devices) - -> -> -take care, -> -Gerd -> - -> -if (pcmc->has_reserved_memory && machine->device_memory->base) { -> -> -[...] -> -> -if (pcms->cxl_devices_state.is_enabled) { -> -> -res_mem_end = cxl_resv_end; -> -> -that should be handled by this line -> -> -} -> -> -*val = cpu_to_le64(ROUND_UP(res_mem_end, 1 * GiB)); -> -> -fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, -> -sizeof(*val)); -> -} -> -> -so SeaBIOS shouldn't intrude into CXL address space -Yes, looks good, so with this in place already everyting should be fine. - -> -(I assume EDK2 behave similarly here) -Correct, ovmf reads that fw_cfg file too. - -> -> I suspect the reason for these entries to exist in the first place is to -> -> inform the firmware that it should not place stuff there, and if we -> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -> -just to educate me, can you point out what SeaBIOS code does with -> -reservations. -They are added to the e820 map which gets passed on to the OS. seabios -uses (and updateas) the e820 map too, when allocating memory for -example. While thinking about it I'm not fully sure it actually looks -at reservations, maybe it only uses (and updates) ram entries when -allocating memory. - -> -> remove that to conform with the spec we need some alternative way for -> -> that ... -> -> -with etc/reserved-memory-end set as above, -> -is E820_RESERVED really needed here? -No. 
Setting etc/reserved-memory-end is enough. - -So for the original patch: -Acked-by: Gerd Hoffmann <kraxel@redhat.com> - -take care, - Gerd - -On Fri, Nov 11, 2022 at 02:36:02PM +0100, Gerd Hoffmann wrote: -> -> if (pcmc->has_reserved_memory && machine->device_memory->base) { -> -> -> -> [...] -> -> -> -> if (pcms->cxl_devices_state.is_enabled) { -> -> -> -> res_mem_end = cxl_resv_end; -> -> -> -> that should be handled by this line -> -> -> -> } -> -> -> -> *val = cpu_to_le64(ROUND_UP(res_mem_end, 1 * GiB)); -> -> -> -> fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, -> -> sizeof(*val)); -> -> } -> -> -> -> so SeaBIOS shouldn't intrude into CXL address space -> -> -Yes, looks good, so with this in place already everyting should be fine. -> -> -> (I assume EDK2 behave similarly here) -> -> -Correct, ovmf reads that fw_cfg file too. -> -> -> > I suspect the reason for these entries to exist in the first place is to -> -> > inform the firmware that it should not place stuff there, and if we -> -> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -> -> just to educate me, can you point out what SeaBIOS code does with -> -> reservations. -> -> -They are added to the e820 map which gets passed on to the OS. seabios -> -uses (and updateas) the e820 map too, when allocating memory for -> -example. While thinking about it I'm not fully sure it actually looks -> -at reservations, maybe it only uses (and updates) ram entries when -> -allocating memory. -> -> -> > remove that to conform with the spec we need some alternative way for -> -> > that ... -> -> -> -> with etc/reserved-memory-end set as above, -> -> is E820_RESERVED really needed here? -> -> -No. Setting etc/reserved-memory-end is enough. -> -> -So for the original patch: -> -Acked-by: Gerd Hoffmann <kraxel@redhat.com> -> -> -take care, -> -Gerd -It's upstream already, sorry I can't add your tag. 
- --- -MST - diff --git a/results/classifier/02/instruction/24190340 b/results/classifier/02/instruction/24190340 deleted file mode 100644 index 61441c89e..000000000 --- a/results/classifier/02/instruction/24190340 +++ /dev/null @@ -1,2057 +0,0 @@ -instruction: 0.818 -other: 0.811 -boot: 0.803 -semantic: 0.793 -mistranslation: 0.758 - -[BUG, RFC] Block graph deadlock on job-dismiss - -Hi all, - -There's a bug in block layer which leads to block graph deadlock. -Notably, it takes place when blockdev IO is processed within a separate -iothread. - -This was initially caught by our tests, and I was able to reduce it to a -relatively simple reproducer. Such deadlocks are probably supposed to -be covered in iotests/graph-changes-while-io, but this deadlock isn't. - -Basically what the reproducer does is launches QEMU with a drive having -'iothread' option set, creates a chain of 2 snapshots, launches -block-commit job for a snapshot and then dismisses the job, starting -from the lower snapshot. If the guest is issuing IO at the same time, -there's a race in acquiring block graph lock and a potential deadlock. - -Here's how it can be reproduced: - -1. Run QEMU: -> -SRCDIR=/path/to/srcdir -> -> -> -> -> -$SRCDIR/build/qemu-system-x86_64 -enable-kvm \ -> -> --machine q35 -cpu Nehalem \ -> -> --name guest=alma8-vm,debug-threads=on \ -> -> --m 2g -smp 2 \ -> -> --nographic -nodefaults \ -> -> --qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \ -> -> --serial unix:/var/run/alma8-serial.sock,server=on,wait=off \ -> -> --object iothread,id=iothread0 \ -> -> --blockdev -> -node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2 -> -\ -> --device virtio-blk-pci,drive=disk,iothread=iothread0 -2. Launch IO (random reads) from within the guest: -> -nc -U /var/run/alma8-serial.sock -> -... 
-> -[root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 --bs=4k -> ---size=1G --numjobs=1 --time_based=1 --runtime=300 --group_reporting -> ---rw=randread --iodepth=1 --filename=/testfile -3. Run snapshots creation & removal of lower snapshot operation in a -loop (script attached): -> -while /bin/true ; do ./remove_lower_snap.sh ; done -And then it occasionally hangs. - -Note: I've tried bisecting this, and looks like deadlock occurs starting -from the following commit: - -(BAD) 5bdbaebcce virtio: Re-enable notifications after drain -(GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll - -On the latest v10.0.0 it does hang as well. - - -Here's backtrace of the main thread: - -> -#0 0x00007fc547d427ce in __ppoll (fds=0x557eb79657b0, nfds=1, -> -timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:43 -> -#1 0x0000557eb47d955c in qemu_poll_ns (fds=0x557eb79657b0, nfds=1, -> -timeout=-1) at ../util/qemu-timer.c:329 -> -#2 0x0000557eb47b2204 in fdmon_poll_wait (ctx=0x557eb76c5f20, -> -ready_list=0x7ffd94b4edd8, timeout=-1) at ../util/fdmon-poll.c:79 -> -#3 0x0000557eb47b1c45 in aio_poll (ctx=0x557eb76c5f20, blocking=true) at -> -../util/aio-posix.c:730 -> -#4 0x0000557eb4621edd in bdrv_do_drained_begin (bs=0x557eb795e950, -> -parent=0x0, poll=true) at ../block/io.c:378 -> -#5 0x0000557eb4621f7b in bdrv_drained_begin (bs=0x557eb795e950) at -> -../block/io.c:391 -> -#6 0x0000557eb45ec125 in bdrv_change_aio_context (bs=0x557eb795e950, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7682 -> -#7 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7964250, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7608 -> -#8 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb79575e0, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7668 
-> -#9 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7e59110, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7608 -> -#10 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb7e51960, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7668 -> -#11 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb814ed80, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7608 -> -#12 0x0000557eb45ee8e4 in child_job_change_aio_ctx (c=0x557eb7c9d3f0, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../blockjob.c:157 -> -#13 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb7c9d3f0, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7592 -> -#14 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb7d74310, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7661 -> -#15 0x0000557eb45dcd7e in bdrv_child_cb_change_aio_ctx -> -(child=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = -> -{...}, tran=0x557eb7a87160, errp=0x0) at ../block.c:1234 -> -#16 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb8565af0, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7592 -> -#17 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb79575e0, -> -ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, -> -errp=0x0) -> -at ../block.c:7661 -> -#18 0x0000557eb45ec1f3 in bdrv_try_change_aio_context (bs=0x557eb79575e0, -> -ctx=0x557eb76c5f20, ignore_child=0x0, errp=0x0) at ../block.c:7715 -> -#19 0x0000557eb45e1b15 in bdrv_root_unref_child (child=0x557eb7966f30) at -> -../block.c:3317 -> -#20 0x0000557eb45eeaa8 in 
block_job_remove_all_bdrv (job=0x557eb7952800) at -> -../blockjob.c:209 -> -#21 0x0000557eb45ee641 in block_job_free (job=0x557eb7952800) at -> -../blockjob.c:82 -> -#22 0x0000557eb45f17af in job_unref_locked (job=0x557eb7952800) at -> -../job.c:474 -> -#23 0x0000557eb45f257d in job_do_dismiss_locked (job=0x557eb7952800) at -> -../job.c:771 -> -#24 0x0000557eb45f25fe in job_dismiss_locked (jobptr=0x7ffd94b4f400, -> -errp=0x7ffd94b4f488) at ../job.c:783 -> ---Type <RET> for more, q to quit, c to continue without paging-- -> -#25 0x0000557eb45d8e84 in qmp_job_dismiss (id=0x557eb7aa42b0 "commit-snap1", -> -errp=0x7ffd94b4f488) at ../job-qmp.c:138 -> -#26 0x0000557eb472f6a3 in qmp_marshal_job_dismiss (args=0x7fc52c00a3b0, -> -ret=0x7fc53c880da8, errp=0x7fc53c880da0) at qapi/qapi-commands-job.c:221 -> -#27 0x0000557eb47a35f3 in do_qmp_dispatch_bh (opaque=0x7fc53c880e40) at -> -../qapi/qmp-dispatch.c:128 -> -#28 0x0000557eb47d1cd2 in aio_bh_call (bh=0x557eb79568f0) at -> -../util/async.c:172 -> -#29 0x0000557eb47d1df5 in aio_bh_poll (ctx=0x557eb76c0200) at -> -../util/async.c:219 -> -#30 0x0000557eb47b12f3 in aio_dispatch (ctx=0x557eb76c0200) at -> -../util/aio-posix.c:436 -> -#31 0x0000557eb47d2266 in aio_ctx_dispatch (source=0x557eb76c0200, -> -callback=0x0, user_data=0x0) at ../util/async.c:361 -> -#32 0x00007fc549232f4f in g_main_dispatch (context=0x557eb76c6430) at -> -../glib/gmain.c:3364 -> -#33 g_main_context_dispatch (context=0x557eb76c6430) at ../glib/gmain.c:4079 -> -#34 0x0000557eb47d3ab1 in glib_pollfds_poll () at ../util/main-loop.c:287 -> -#35 0x0000557eb47d3b38 in os_host_main_loop_wait (timeout=0) at -> -../util/main-loop.c:310 -> -#36 0x0000557eb47d3c58 in main_loop_wait (nonblocking=0) at -> -../util/main-loop.c:589 -> -#37 0x0000557eb4218b01 in qemu_main_loop () at ../system/runstate.c:835 -> -#38 0x0000557eb46df166 in qemu_default_main (opaque=0x0) at -> -../system/main.c:50 -> -#39 0x0000557eb46df215 in main (argc=24, argv=0x7ffd94b4f8d8) at -> 
-../system/main.c:80 -And here's coroutine trying to acquire read lock: - -> -(gdb) qemu coroutine reader_queue->entries.sqh_first -> -#0 0x0000557eb47d7068 in qemu_coroutine_switch (from_=0x557eb7aa48b0, -> -to_=0x7fc537fff508, action=COROUTINE_YIELD) at -> -../util/coroutine-ucontext.c:321 -> -#1 0x0000557eb47d4d4a in qemu_coroutine_yield () at -> -../util/qemu-coroutine.c:339 -> -#2 0x0000557eb47d56c8 in qemu_co_queue_wait_impl (queue=0x557eb59954c0 -> -<reader_queue>, lock=0x7fc53c57de50, flags=0) at -> -../util/qemu-coroutine-lock.c:60 -> -#3 0x0000557eb461fea7 in bdrv_graph_co_rdlock () at ../block/graph-lock.c:231 -> -#4 0x0000557eb460c81a in graph_lockable_auto_lock (x=0x7fc53c57dee3) at -> -/home/root/src/qemu/master/include/block/graph-lock.h:213 -> -#5 0x0000557eb460fa41 in blk_co_do_preadv_part -> -(blk=0x557eb84c0810, offset=6890553344, bytes=4096, qiov=0x7fc530006988, -> -qiov_offset=0, flags=BDRV_REQ_REGISTERED_BUF) at ../block/block-backend.c:1339 -> -#6 0x0000557eb46104d7 in blk_aio_read_entry (opaque=0x7fc530003240) at -> -../block/block-backend.c:1619 -> -#7 0x0000557eb47d6c40 in coroutine_trampoline (i0=-1213577040, i1=21886) at -> -../util/coroutine-ucontext.c:175 -> -#8 0x00007fc547c2a360 in __start_context () at -> -../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 -> -#9 0x00007ffd94b4ea40 in () -> -#10 0x0000000000000000 in () -So it looks like main thread is processing job-dismiss request and is -holding write lock taken in block_job_remove_all_bdrv() (frame #20 -above). At the same time iothread spawns a coroutine which performs IO -request. Before the coroutine is spawned, blk_aio_prwv() increases -'in_flight' counter for Blk. Then blk_co_do_preadv_part() (frame #5) is -trying to acquire the read lock. But main thread isn't releasing the -lock as blk_root_drained_poll() returns true since blk->in_flight > 0. -Here's the deadlock. - -Any comments and suggestions on the subject are welcomed. Thanks! 
- -Andrey -remove_lower_snap.sh -Description: -application/shellscript - -On 4/24/25 8:32 PM, Andrey Drobyshev wrote: -> -Hi all, -> -> -There's a bug in block layer which leads to block graph deadlock. -> -Notably, it takes place when blockdev IO is processed within a separate -> -iothread. -> -> -This was initially caught by our tests, and I was able to reduce it to a -> -relatively simple reproducer. Such deadlocks are probably supposed to -> -be covered in iotests/graph-changes-while-io, but this deadlock isn't. -> -> -Basically what the reproducer does is launches QEMU with a drive having -> -'iothread' option set, creates a chain of 2 snapshots, launches -> -block-commit job for a snapshot and then dismisses the job, starting -> -from the lower snapshot. If the guest is issuing IO at the same time, -> -there's a race in acquiring block graph lock and a potential deadlock. -> -> -Here's how it can be reproduced: -> -> -[...] -> -I took a closer look at iotests/graph-changes-while-io, and have managed -to reproduce the same deadlock in a much simpler setup, without a guest. - -1. Run QSD:> ./build/storage-daemon/qemu-storage-daemon --object -iothread,id=iothread0 \ -> ---blockdev null-co,node-name=node0,read-zeroes=true \ -> -> ---nbd-server addr.type=unix,addr.path=/var/run/qsd_nbd.sock \ -> -> ---export -> -nbd,id=exp0,node-name=node0,iothread=iothread0,fixed-iothread=true,writable=true -> -\ -> ---chardev -> -socket,id=qmp-sock,path=/var/run/qsd_qmp.sock,server=on,wait=off \ -> ---monitor chardev=qmp-sock -2. Launch IO: -> -qemu-img bench -f raw -c 2000000 -> -'nbd+unix:///node0?socket=/var/run/qsd_nbd.sock' -3. Add 2 snapshots and remove lower one (script attached):> while -/bin/true ; do ./rls_qsd.sh ; done - -And then it hangs. - -I'll also send a patch with corresponding test case added directly to -iotests. 
This reproducer seems to hang starting from Fiona's commit 67446e605dc
("blockjob: drop AioContext lock before calling bdrv_graph_wrlock()").
AioContext locks were dropped entirely later on in Stefan's commit
b49f4755c7 ("block: remove AioContext locking"), but the problem remains.

Andrey

[attachment: rls_qsd.sh, application/shellscript]

From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

This case catches a potential deadlock which takes place when job-dismiss
is issued while I/O requests are processed in a separate iothread.

See https://mail.gnu.org/archive/html/qemu-devel/2025-04/msg04421.html

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
 .../qemu-iotests/tests/graph-changes-while-io | 101 ++++++++++++++++--
 .../tests/graph-changes-while-io.out          |   4 +-
 2 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/tests/graph-changes-while-io b/tests/qemu-iotests/tests/graph-changes-while-io
index 194fda500e..e30f823da4 100755
--- a/tests/qemu-iotests/tests/graph-changes-while-io
+++ b/tests/qemu-iotests/tests/graph-changes-while-io
@@ -27,6 +27,8 @@ from iotests import imgfmt, qemu_img, qemu_img_create, qemu_io, \
 
 
 top = os.path.join(iotests.test_dir, 'top.img')
+snap1 = os.path.join(iotests.test_dir, 'snap1.img')
+snap2 = os.path.join(iotests.test_dir, 'snap2.img')
 nbd_sock = os.path.join(iotests.sock_dir, 'nbd.sock')
 
 
@@ -58,6 +60,15 @@ class TestGraphChangesWhileIO(QMPTestCase):
     def tearDown(self) -> None:
         self.qsd.stop()
 
+    def _wait_for_blockjob(self, status) -> None:
+        done = False
+        while not done:
+            for event in self.qsd.get_qmp().get_events(wait=10.0):
+                if event['event'] != 'JOB_STATUS_CHANGE':
+                    continue
+                if event['data']['status'] == status:
+                    done = True
+
     def test_blockdev_add_while_io(self) -> None:
         # Run qemu-img bench in the background
         bench_thr = Thread(target=do_qemu_img_bench)
@@ -116,13 +127,89 @@ class
TestGraphChangesWhileIO(QMPTestCase):
             'device': 'job0',
         })
 
-        cancelled = False
-        while not cancelled:
-            for event in self.qsd.get_qmp().get_events(wait=10.0):
-                if event['event'] != 'JOB_STATUS_CHANGE':
-                    continue
-                if event['data']['status'] == 'null':
-                    cancelled = True
+        self._wait_for_blockjob('null')
+
+        bench_thr.join()
+
+    def test_remove_lower_snapshot_while_io(self) -> None:
+        # Run qemu-img bench in the background
+        bench_thr = Thread(target=do_qemu_img_bench, args=(100000, ))
+        bench_thr.start()
+
+        # While I/O is performed on 'node0' node, consequently add 2 snapshots
+        # on top of it, then remove (commit) them starting from lower one.
+        while bench_thr.is_alive():
+            # Recreate snapshot images on every iteration
+            qemu_img_create('-f', imgfmt, snap1, '1G')
+            qemu_img_create('-f', imgfmt, snap2, '1G')
+
+            self.qsd.cmd('blockdev-add', {
+                'driver': imgfmt,
+                'node-name': 'snap1',
+                'file': {
+                    'driver': 'file',
+                    'filename': snap1
+                }
+            })
+
+            self.qsd.cmd('blockdev-snapshot', {
+                'node': 'node0',
+                'overlay': 'snap1',
+            })
+
+            self.qsd.cmd('blockdev-add', {
+                'driver': imgfmt,
+                'node-name': 'snap2',
+                'file': {
+                    'driver': 'file',
+                    'filename': snap2
+                }
+            })
+
+            self.qsd.cmd('blockdev-snapshot', {
+                'node': 'snap1',
+                'overlay': 'snap2',
+            })
+
+            self.qsd.cmd('block-commit', {
+                'job-id': 'commit-snap1',
+                'device': 'snap2',
+                'top-node': 'snap1',
+                'base-node': 'node0',
+                'auto-finalize': True,
+                'auto-dismiss': False,
+            })
+
+            self._wait_for_blockjob('concluded')
+            self.qsd.cmd('job-dismiss', {
+                'id': 'commit-snap1',
+            })
+
+            self.qsd.cmd('block-commit', {
+                'job-id': 'commit-snap2',
+                'device': 'snap2',
+                'top-node': 'snap2',
+                'base-node': 'node0',
+                'auto-finalize': True,
+                'auto-dismiss': False,
+            })
+
+            self._wait_for_blockjob('ready')
+            self.qsd.cmd('job-complete', {
+                'id': 'commit-snap2',
+            })
+
+            self._wait_for_blockjob('concluded')
+            self.qsd.cmd('job-dismiss', {
+                'id': 'commit-snap2',
+            })
+
+            self.qsd.cmd('blockdev-del', {
+                'node-name': 'snap1'
+            })
+            self.qsd.cmd('blockdev-del', {
+                'node-name': 'snap2'
+            })
 
         bench_thr.join()
 
diff --git a/tests/qemu-iotests/tests/graph-changes-while-io.out b/tests/qemu-iotests/tests/graph-changes-while-io.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/tests/graph-changes-while-io.out
+++ b/tests/qemu-iotests/tests/graph-changes-while-io.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.43.5

Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
> So it looks like main thread is processing job-dismiss request and is
> holding write lock taken in block_job_remove_all_bdrv() (frame #20
> above). [...]
> Here's the deadlock.

And for the IO test you provided, it's client->nb_requests that behaves
similarly to blk->in_flight here.

The issue also reproduces easily when issuing the following QMP command
in a loop while doing IO on a device:

> void qmp_block_locked_drain(const char *node_name, Error **errp)
> {
>     BlockDriverState *bs;
>
>     bs = bdrv_find_node(node_name);
>     if (!bs) {
>         error_setg(errp, "node not found");
>         return;
>     }
>
>     bdrv_graph_wrlock();
>     bdrv_drained_begin(bs);
>     bdrv_drained_end(bs);
>     bdrv_graph_wrunlock();
> }

It seems like either it would be necessary to require:
1. not draining inside an exclusively locked section
or
2.
making sure that variables used by drained_poll routines are only set
while holding the reader lock?

Those seem to require rather involved changes, so a third option might
be to make draining inside an exclusively locked section possible, by
embedding such locked sections in a drained section:

> diff --git a/blockjob.c b/blockjob.c
> index 32007f31a9..9b2f3b3ea9 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
>       * one to make sure that such a concurrent access does not attempt
>       * to process an already freed BdrvChild.
>       */
> +    bdrv_drain_all_begin();
>      bdrv_graph_wrlock();
>      while (job->nodes) {
>          GSList *l = job->nodes;
> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
>          g_slist_free_1(l);
>      }
>      bdrv_graph_wrunlock();
> +    bdrv_drain_all_end();
>  }
>  
>  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)

This seems to fix the issue at hand. I can send a patch if this is
considered an acceptable approach.

Best Regards,
Fiona

On 4/30/25 11:47 AM, Fiona Ebner wrote:
> [...]
> This seems to fix the issue at hand. I can send a patch if this is
> considered an acceptable approach.

Hello Fiona,

Thanks for looking into it. I've tried your third option above and can
confirm it does fix the deadlock; at least I can't reproduce it any
more. Other iotests don't seem to break either.
So I personally am fine with that patch. Would be nice to hear a word
from the maintainers though on whether there are any caveats with such
an approach.

Andrey

On Wed, Apr 30, 2025 at 10:11 AM Andrey Drobyshev
<andrey.drobyshev@virtuozzo.com> wrote:
> On 4/30/25 11:47 AM, Fiona Ebner wrote:
> > [...]
> > This seems to fix the issue at hand. I can send a patch if this is
> > considered an acceptable approach.

Kevin is aware of this thread but it's a public holiday tomorrow so it
may be a little longer.

Stefan

Am 24.04.2025 um 19:32 hat Andrey Drobyshev geschrieben:
> Hi all,
>
> There's a bug in block layer which leads to block graph deadlock.
> Notably, it takes place when blockdev IO is processed within a separate
> iothread.
> [...]
> Basically what the reproducer does is launches QEMU with a drive having
> 'iothread' option set, creates a chain of 2 snapshots, launches
> block-commit job for a snapshot and then dismisses the job, starting
> from the lower snapshot. If the guest is issuing IO at the same time,
> there's a race in acquiring block graph lock and a potential deadlock.
> Here's how it can be reproduced:
> [...]
>
> So it looks like main thread is processing job-dismiss request and is
> holding write lock taken in block_job_remove_all_bdrv() (frame #20
> above). At the same time iothread spawns a coroutine which performs IO
> request. Before the coroutine is spawned, blk_aio_prwv() increases
> 'in_flight' counter for Blk. Then blk_co_do_preadv_part() (frame #5) is
> trying to acquire the read lock. But main thread isn't releasing the
> lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
> Here's the deadlock.
>
> Any comments and suggestions on the subject are welcomed. Thanks!

I think this is what the blk_wait_while_drained() call was supposed to
address in blk_co_do_preadv_part(). However, with the use of multiple
I/O threads, this is racy.

Do you think that in your case we hit the small race window between the
checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
another reason why blk_wait_while_drained() didn't do its job?
Kevin

On 5/2/25 19:34, Kevin Wolf wrote:
> Am 24.04.2025 um 19:32 hat Andrey Drobyshev geschrieben:
> > [...]
> > So it looks like main thread is processing job-dismiss request and is
> > holding write lock taken in block_job_remove_all_bdrv() (frame #20
> > above). [...] Here's the deadlock.
> >
> > Any comments and suggestions on the subject are welcomed. Thanks!
>
> I think this is what the blk_wait_while_drained() call was supposed to
> address in blk_co_do_preadv_part(). However, with the use of multiple
> I/O threads, this is racy.
>
> Do you think that in your case we hit the small race window between the
> checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
> another reason why blk_wait_while_drained() didn't do its job?
>
> Kevin

In my opinion there is a very big race window. The main thread has taken
the graph write lock. After that, another coroutine stalled within
GRAPH_RDLOCK_GUARD(), as there was no drain at that moment, and only
after that did the main thread start the drain. That is why Fiona's idea
looks workable.
Though this would mean that normally we should always do that at the
-moment when we acquire the write lock. Maybe even inside this function.
-
-Den
-
-On 02.05.2025 19:52, Denis V. Lunev wrote:
-> On 5/2/25 19:34, Kevin Wolf wrote:
-> > On 24.04.2025 19:32, Andrey Drobyshev wrote:
-> > > Hi all,
-> > >
-> > > There's a bug in the block layer which leads to a block graph deadlock.
-> > > Notably, it takes place when blockdev IO is processed within a separate
-> > > iothread.
-> > >
-> > > This was initially caught by our tests, and I was able to reduce it to a
-> > > relatively simple reproducer. Such deadlocks are probably supposed to
-> > > be covered in iotests/graph-changes-while-io, but this deadlock isn't.
-> > >
-> > > Basically what the reproducer does is launch QEMU with a drive having
-> > > the 'iothread' option set, create a chain of 2 snapshots, launch a
-> > > block-commit job for a snapshot and then dismiss the job, starting
-> > > from the lower snapshot. If the guest is issuing IO at the same time,
-> > > there's a race in acquiring the block graph lock and a potential deadlock.
-> > >
-> > > Here's how it can be reproduced:
-> > >
-> > > 1. Run QEMU:
-> > > > SRCDIR=/path/to/srcdir
-> > > > $SRCDIR/build/qemu-system-x86_64 -enable-kvm \
-> > > >     -machine q35 -cpu Nehalem \
-> > > >     -name guest=alma8-vm,debug-threads=on \
-> > > >     -m 2g -smp 2 \
-> > > >     -nographic -nodefaults \
-> > > >     -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
-> > > >     -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
-> > > >     -object iothread,id=iothread0 \
-> > > >     -blockdev node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2 \
-> > > >     -device virtio-blk-pci,drive=disk,iothread=iothread0
-> > > 2. Launch IO (random reads) from within the guest:
-> > > > nc -U /var/run/alma8-serial.sock
-> > > > ...
-> > > > [root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 \
-> > > >     --bs=4k --size=1G --numjobs=1 --time_based=1 --runtime=300 \
-> > > >     --group_reporting --rw=randread --iodepth=1 --filename=/testfile
-> > > 3. Run snapshot creation & removal of the lower snapshot in a loop
-> > >    (script attached):
-> > > > while /bin/true ; do ./remove_lower_snap.sh ; done
-> > > And then it occasionally hangs.
-> > >
-> > > Note: I've tried bisecting this, and it looks like the deadlock occurs
-> > > starting from the following commit:
-> > >
-> > > (BAD)  5bdbaebcce virtio: Re-enable notifications after drain
-> > > (GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll
-> > >
-> > > On the latest v10.0.0 it does hang as well.
-> > >
-> > > Here's the backtrace of the main thread:
-> > >
-> > > > [...snip: main-thread backtrace, quoted in full above...]
-> > >
-> > > And here's the coroutine trying to acquire the read lock:
-> > >
-> > > > [...snip: coroutine backtrace, quoted in full above...]
-> > >
-> > > [...snip: deadlock analysis, quoted in full above...]
-> >
-> > I think this is what the blk_wait_while_drained() call was supposed to
-> > address in blk_co_do_preadv_part(). However, with the use of multiple
-> > I/O threads, this is racy.
-> >
-> > Do you think that in your case we hit the small race window between the
-> > checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
-> > another reason why blk_wait_while_drained() didn't do its job?
->
-> In my opinion there is a very big race window. The main thread has
-> taken the graph write lock. After that another coroutine is stalled
-> within GRAPH_RDLOCK_GUARD() as there is no drain at the moment, and only
-> after that the main thread has started the drain.
-
-You're right, I confused taking the write lock with draining there.
-
-> That is why Fiona's idea looks workable. Though this would mean
-> that normally we should always do that at the moment when we acquire
-> the write lock. Maybe even inside this function.
-I actually see now that not all of my graph locking patches were merged.
-At least I did have the thought that bdrv_drained_begin() must be marked
-GRAPH_UNLOCKED because it polls. That means that calling it from inside
-bdrv_try_change_aio_context() is actually forbidden (and that's the part
-I didn't see back then because it doesn't have TSA annotations).
-
-If you refactor the code to move the drain out to before the lock is
-taken, I think you end up with Fiona's patch, except you'll remove the
-forbidden inner drain and add more annotations for some functions and
-clarify the rules around them. I don't know, but I wouldn't be surprised
-if along the process we find other bugs, too.
-
-So Fiona's drain looks right to me, but we should probably approach it
-more systematically.
-
-Kevin
-
diff --git a/results/classifier/02/instruction/26095107 b/results/classifier/02/instruction/26095107
deleted file mode 100644
index 91776bab6..000000000
--- a/results/classifier/02/instruction/26095107
+++ /dev/null
@@ -1,159 +0,0 @@
-instruction: 0.991
-boot: 0.987
-other: 0.979
-semantic: 0.974
-mistranslation: 0.930
-
-[Qemu-devel] [Bug Report] vm paused after succeeding to migrate
-
-Hi, all
-I encountered a bug when trying to migrate a Windows VM.
-
-Environment information:
-host A: cpu E5620 (model WestmereEP without flag xsave)
-host B: cpu E5-2643 (model SandyBridgeEP with xsave)
-
-The reproduction steps are:
-1. Start a Windows 2008 VM with -cpu host (which means host-passthrough).
-2. Migrate the VM to host B when cr4.OSXSAVE=0 (successfully).
-3. The VM runs on host B for a while so that cr4.OSXSAVE changes to 1.
-4. Then migrate the VM to host A (successfully), but the VM was paused, and
-QEMU printed the following log:
-
-KVM: entry failed, hardware error 0x80000021
-
-If you're running a guest on an Intel machine without unrestricted mode
-support, the failure can be most likely due to the guest entering an invalid
-state for Intel VT.
For example, the guest maybe running in big real mode
-which is not supported on less recent Intel processors.
-
-EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX=00000000
-ESI=01a62000 EDI=00000000 EBP=00000000 ESP=01718b20
-EIP=0185d982 EFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
-ES =0000 00000000 0000ffff 00009300
-CS =f000 ffff0000 0000ffff 00009b00
-SS =0000 00000000 0000ffff 00009300
-DS =0000 00000000 0000ffff 00009300
-FS =0000 00000000 0000ffff 00009300
-GS =0000 00000000 0000ffff 00009300
-LDT=0000 00000000 0000ffff 00008200
-TR =0000 00000000 0000ffff 00008b00
-GDT=     00000000 0000ffff
-IDT=     00000000 0000ffff
-CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
-DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
-DR3=0000000000000000
-DR6=00000000ffff0ff0 DR7=0000000000000400
-EFER=0000000000000000
-Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00
-00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-
-I have found that the problem happens when kvm_put_sregs returns err -22
-(called by kvm_arch_put_registers in QEMU), because
-kvm_arch_vcpu_ioctl_set_sregs (in the kvm module) checks that
-guest_cpuid_has no X86_FEATURE_XSAVE while cr4.OSXSAVE=1.
-So should we cancel migration when kvm_arch_put_registers returns an error?
-
-* linzhecheng (address@hidden) wrote:
-> Hi, all
-> I encountered a bug when trying to migrate a Windows VM.
->
-> Environment information:
-> host A: cpu E5620 (model WestmereEP without flag xsave)
-> host B: cpu E5-2643 (model SandyBridgeEP with xsave)
->
-> The reproduction steps are:
-> 1. Start a Windows 2008 VM with -cpu host (which means host-passthrough).
-> 2. Migrate the VM to host B when cr4.OSXSAVE=0 (successfully).
-> 3. The VM runs on host B for a while so that cr4.OSXSAVE changes to 1.
-> 4. Then migrate the VM to host A (successfully), but the VM was paused, and
->    QEMU printed the following log:
-
-Remember that migrating using -cpu host across different CPU models is NOT
-expected to work.
-
-> [...snip: KVM entry failure log and register dump quoted above...]
->
-> I have found that the problem happens when kvm_put_sregs returns err -22
-> (called by kvm_arch_put_registers in QEMU), because
-> kvm_arch_vcpu_ioctl_set_sregs (in the kvm module) checks that
-> guest_cpuid_has no X86_FEATURE_XSAVE while cr4.OSXSAVE=1.
-> So should we cancel migration when kvm_arch_put_registers returns an error?
-
-It would seem good if we can make the migration fail there rather than
-hitting that KVM error.
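A minimal sketch of the plumbing being suggested, as an illustrative Python model (the real change would be in QEMU's C code; the function names here only mirror the ones discussed above): surface the set-registers failure to the migration path instead of swallowing it, so the incoming migration can be cancelled cleanly rather than leaving a paused VM.

```python
# Illustrative model (not QEMU code): propagate the set-registers failure
# up to the migration code instead of ignoring it.
EINVAL = 22

def kvm_put_sregs(guest_has_xsave, cr4_osxsave):
    # Mirrors the kernel-side consistency check described above:
    # CR4.OSXSAVE must not be set if the guest CPUID lacks XSAVE.
    if cr4_osxsave and not guest_has_xsave:
        return -EINVAL          # the err -22 from the report
    return 0

def incoming_migration(guest_has_xsave, cr4_osxsave):
    ret = kvm_put_sregs(guest_has_xsave, cr4_osxsave)
    if ret < 0:
        # Fail the migration instead of entering the vCPU with invalid state.
        return f"migration cancelled: kvm_put_sregs failed ({ret})"
    return "migration completed"

# Host A lacks xsave, but the incoming state has CR4.OSXSAVE=1:
print(incoming_migration(guest_has_xsave=False, cr4_osxsave=True))
```

The point is only the error propagation: once the put-registers path returns a status instead of void, the caller has somewhere to fail the migration before the vCPU is ever run.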
-It looks like we need to do a bit of plumbing to convert the places that -call it to return a bool rather than void. - -Dave - --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - diff --git a/results/classifier/02/instruction/33802194 b/results/classifier/02/instruction/33802194 deleted file mode 100644 index 3783d67ad..000000000 --- a/results/classifier/02/instruction/33802194 +++ /dev/null @@ -1,4940 +0,0 @@ -instruction: 0.693 -mistranslation: 0.687 -semantic: 0.656 -other: 0.637 -boot: 0.631 - -[BUG] cxl can not create region - -Hi list - -I want to test cxl functions in arm64, and found some problems I can't -figure out. - -My test environment: - -1. build latest bios from -https://github.com/tianocore/edk2.git -master -branch(cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2) -2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git -master branch(846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm -support patch: -https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/ -3. build Linux kernel from -https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git -preview -branch(65fc1c3d26b96002a5aa1f4012fae4dc98fd5683) -4. 
build latest ndctl tools from -https://github.com/pmem/ndctl -create_region branch(8558b394e449779e3a4f3ae90fae77ede0bca159) - -And my qemu test commands: -sudo $QEMU_BIN -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 \ - -cpu max -smp 8 -nographic -no-reboot \ - -kernel $KERNEL -bios $BIOS_BIN \ - -drive if=none,file=$ROOTFS,format=qcow2,id=hd \ - -device virtio-blk-pci,drive=hd -append 'root=/dev/vda1 -nokaslr dyndbg="module cxl* +p"' \ - -object memory-backend-ram,size=4G,id=mem0 \ - -numa node,nodeid=0,cpus=0-7,memdev=mem0 \ - -net nic -net user,hostfwd=tcp::2222-:22 -enable-kvm \ - -object -memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M -\ - -object -memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M -\ - -object -memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M -\ - -object -memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M -\ - -object -memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M -\ - -object -memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M -\ - -object -memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M -\ - -object -memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M -\ - -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ - -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ - -device cxl-upstream,bus=root_port0,id=us0 \ - -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ - -device -cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \ - -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ - -device -cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \ - -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ - -device -cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \ - -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ - 
-device -cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \ - -M -cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k - -And I have got two problems. -1. When I want to create x1 region with command: "cxl create-region -d -decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer -reference. Crash log: - -[ 534.697324] cxl_region region0: config state: 0 -[ 534.697346] cxl_region region0: probe: -6 -[ 534.697368] cxl_acpi ACPI0017:00: decoder0.0: created region0 -[ 534.699115] cxl region0: mem0:endpoint3 decoder3.0 add: -mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 -[ 534.699149] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 -[ 534.699167] cxl region0: ACPI0016:00:port1 decoder1.0 add: -mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 -[ 534.699176] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 -[ 534.699182] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 -for mem0:decoder3.0 @ 0 -[ 534.699189] cxl region0: 0000:0d:00.0:port2 iw: 1 ig: 256 -[ 534.699193] cxl region0: 0000:0d:00.0:port2 target[0] = -0000:0e:00.0 for mem0:decoder3.0 @ 0 -[ 534.699405] Unable to handle kernel NULL pointer dereference at -virtual address 0000000000000000 -[ 534.701474] Mem abort info: -[ 534.701994] ESR = 0x0000000086000004 -[ 534.702653] EC = 0x21: IABT (current EL), IL = 32 bits -[ 534.703616] SET = 0, FnV = 0 -[ 534.704174] EA = 0, S1PTW = 0 -[ 534.704803] FSC = 0x04: level 0 translation fault -[ 534.705694] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010144a000 -[ 534.706875] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 -[ 534.709855] Internal error: Oops: 86000004 [#1] PREEMPT SMP -[ 534.710301] Modules linked in: -[ 534.710546] CPU: 7 PID: 331 Comm: cxl Not tainted -5.19.0-rc3-00064-g65fc1c3d26b9-dirty #11 -[ 534.715393] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 -[ 534.717179] pstate: 60400005 (nZCv 
daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) -[ 534.719190] pc : 0x0 -[ 534.719928] lr : commit_store+0x118/0x2cc -[ 534.721007] sp : ffff80000aec3c30 -[ 534.721793] x29: ffff80000aec3c30 x28: ffff0000da62e740 x27: ffff0000c0c06b30 -[ 534.723875] x26: 0000000000000000 x25: ffff0000c0a2a400 x24: ffff0000c0a29400 -[ 534.725440] x23: 0000000000000003 x22: 0000000000000000 x21: ffff0000c0c06800 -[ 534.727312] x20: 0000000000000000 x19: ffff0000c1559800 x18: 0000000000000000 -[ 534.729138] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffd41fe838 -[ 534.731046] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 -[ 534.732402] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 -[ 534.734432] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff0000c0906e80 -[ 534.735921] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff80000aec3bf0 -[ 534.737437] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c155a000 -[ 534.738878] Call trace: -[ 534.739368] 0x0 -[ 534.739713] dev_attr_store+0x1c/0x30 -[ 534.740186] sysfs_kf_write+0x48/0x58 -[ 534.740961] kernfs_fop_write_iter+0x128/0x184 -[ 534.741872] new_sync_write+0xdc/0x158 -[ 534.742706] vfs_write+0x1ac/0x2a8 -[ 534.743440] ksys_write+0x68/0xf0 -[ 534.744328] __arm64_sys_write+0x1c/0x28 -[ 534.745180] invoke_syscall+0x44/0xf0 -[ 534.745989] el0_svc_common+0x4c/0xfc -[ 534.746661] do_el0_svc+0x60/0xa8 -[ 534.747378] el0_svc+0x2c/0x78 -[ 534.748066] el0t_64_sync_handler+0xb8/0x12c -[ 534.748919] el0t_64_sync+0x18c/0x190 -[ 534.749629] Code: bad PC value -[ 534.750169] ---[ end trace 0000000000000000 ]--- - -2. When I want to create x4 region with command: "cxl create-region -d -decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". 
I got below errors: - -cxl region: create_region: region0: failed to set target3 to mem3 -cxl region: cmd_create_region: created 0 regions - -And kernel log as below: -[ 60.536663] cxl_region region0: config state: 0 -[ 60.536675] cxl_region region0: probe: -6 -[ 60.536696] cxl_acpi ACPI0017:00: decoder0.0: created region0 -[ 60.538251] cxl region0: mem0:endpoint3 decoder3.0 add: -mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 -[ 60.538278] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 -[ 60.538295] cxl region0: ACPI0016:00:port1 decoder1.0 add: -mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 -[ 60.538647] cxl region0: mem1:endpoint4 decoder4.0 add: -mem1:decoder4.0 @ 1 next: none nr_eps: 1 nr_targets: 1 -[ 60.538663] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -mem1:decoder4.0 @ 1 next: mem1 nr_eps: 2 nr_targets: 2 -[ 60.538675] cxl region0: ACPI0016:00:port1 decoder1.0 add: -mem1:decoder4.0 @ 1 next: 0000:0d:00.0 nr_eps: 2 nr_targets: 1 -[ 60.539311] cxl region0: mem2:endpoint5 decoder5.0 add: -mem2:decoder5.0 @ 2 next: none nr_eps: 1 nr_targets: 1 -[ 60.539332] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -mem2:decoder5.0 @ 2 next: mem2 nr_eps: 3 nr_targets: 3 -[ 60.539343] cxl region0: ACPI0016:00:port1 decoder1.0 add: -mem2:decoder5.0 @ 2 next: 0000:0d:00.0 nr_eps: 3 nr_targets: 1 -[ 60.539711] cxl region0: mem3:endpoint6 decoder6.0 add: -mem3:decoder6.0 @ 3 next: none nr_eps: 1 nr_targets: 1 -[ 60.539723] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -mem3:decoder6.0 @ 3 next: mem3 nr_eps: 4 nr_targets: 4 -[ 60.539735] cxl region0: ACPI0016:00:port1 decoder1.0 add: -mem3:decoder6.0 @ 3 next: 0000:0d:00.0 nr_eps: 4 nr_targets: 1 -[ 60.539742] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 -[ 60.539747] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 -for mem0:decoder3.0 @ 0 -[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512 -[ 60.539758] cxl region0: 
0000:0d:00.0:port2 target[0] = 0000:0e:00.0 for mem0:decoder3.0 @ 0
-[   60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at 1
-
-I have tried to write the sysfs nodes manually and got the same errors.
-
-Hope I can get some help here.
-
-Bob
-
-On Fri, 5 Aug 2022 10:20:23 +0800
-Bobo WL <lmw.bobo@gmail.com> wrote:
-
-> Hi list
->
-> I want to test cxl functions in arm64, and found some problems I can't
-> figure out.
-
-Hi Bob,
-
-Glad to see people testing this code.
-
-> My test environment:
->
-> 1. build latest bios from https://github.com/tianocore/edk2.git master
->    branch (cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2)
-> 2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git
->    master branch (846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm
->    support patch:
->    https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/
-> 3. build Linux kernel from
->    https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git preview
->    branch (65fc1c3d26b96002a5aa1f4012fae4dc98fd5683)
-> 4.
build latest ndctl tools from https://github.com/pmem/ndctl
->    create_region branch (8558b394e449779e3a4f3ae90fae77ede0bca159)
->
-> And my qemu test commands:
-> [...snip: same QEMU command line as shown above...]
-> -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-> -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
-
-Probably not related to your problem, but there is a disconnect in QEMU /
-kernel assumptions around the presence of an HDM decoder when a HB only
-has a single root port. The spec allows it to be provided or not as an
-implementation choice.
-The kernel assumes it isn't provided. QEMU assumes it is.
-
-The temporary solution is to throw in a second root port on the HB and not
-connect anything to it. Longer term I may special case this so that the
-particular decoder defaults to pass-through settings in QEMU if there is
-only one root port.
-
-> [...snip: rest of the QEMU command line quoted above...]
->
-> And I have got two problems.
-> 1. When I want to create x1 region with command: "cxl create-region -d
->    decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer
->    reference. Crash log:
->
-> [  534.697324] cxl_region region0: config state: 0
-> [  534.697346] cxl_region region0: probe: -6
-
-Seems odd this is up here. But maybe fine.
-> [...snip: the rest of the crash log, quoted in full above...]
->
-> 2. When I want to create x4 region with command: "cxl create-region -d
->    decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". I got below errors:
->
-> [...snip: error output and kernel log, quoted in full above...]
mem0:decoder3.0 @ 0
-> -[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512
-This looks like an off-by-one that should be fixed in the below-mentioned
-cxl/pending branch. That ig should be 256. Note the fix was
-for a test case with a fat HB and no switch, but it certainly looks
-like this is the same issue.
-
-> -[ 60.539758] cxl region0: 0000:0d:00.0:port2 target[0] =
-> -0000:0e:00.0 for mem0:decoder3.0 @ 0
-> -[ 60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at
-> -1
-> -> -I have tried to write the sysfs nodes manually, got the same errors.
-When stepping through by hand, which sysfs write triggers the crash above?
-
-Not sure it's related, but I've just sent out a fix to the
-target register handling in QEMU.
-https://lore.kernel.org/linux-cxl/20220808122051.14822-1-Jonathan.Cameron@huawei.com/T/#m47ff985412ce44559e6b04d677c302f8cd371330
-I did have one instance last week of triggering what looked to be a race
-condition but
-the stack trace doesn't look related to what you've hit.
-
-It will probably be a few days before I have time to take a look at replicating
-what you have seen.
-
-If you have time, try using the kernel.org cxl/pending branch as there are
-a few additional fixes on there since you sent this email. Optimistic to hope
-this is covered by one of those, but at least it will mean we are trying to
-replicate on the same branch.
-
-Jonathan
-
-
-> -> -Hope I can get some help here.
-> -> -Bob
-
-Hi Jonathan
-
-Thanks for your reply!
-
-On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron
-<Jonathan.Cameron@huawei.com> wrote:
-> -> -Probably not related to your problem, but there is a disconnect in QEMU /
-> -kernel assumptions around the presence of an HDM decoder when a HB only
-> -has a single root port. Spec allows it to be provided or not as an
-> -implementation choice.
-> -Kernel assumes it isn't provided. Qemu assumes it is.
-> -> -The temporary solution is to throw in a second root port on the HB and not -> -connect anything to it. Longer term I may special case this so that the -> -particular -> -decoder defaults to pass through settings in QEMU if there is only one root -> -port. -> -You are right! After adding an extra HB in qemu, I can create a x1 -region successfully. -But have some errors in Nvdimm: - -[ 74.925838] Unknown online node for memory at 0x10000000000, assuming node 0 -[ 74.925846] Unknown target node for memory at 0x10000000000, assuming node 0 -[ 74.927470] nd_region region0: nmem0: is disabled, failing probe - -And x4 region still failed with same errors, using latest cxl/preview -branch don't work. -I have picked "Two CXL emulation fixes" patches in qemu, still not working. - -Bob - -On Tue, 9 Aug 2022 21:07:06 +0800 -Bobo WL <lmw.bobo@gmail.com> wrote: - -> -Hi Jonathan -> -> -Thanks for your reply! -> -> -On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -<Jonathan.Cameron@huawei.com> wrote: -> -> -> -> Probably not related to your problem, but there is a disconnect in QEMU / -> -> kernel assumptionsaround the presence of an HDM decoder when a HB only -> -> has a single root port. Spec allows it to be provided or not as an -> -> implementation choice. -> -> Kernel assumes it isn't provide. Qemu assumes it is. -> -> -> -> The temporary solution is to throw in a second root port on the HB and not -> -> connect anything to it. Longer term I may special case this so that the -> -> particular -> -> decoder defaults to pass through settings in QEMU if there is only one root -> -> port. -> -> -> -> -You are right! After adding an extra HB in qemu, I can create a x1 -> -region successfully. -> -But have some errors in Nvdimm: -> -> -[ 74.925838] Unknown online node for memory at 0x10000000000, assuming node > 0 -> -[ 74.925846] Unknown target node for memory at 0x10000000000, assuming node > 0 -> -[ 74.927470] nd_region region0: nmem0: is disabled, failing probe -Ah. 
I've seen this one, but not chased it down yet. Was on my todo list to -chase -down. Once I reach this state I can verify the HDM Decode is correct which is -what -I've been using to test (Which wasn't true until earlier this week). -I'm currently testing via devmem, more for historical reasons than because it -makes -that much sense anymore. - -> -> -And x4 region still failed with same errors, using latest cxl/preview -> -branch don't work. -> -I have picked "Two CXL emulation fixes" patches in qemu, still not working. -> -> -Bob - -On Tue, 9 Aug 2022 17:08:25 +0100 -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: - -> -On Tue, 9 Aug 2022 21:07:06 +0800 -> -Bobo WL <lmw.bobo@gmail.com> wrote: -> -> -> Hi Jonathan -> -> -> -> Thanks for your reply! -> -> -> -> On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> <Jonathan.Cameron@huawei.com> wrote: -> -> > -> -> > Probably not related to your problem, but there is a disconnect in QEMU / -> -> > kernel assumptionsaround the presence of an HDM decoder when a HB only -> -> > has a single root port. Spec allows it to be provided or not as an -> -> > implementation choice. -> -> > Kernel assumes it isn't provide. Qemu assumes it is. -> -> > -> -> > The temporary solution is to throw in a second root port on the HB and not -> -> > connect anything to it. Longer term I may special case this so that the -> -> > particular -> -> > decoder defaults to pass through settings in QEMU if there is only one -> -> > root port. -> -> > -> -> -> -> You are right! After adding an extra HB in qemu, I can create a x1 -> -> region successfully. -> -> But have some errors in Nvdimm: -> -> -> -> [ 74.925838] Unknown online node for memory at 0x10000000000, assuming -> -> node 0 -> -> [ 74.925846] Unknown target node for memory at 0x10000000000, assuming -> -> node 0 -> -> [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> -Ah. I've seen this one, but not chased it down yet. 
Was on my todo list to -> -chase -> -down. Once I reach this state I can verify the HDM Decode is correct which is -> -what -> -I've been using to test (Which wasn't true until earlier this week). -> -I'm currently testing via devmem, more for historical reasons than because it -> -makes -> -that much sense anymore. -*embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -I'd forgotten that was still on the todo list. I don't think it will -be particularly hard to do and will take a look in next few days. - -Very very indirectly this error is causing a driver probe fail that means that -we hit a code path that has a rather odd looking check on NDD_LABELING. -Should not have gotten near that path though - hence the problem is actually -when we call cxl_pmem_get_config_data() and it returns an error because -we haven't fully connected up the command in QEMU. - -Jonathan - - -> -> -> -> -> And x4 region still failed with same errors, using latest cxl/preview -> -> branch don't work. -> -> I have picked "Two CXL emulation fixes" patches in qemu, still not working. -> -> -> -> Bob - -On Thu, 11 Aug 2022 18:08:57 +0100 -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: - -> -On Tue, 9 Aug 2022 17:08:25 +0100 -> -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> On Tue, 9 Aug 2022 21:07:06 +0800 -> -> Bobo WL <lmw.bobo@gmail.com> wrote: -> -> -> -> > Hi Jonathan -> -> > -> -> > Thanks for your reply! -> -> > -> -> > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > <Jonathan.Cameron@huawei.com> wrote: -> -> > > -> -> > > Probably not related to your problem, but there is a disconnect in QEMU -> -> > > / -> -> > > kernel assumptionsaround the presence of an HDM decoder when a HB only -> -> > > has a single root port. Spec allows it to be provided or not as an -> -> > > implementation choice. -> -> > > Kernel assumes it isn't provide. Qemu assumes it is. 
-> -> > > -> -> > > The temporary solution is to throw in a second root port on the HB and -> -> > > not -> -> > > connect anything to it. Longer term I may special case this so that -> -> > > the particular -> -> > > decoder defaults to pass through settings in QEMU if there is only one -> -> > > root port. -> -> > > -> -> > -> -> > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > region successfully. -> -> > But have some errors in Nvdimm: -> -> > -> -> > [ 74.925838] Unknown online node for memory at 0x10000000000, assuming -> -> > node 0 -> -> > [ 74.925846] Unknown target node for memory at 0x10000000000, assuming -> -> > node 0 -> -> > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> -> -> Ah. I've seen this one, but not chased it down yet. Was on my todo list to -> -> chase -> -> down. Once I reach this state I can verify the HDM Decode is correct which -> -> is what -> -> I've been using to test (Which wasn't true until earlier this week). -> -> I'm currently testing via devmem, more for historical reasons than because -> -> it makes -> -> that much sense anymore. -> -> -*embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -I'd forgotten that was still on the todo list. I don't think it will -> -be particularly hard to do and will take a look in next few days. -> -> -Very very indirectly this error is causing a driver probe fail that means that -> -we hit a code path that has a rather odd looking check on NDD_LABELING. -> -Should not have gotten near that path though - hence the problem is actually -> -when we call cxl_pmem_get_config_data() and it returns an error because -> -we haven't fully connected up the command in QEMU. -So a least one bug in QEMU. We were not supporting variable length payloads on -mailbox -inputs (but were on outputs). That hasn't mattered until we get to LSA writes. -We just need to relax condition on the supplied length. 
- -diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -index c352a935c4..fdda9529fe 100644 ---- a/hw/cxl/cxl-mailbox-utils.c -+++ b/hw/cxl/cxl-mailbox-utils.c -@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) - cxl_cmd = &cxl_cmd_set[set][cmd]; - h = cxl_cmd->handler; - if (h) { -- if (len == cxl_cmd->in) { -+ if (len == cxl_cmd->in || !cxl_cmd->in) { - cxl_cmd->payload = cxl_dstate->mbox_reg_state + - A_CXL_DEV_CMD_PAYLOAD; - ret = (*h)(cxl_cmd, cxl_dstate, &len); - - -This lets the nvdimm/region probe fine, but I'm getting some issues with -namespace capacity so I'll look at what is causing that next. -Unfortunately I'm not that familiar with the driver/nvdimm side of things -so it's take a while to figure out what kicks off what! - -Jonathan - -> -> -Jonathan -> -> -> -> -> -> > -> -> > And x4 region still failed with same errors, using latest cxl/preview -> -> > branch don't work. -> -> > I have picked "Two CXL emulation fixes" patches in qemu, still not -> -> > working. -> -> > -> -> > Bob -> -> - -Jonathan Cameron wrote: -> -On Thu, 11 Aug 2022 18:08:57 +0100 -> -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> On Tue, 9 Aug 2022 17:08:25 +0100 -> -> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> -> > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > -> -> > > Hi Jonathan -> -> > > -> -> > > Thanks for your reply! -> -> > > -> -> > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > -> -> > > > Probably not related to your problem, but there is a disconnect in -> -> > > > QEMU / -> -> > > > kernel assumptionsaround the presence of an HDM decoder when a HB only -> -> > > > has a single root port. Spec allows it to be provided or not as an -> -> > > > implementation choice. -> -> > > > Kernel assumes it isn't provide. Qemu assumes it is. 
-> -> > > > -> -> > > > The temporary solution is to throw in a second root port on the HB -> -> > > > and not -> -> > > > connect anything to it. Longer term I may special case this so that -> -> > > > the particular -> -> > > > decoder defaults to pass through settings in QEMU if there is only -> -> > > > one root port. -> -> > > > -> -> > > -> -> > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > region successfully. -> -> > > But have some errors in Nvdimm: -> -> > > -> -> > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > assuming node 0 -> -> > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > assuming node 0 -> -> > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > -> -> > Ah. I've seen this one, but not chased it down yet. Was on my todo list -> -> > to chase -> -> > down. Once I reach this state I can verify the HDM Decode is correct -> -> > which is what -> -> > I've been using to test (Which wasn't true until earlier this week). -> -> > I'm currently testing via devmem, more for historical reasons than -> -> > because it makes -> -> > that much sense anymore. -> -> -> -> *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> I'd forgotten that was still on the todo list. I don't think it will -> -> be particularly hard to do and will take a look in next few days. -> -> -> -> Very very indirectly this error is causing a driver probe fail that means -> -> that -> -> we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> Should not have gotten near that path though - hence the problem is actually -> -> when we call cxl_pmem_get_config_data() and it returns an error because -> -> we haven't fully connected up the command in QEMU. -> -> -So a least one bug in QEMU. We were not supporting variable length payloads -> -on mailbox -> -inputs (but were on outputs). That hasn't mattered until we get to LSA -> -writes. 
-> -We just need to relax condition on the supplied length. -> -> -diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -index c352a935c4..fdda9529fe 100644 -> ---- a/hw/cxl/cxl-mailbox-utils.c -> -+++ b/hw/cxl/cxl-mailbox-utils.c -> -@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -cxl_cmd = &cxl_cmd_set[set][cmd]; -> -h = cxl_cmd->handler; -> -if (h) { -> -- if (len == cxl_cmd->in) { -> -+ if (len == cxl_cmd->in || !cxl_cmd->in) { -> -cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -A_CXL_DEV_CMD_PAYLOAD; -> -ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> -> -This lets the nvdimm/region probe fine, but I'm getting some issues with -> -namespace capacity so I'll look at what is causing that next. -> -Unfortunately I'm not that familiar with the driver/nvdimm side of things -> -so it's take a while to figure out what kicks off what! -The whirlwind tour is that 'struct nd_region' instances that represent a -persitent memory address range are composed of one more mappings of -'struct nvdimm' objects. The nvdimm object is driven by the dimm driver -in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking -the dimm (if locked) and interrogating the label area to look for -namespace labels. - -The label command calls are routed to the '->ndctl()' callback that was -registered when the CXL nvdimm_bus_descriptor was created. That callback -handles both 'bus' scope calls, currently none for CXL, and per nvdimm -calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands -to CXL commands. - -The 'struct nvdimm' objects that the CXL side registers have the -NDD_LABELING flag set which means that namespaces need to be explicitly -created / provisioned from region capacity. Otherwise, if -drivers/nvdimm/dimm.c does not find a namespace-label-index block then -the region reverts to label-less mode and a default namespace equal to -the size of the region is instantiated. 
- -If you are seeing small mismatches in namespace capacity then it may -just be the fact that by default 'ndctl create-namespace' results in an -'fsdax' mode namespace which just means that it is a block device where -1.5% of the capacity is reserved for 'struct page' metadata. You should -be able to see namespace capacity == region capacity by doing "ndctl -create-namespace -m raw", and disable DAX operation. - -Hope that helps. - -On Fri, 12 Aug 2022 09:03:02 -0700 -Dan Williams <dan.j.williams@intel.com> wrote: - -> -Jonathan Cameron wrote: -> -> On Thu, 11 Aug 2022 18:08:57 +0100 -> -> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> -> > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > -> -> > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > -> -> > > > Hi Jonathan -> -> > > > -> -> > > > Thanks for your reply! -> -> > > > -> -> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > -> -> > > > > Probably not related to your problem, but there is a disconnect in -> -> > > > > QEMU / -> -> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB -> -> > > > > only -> -> > > > > has a single root port. Spec allows it to be provided or not as an -> -> > > > > implementation choice. -> -> > > > > Kernel assumes it isn't provide. Qemu assumes it is. -> -> > > > > -> -> > > > > The temporary solution is to throw in a second root port on the HB -> -> > > > > and not -> -> > > > > connect anything to it. Longer term I may special case this so -> -> > > > > that the particular -> -> > > > > decoder defaults to pass through settings in QEMU if there is only -> -> > > > > one root port. -> -> > > > > -> -> > > > -> -> > > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > > region successfully. 
-> -> > > > But have some errors in Nvdimm: -> -> > > > -> -> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > > > -> -> > > -> -> > > Ah. I've seen this one, but not chased it down yet. Was on my todo -> -> > > list to chase -> -> > > down. Once I reach this state I can verify the HDM Decode is correct -> -> > > which is what -> -> > > I've been using to test (Which wasn't true until earlier this week). -> -> > > I'm currently testing via devmem, more for historical reasons than -> -> > > because it makes -> -> > > that much sense anymore. -> -> > -> -> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> > I'd forgotten that was still on the todo list. I don't think it will -> -> > be particularly hard to do and will take a look in next few days. -> -> > -> -> > Very very indirectly this error is causing a driver probe fail that means -> -> > that -> -> > we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> > Should not have gotten near that path though - hence the problem is -> -> > actually -> -> > when we call cxl_pmem_get_config_data() and it returns an error because -> -> > we haven't fully connected up the command in QEMU. -> -> -> -> So a least one bug in QEMU. We were not supporting variable length payloads -> -> on mailbox -> -> inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> writes. -> -> We just need to relax condition on the supplied length. 
-> -> -> -> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> index c352a935c4..fdda9529fe 100644 -> -> --- a/hw/cxl/cxl-mailbox-utils.c -> -> +++ b/hw/cxl/cxl-mailbox-utils.c -> -> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> h = cxl_cmd->handler; -> -> if (h) { -> -> - if (len == cxl_cmd->in) { -> -> + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -> cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -> A_CXL_DEV_CMD_PAYLOAD; -> -> ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> -> -> -> -> This lets the nvdimm/region probe fine, but I'm getting some issues with -> -> namespace capacity so I'll look at what is causing that next. -> -> Unfortunately I'm not that familiar with the driver/nvdimm side of things -> -> so it's take a while to figure out what kicks off what! -> -> -The whirlwind tour is that 'struct nd_region' instances that represent a -> -persitent memory address range are composed of one more mappings of -> -'struct nvdimm' objects. The nvdimm object is driven by the dimm driver -> -in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking -> -the dimm (if locked) and interrogating the label area to look for -> -namespace labels. -> -> -The label command calls are routed to the '->ndctl()' callback that was -> -registered when the CXL nvdimm_bus_descriptor was created. That callback -> -handles both 'bus' scope calls, currently none for CXL, and per nvdimm -> -calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands -> -to CXL commands. -> -> -The 'struct nvdimm' objects that the CXL side registers have the -> -NDD_LABELING flag set which means that namespaces need to be explicitly -> -created / provisioned from region capacity. 
Otherwise, if -> -drivers/nvdimm/dimm.c does not find a namespace-label-index block then -> -the region reverts to label-less mode and a default namespace equal to -> -the size of the region is instantiated. -> -> -If you are seeing small mismatches in namespace capacity then it may -> -just be the fact that by default 'ndctl create-namespace' results in an -> -'fsdax' mode namespace which just means that it is a block device where -> -1.5% of the capacity is reserved for 'struct page' metadata. You should -> -be able to see namespace capacity == region capacity by doing "ndctl -> -create-namespace -m raw", and disable DAX operation. -Currently ndctl create-namespace crashes qemu ;) -Which isn't ideal! - -> -> -Hope that helps. -Got me looking at the right code. Thanks! - -Jonathan - -On Fri, 12 Aug 2022 17:15:09 +0100 -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: - -> -On Fri, 12 Aug 2022 09:03:02 -0700 -> -Dan Williams <dan.j.williams@intel.com> wrote: -> -> -> Jonathan Cameron wrote: -> -> > On Thu, 11 Aug 2022 18:08:57 +0100 -> -> > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> > -> -> > > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > > -> -> > > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > > -> -> > > > > Hi Jonathan -> -> > > > > -> -> > > > > Thanks for your reply! -> -> > > > > -> -> > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > > -> -> > > > > > Probably not related to your problem, but there is a disconnect -> -> > > > > > in QEMU / -> -> > > > > > kernel assumptionsaround the presence of an HDM decoder when a HB -> -> > > > > > only -> -> > > > > > has a single root port. Spec allows it to be provided or not as -> -> > > > > > an implementation choice. -> -> > > > > > Kernel assumes it isn't provide. Qemu assumes it is. 
-> -> > > > > > -> -> > > > > > The temporary solution is to throw in a second root port on the -> -> > > > > > HB and not -> -> > > > > > connect anything to it. Longer term I may special case this so -> -> > > > > > that the particular -> -> > > > > > decoder defaults to pass through settings in QEMU if there is -> -> > > > > > only one root port. -> -> > > > > > -> -> > > > > -> -> > > > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > > > region successfully. -> -> > > > > But have some errors in Nvdimm: -> -> > > > > -> -> > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > > > assuming node 0 -> -> > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > > > assuming node 0 -> -> > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > > > > -> -> > > > -> -> > > > Ah. I've seen this one, but not chased it down yet. Was on my todo -> -> > > > list to chase -> -> > > > down. Once I reach this state I can verify the HDM Decode is correct -> -> > > > which is what -> -> > > > I've been using to test (Which wasn't true until earlier this week). -> -> > > > I'm currently testing via devmem, more for historical reasons than -> -> > > > because it makes -> -> > > > that much sense anymore. -> -> > > -> -> > > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> > > I'd forgotten that was still on the todo list. I don't think it will -> -> > > be particularly hard to do and will take a look in next few days. -> -> > > -> -> > > Very very indirectly this error is causing a driver probe fail that -> -> > > means that -> -> > > we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> > > Should not have gotten near that path though - hence the problem is -> -> > > actually -> -> > > when we call cxl_pmem_get_config_data() and it returns an error because -> -> > > we haven't fully connected up the command in QEMU. 
-> -> > -> -> > So a least one bug in QEMU. We were not supporting variable length -> -> > payloads on mailbox -> -> > inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> > writes. -> -> > We just need to relax condition on the supplied length. -> -> > -> -> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> > index c352a935c4..fdda9529fe 100644 -> -> > --- a/hw/cxl/cxl-mailbox-utils.c -> -> > +++ b/hw/cxl/cxl-mailbox-utils.c -> -> > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> > cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> > h = cxl_cmd->handler; -> -> > if (h) { -> -> > - if (len == cxl_cmd->in) { -> -> > + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -> > cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -> > A_CXL_DEV_CMD_PAYLOAD; -> -> > ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> > -> -> > -> -> > This lets the nvdimm/region probe fine, but I'm getting some issues with -> -> > namespace capacity so I'll look at what is causing that next. -> -> > Unfortunately I'm not that familiar with the driver/nvdimm side of things -> -> > so it's take a while to figure out what kicks off what! -> -> -> -> The whirlwind tour is that 'struct nd_region' instances that represent a -> -> persitent memory address range are composed of one more mappings of -> -> 'struct nvdimm' objects. The nvdimm object is driven by the dimm driver -> -> in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking -> -> the dimm (if locked) and interrogating the label area to look for -> -> namespace labels. -> -> -> -> The label command calls are routed to the '->ndctl()' callback that was -> -> registered when the CXL nvdimm_bus_descriptor was created. That callback -> -> handles both 'bus' scope calls, currently none for CXL, and per nvdimm -> -> calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands -> -> to CXL commands. 
-> -> -> -> The 'struct nvdimm' objects that the CXL side registers have the -> -> NDD_LABELING flag set which means that namespaces need to be explicitly -> -> created / provisioned from region capacity. Otherwise, if -> -> drivers/nvdimm/dimm.c does not find a namespace-label-index block then -> -> the region reverts to label-less mode and a default namespace equal to -> -> the size of the region is instantiated. -> -> -> -> If you are seeing small mismatches in namespace capacity then it may -> -> just be the fact that by default 'ndctl create-namespace' results in an -> -> 'fsdax' mode namespace which just means that it is a block device where -> -> 1.5% of the capacity is reserved for 'struct page' metadata. You should -> -> be able to see namespace capacity == region capacity by doing "ndctl -> -> create-namespace -m raw", and disable DAX operation. -> -> -Currently ndctl create-namespace crashes qemu ;) -> -Which isn't ideal! -> -Found a cause for this one. Mailbox payload may be as small as 256 bytes. -We have code in kernel sanity checking that output payload fits in the -mailbox, but nothing on the input payload. Symptom is that we write just -off the end whatever size the payload is. Note doing this shouldn't crash -qemu - so I need to fix a range check somewhere. - -I think this is because cxl_pmem_get_config_size() returns the mailbox -payload size as being the available LSA size, forgetting to remove the -size of the headers on the set_lsa side of things. -https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/pmem.c?h=next#n110 -I've hacked the max_payload to be -8 - -Now we still don't succeed in creating the namespace, but bonus is it doesn't -crash any more. - - -Jonathan - - - -> -> -> -> Hope that helps. -> -Got me looking at the right code. Thanks! 
-> -> -Jonathan -> -> - -On Mon, 15 Aug 2022 15:18:09 +0100 -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: - -> -On Fri, 12 Aug 2022 17:15:09 +0100 -> -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> On Fri, 12 Aug 2022 09:03:02 -0700 -> -> Dan Williams <dan.j.williams@intel.com> wrote: -> -> -> -> > Jonathan Cameron wrote: -> -> > > On Thu, 11 Aug 2022 18:08:57 +0100 -> -> > > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> > > -> -> > > > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > > > -> -> > > > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > > > -> -> > > > > > Hi Jonathan -> -> > > > > > -> -> > > > > > Thanks for your reply! -> -> > > > > > -> -> > > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > > > -> -> > > > > > > Probably not related to your problem, but there is a disconnect -> -> > > > > > > in QEMU / -> -> > > > > > > kernel assumptionsaround the presence of an HDM decoder when a -> -> > > > > > > HB only -> -> > > > > > > has a single root port. Spec allows it to be provided or not as -> -> > > > > > > an implementation choice. -> -> > > > > > > Kernel assumes it isn't provide. Qemu assumes it is. -> -> > > > > > > -> -> > > > > > > The temporary solution is to throw in a second root port on the -> -> > > > > > > HB and not -> -> > > > > > > connect anything to it. Longer term I may special case this so -> -> > > > > > > that the particular -> -> > > > > > > decoder defaults to pass through settings in QEMU if there is -> -> > > > > > > only one root port. -> -> > > > > > > -> -> > > > > > -> -> > > > > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > > > > region successfully. 
-> -> > > > > > But have some errors in Nvdimm: -> -> > > > > > -> -> > > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > > > > assuming node 0 -> -> > > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > > > > assuming node 0 -> -> > > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing -> -> > > > > > probe -> -> > > > > -> -> > > > > Ah. I've seen this one, but not chased it down yet. Was on my todo -> -> > > > > list to chase -> -> > > > > down. Once I reach this state I can verify the HDM Decode is -> -> > > > > correct which is what -> -> > > > > I've been using to test (Which wasn't true until earlier this -> -> > > > > week). -> -> > > > > I'm currently testing via devmem, more for historical reasons than -> -> > > > > because it makes -> -> > > > > that much sense anymore. -> -> > > > -> -> > > > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> > > > I'd forgotten that was still on the todo list. I don't think it will -> -> > > > be particularly hard to do and will take a look in next few days. -> -> > > > -> -> > > > Very very indirectly this error is causing a driver probe fail that -> -> > > > means that -> -> > > > we hit a code path that has a rather odd looking check on -> -> > > > NDD_LABELING. -> -> > > > Should not have gotten near that path though - hence the problem is -> -> > > > actually -> -> > > > when we call cxl_pmem_get_config_data() and it returns an error -> -> > > > because -> -> > > > we haven't fully connected up the command in QEMU. -> -> > > -> -> > > So a least one bug in QEMU. We were not supporting variable length -> -> > > payloads on mailbox -> -> > > inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> > > writes. -> -> > > We just need to relax condition on the supplied length. 
-> -> > > -> -> > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> > > index c352a935c4..fdda9529fe 100644 -> -> > > --- a/hw/cxl/cxl-mailbox-utils.c -> -> > > +++ b/hw/cxl/cxl-mailbox-utils.c -> -> > > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> > > cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> > > h = cxl_cmd->handler; -> -> > > if (h) { -> -> > > - if (len == cxl_cmd->in) { -> -> > > + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -> > > cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -> > > A_CXL_DEV_CMD_PAYLOAD; -> -> > > ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> > > -> -> > > -> -> > > This lets the nvdimm/region probe fine, but I'm getting some issues with -> -> > > namespace capacity so I'll look at what is causing that next. -> -> > > Unfortunately I'm not that familiar with the driver/nvdimm side of -> -> > > things -> -> > > so it's take a while to figure out what kicks off what! -> -> > -> -> > The whirlwind tour is that 'struct nd_region' instances that represent a -> -> > persitent memory address range are composed of one more mappings of -> -> > 'struct nvdimm' objects. The nvdimm object is driven by the dimm driver -> -> > in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking -> -> > the dimm (if locked) and interrogating the label area to look for -> -> > namespace labels. -> -> > -> -> > The label command calls are routed to the '->ndctl()' callback that was -> -> > registered when the CXL nvdimm_bus_descriptor was created. That callback -> -> > handles both 'bus' scope calls, currently none for CXL, and per nvdimm -> -> > calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands -> -> > to CXL commands. -> -> > -> -> > The 'struct nvdimm' objects that the CXL side registers have the -> -> > NDD_LABELING flag set which means that namespaces need to be explicitly -> -> > created / provisioned from region capacity. 
Otherwise, if -> -> > drivers/nvdimm/dimm.c does not find a namespace-label-index block then -> -> > the region reverts to label-less mode and a default namespace equal to -> -> > the size of the region is instantiated. -> -> > -> -> > If you are seeing small mismatches in namespace capacity then it may -> -> > just be the fact that by default 'ndctl create-namespace' results in an -> -> > 'fsdax' mode namespace which just means that it is a block device where -> -> > 1.5% of the capacity is reserved for 'struct page' metadata. You should -> -> > be able to see namespace capacity == region capacity by doing "ndctl -> -> > create-namespace -m raw", and disable DAX operation. -> -> -> -> Currently ndctl create-namespace crashes qemu ;) -> -> Which isn't ideal! -> -> -> -> -Found a cause for this one. Mailbox payload may be as small as 256 bytes. -> -We have code in kernel sanity checking that output payload fits in the -> -mailbox, but nothing on the input payload. Symptom is that we write just -> -off the end whatever size the payload is. Note doing this shouldn't crash -> -qemu - so I need to fix a range check somewhere. -> -> -I think this is because cxl_pmem_get_config_size() returns the mailbox -> -payload size as being the available LSA size, forgetting to remove the -> -size of the headers on the set_lsa side of things. -> -https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/pmem.c?h=next#n110 -> -> -I've hacked the max_payload to be -8 -> -> -Now we still don't succeed in creating the namespace, but bonus is it doesn't -> -crash any more. -In the interests of defensive / correct handling from QEMU I took a -look into why it was crashing. Turns out that providing a NULL write callback -for -the memory device region (that the above overlarge write was spilling into) -isn't -a safe thing to do. Needs a stub. Oops. 
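The crash mechanism - a guest write dispatched through a NULL callback - and the no-op stub that avoids it can be modelled in miniature like this. It is a simplified sketch, not QEMU's actual MemoryRegionOps machinery; all names here are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified model (not QEMU's real MemoryRegionOps): a device region
 * whose ops table has a NULL write callback is fine until a guest
 * write lands there -- then the dispatcher would call through NULL. */
typedef struct {
    uint64_t (*read)(uint64_t addr, unsigned size);
    void (*write)(uint64_t addr, uint64_t val, unsigned size);
} region_ops;

static uint64_t dev_read(uint64_t addr, unsigned size)
{
    (void)addr; (void)size;
    return 0;
}

/* The stub that was missing: accept and discard the write. */
static void dev_write_stub(uint64_t addr, uint64_t val, unsigned size)
{
    (void)val; (void)size;
    fprintf(stderr, "discarded guest write at 0x%" PRIx64 "\n", addr);
}

static const region_ops dev_ops = {
    .read  = dev_read,
    .write = dev_write_stub,   /* previously NULL -> crash on write */
};

/* Dispatcher guarding against the NULL case instead of calling it. */
static int dispatch_write(const region_ops *ops, uint64_t addr,
                          uint64_t val, unsigned size)
{
    if (!ops->write)
        return -1;             /* would have been a call through NULL */
    ops->write(addr, val, size);
    return 0;
}
```

An assert at region-registration time, as suggested below in the thread, would move the failure from guest access to device creation.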
- -On plus side we might never have noticed this was going wrong without the crash -*silver lining in every cloud* - -Fix to follow... - -Jonathan - - -> -> -> -Jonathan -> -> -> -> -> > -> -> > Hope that helps. -> -> Got me looking at the right code. Thanks! -> -> -> -> Jonathan -> -> -> -> -> -> - -On Mon, 15 Aug 2022 at 15:55, Jonathan Cameron via <qemu-arm@nongnu.org> wrote: -> -In the interests of defensive / correct handling from QEMU I took a -> -look into why it was crashing. Turns out that providing a NULL write -> -callback for -> -the memory device region (that the above overlarge write was spilling into) -> -isn't -> -a safe thing to do. Needs a stub. Oops. -Yeah. We've talked before about adding an assert so that that kind of -"missing function" bug is caught at device creation rather than only -if the guest tries to access the device, but we never quite got around -to it... - --- PMM - -On Fri, 12 Aug 2022 16:44:03 +0100 -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: - -> -On Thu, 11 Aug 2022 18:08:57 +0100 -> -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> On Tue, 9 Aug 2022 17:08:25 +0100 -> -> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> -> > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > -> -> > > Hi Jonathan -> -> > > -> -> > > Thanks for your reply! -> -> > > -> -> > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > -> -> > > > Probably not related to your problem, but there is a disconnect in -> -> > > > QEMU / -> -> > > > kernel assumptionsaround the presence of an HDM decoder when a HB only -> -> > > > has a single root port. Spec allows it to be provided or not as an -> -> > > > implementation choice. -> -> > > > Kernel assumes it isn't provide. Qemu assumes it is. 
-> -> > > > -> -> > > > The temporary solution is to throw in a second root port on the HB -> -> > > > and not -> -> > > > connect anything to it. Longer term I may special case this so that -> -> > > > the particular -> -> > > > decoder defaults to pass through settings in QEMU if there is only -> -> > > > one root port. -> -> > > > -> -> > > -> -> > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > region successfully. -> -> > > But have some errors in Nvdimm: -> -> > > -> -> > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > assuming node 0 -> -> > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > assuming node 0 -> -> > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > > -> -> > -> -> > Ah. I've seen this one, but not chased it down yet. Was on my todo list -> -> > to chase -> -> > down. Once I reach this state I can verify the HDM Decode is correct -> -> > which is what -> -> > I've been using to test (Which wasn't true until earlier this week). -> -> > I'm currently testing via devmem, more for historical reasons than -> -> > because it makes -> -> > that much sense anymore. -> -> -> -> *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> I'd forgotten that was still on the todo list. I don't think it will -> -> be particularly hard to do and will take a look in next few days. -> -> -> -> Very very indirectly this error is causing a driver probe fail that means -> -> that -> -> we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> Should not have gotten near that path though - hence the problem is actually -> -> when we call cxl_pmem_get_config_data() and it returns an error because -> -> we haven't fully connected up the command in QEMU. -> -> -So a least one bug in QEMU. We were not supporting variable length payloads -> -on mailbox -> -inputs (but were on outputs). 
That hasn't mattered until we get to LSA -> -writes. -> -We just need to relax condition on the supplied length. -> -> -diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -index c352a935c4..fdda9529fe 100644 -> ---- a/hw/cxl/cxl-mailbox-utils.c -> -+++ b/hw/cxl/cxl-mailbox-utils.c -> -@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -cxl_cmd = &cxl_cmd_set[set][cmd]; -> -h = cxl_cmd->handler; -> -if (h) { -> -- if (len == cxl_cmd->in) { -> -+ if (len == cxl_cmd->in || !cxl_cmd->in) { -That fix is wrong, as we use ~0 as the placeholder for variable payloads, not 0. - -With that fixed we hit new fun paths - after some errors we get the worrying -trace below. Not totally sure, but it looks like a failure in an error cleanup -path. -I'll chase down the error source, but even then this is probably triggerable by -a hardware problem or similar. There are some bonus prints in here from me -chasing -error paths, but it's otherwise just cxl/next + the fix I posted earlier today. - -[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) -[ 69.920108] nd_region_probe -[ 69.920623] ------------[ cut here ]------------ -[ 69.920675] refcount_t: addition on 0; use-after-free.
-[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 -refcount_warn_saturate+0xa0/0x144 -[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi -cxl_core -[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 -[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 -[ 69.931482] Workqueue: events_unbound async_run_entry_fn -[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) -[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 -[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 -[ 69.936541] sp : ffff80000890b960 -[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: 0000000000000000 -[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: 0000000000000000 -[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: ffff0000c5254800 -[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: ffffffffffffffff -[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: 0000000000000000 -[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: 657466612d657375 -[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : ffffa54a8f63d288 -[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : 00000000fffff31e -[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : ffff5ab66e5ef000 -root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : -0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 -[ 69.957098] Call trace: -[ 69.957959] refcount_warn_saturate+0xa0/0x144 -[ 69.958773] get_ndd+0x5c/0x80 -[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 -[ 69.960253] nd_region_probe+0x100/0x290 -[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 -[ 69.962087] really_probe+0x19c/0x3f0 -[ 69.962620] __driver_probe_device+0x11c/0x190 -[ 69.963258] driver_probe_device+0x44/0xf4 -[ 69.963773] __device_attach_driver+0xa4/0x140 -[ 69.964471] bus_for_each_drv+0x84/0xe0 -[ 69.965068] __device_attach+0xb0/0x1f0 -[ 69.966101] 
device_initial_probe+0x20/0x30 -[ 69.967142] bus_probe_device+0xa4/0xb0 -[ 69.968104] device_add+0x3e8/0x910 -[ 69.969111] nd_async_device_register+0x24/0x74 -[ 69.969928] async_run_entry_fn+0x40/0x150 -[ 69.970725] process_one_work+0x1dc/0x450 -[ 69.971796] worker_thread+0x154/0x450 -[ 69.972700] kthread+0x118/0x120 -[ 69.974141] ret_from_fork+0x10/0x20 -[ 69.975141] ---[ end trace 0000000000000000 ]--- -[ 70.117887] Into nd_namespace_pmem_set_resource() - -> -cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -A_CXL_DEV_CMD_PAYLOAD; -> -ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> -> -This lets the nvdimm/region probe fine, but I'm getting some issues with -> -namespace capacity so I'll look at what is causing that next. -> -Unfortunately I'm not that familiar with the driver/nvdimm side of things -> -so it's take a while to figure out what kicks off what! -> -> -Jonathan -> -> -> -> -> Jonathan -> -> -> -> -> -> > -> -> > > -> -> > > And x4 region still failed with same errors, using latest cxl/preview -> -> > > branch don't work. -> -> > > I have picked "Two CXL emulation fixes" patches in qemu, still not -> -> > > working. -> -> > > -> -> > > Bob -> -> -> -> -> - -On Mon, 15 Aug 2022 18:04:44 +0100 -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: - -> -On Fri, 12 Aug 2022 16:44:03 +0100 -> -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> On Thu, 11 Aug 2022 18:08:57 +0100 -> -> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> -> > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > -> -> > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > -> -> > > > Hi Jonathan -> -> > > > -> -> > > > Thanks for your reply! 
-> -> > > > -> -> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > -> -> > > > > Probably not related to your problem, but there is a disconnect in -> -> > > > > QEMU / -> -> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB -> -> > > > > only -> -> > > > > has a single root port. Spec allows it to be provided or not as an -> -> > > > > implementation choice. -> -> > > > > Kernel assumes it isn't provide. Qemu assumes it is. -> -> > > > > -> -> > > > > The temporary solution is to throw in a second root port on the HB -> -> > > > > and not -> -> > > > > connect anything to it. Longer term I may special case this so -> -> > > > > that the particular -> -> > > > > decoder defaults to pass through settings in QEMU if there is only -> -> > > > > one root port. -> -> > > > > -> -> > > > -> -> > > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > > region successfully. -> -> > > > But have some errors in Nvdimm: -> -> > > > -> -> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > > > -> -> > > -> -> > > Ah. I've seen this one, but not chased it down yet. Was on my todo -> -> > > list to chase -> -> > > down. Once I reach this state I can verify the HDM Decode is correct -> -> > > which is what -> -> > > I've been using to test (Which wasn't true until earlier this week). -> -> > > I'm currently testing via devmem, more for historical reasons than -> -> > > because it makes -> -> > > that much sense anymore. -> -> > -> -> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> > I'd forgotten that was still on the todo list. 
I don't think it will -> -> > be particularly hard to do and will take a look in next few days. -> -> > -> -> > Very very indirectly this error is causing a driver probe fail that means -> -> > that -> -> > we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> > Should not have gotten near that path though - hence the problem is -> -> > actually -> -> > when we call cxl_pmem_get_config_data() and it returns an error because -> -> > we haven't fully connected up the command in QEMU. -> -> -> -> So a least one bug in QEMU. We were not supporting variable length payloads -> -> on mailbox -> -> inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> writes. -> -> We just need to relax condition on the supplied length. -> -> -> -> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> index c352a935c4..fdda9529fe 100644 -> -> --- a/hw/cxl/cxl-mailbox-utils.c -> -> +++ b/hw/cxl/cxl-mailbox-utils.c -> -> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> h = cxl_cmd->handler; -> -> if (h) { -> -> - if (len == cxl_cmd->in) { -> -> + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. -The cause of the error is a failure in GET_LSA: the payload length is wrong in -QEMU, but that was hidden previously by my wrong -fix here. Probably still a good idea to inject an error in GET_LSA and chase -down the refcount issue.
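For what it's worth, the intended semantics of that length check - an exact match against the command's declared input size, unless the command uses the ~0 sentinel for a variable-length payload - can be pinned down in isolation. This is a simplified sketch; the struct is cut down from the quoted hw/cxl/cxl-mailbox-utils.c code and the field width is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down model of the mailbox command descriptor: 'in' is the
 * expected input payload length, with (uint32_t)~0 meaning the
 * command accepts a variable-length input.  Field width here is an
 * assumption for the sketch. */
struct cxl_cmd {
    uint32_t in;
};

/* The check under discussion: exact length match, special-casing the
 * ~0 sentinel.  Testing for 0 instead, as in the first attempt,
 * would wrongly accept any length for fixed zero-payload commands. */
static int cxl_input_len_ok(const struct cxl_cmd *cmd, uint32_t len)
{
    return len == cmd->in || cmd->in == (uint32_t)~0;
}
```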
- - -diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -index fdda9529fe..e8565fbd6e 100644 ---- a/hw/cxl/cxl-mailbox-utils.c -+++ b/hw/cxl/cxl-mailbox-utils.c -@@ -489,7 +489,7 @@ static struct cxl_cmd cxl_cmd_set[256][256] = { - cmd_identify_memory_device, 0, 0 }, - [CCLS][GET_PARTITION_INFO] = { "CCLS_GET_PARTITION_INFO", - cmd_ccls_get_partition_info, 0, 0 }, -- [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 0, 0 }, -+ [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 }, - [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa, - ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE }, - [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST", -@@ -510,12 +510,13 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) - cxl_cmd = &cxl_cmd_set[set][cmd]; - h = cxl_cmd->handler; - if (h) { -- if (len == cxl_cmd->in || !cxl_cmd->in) { -+ if (len == cxl_cmd->in || cxl_cmd->in == ~0) { - cxl_cmd->payload = cxl_dstate->mbox_reg_state + - A_CXL_DEV_CMD_PAYLOAD; - -And woot, we get a namespace in the LSA :) - -I'll post QEMU fixes in next day or two. Kernel side now seems more or less -fine be it with suspicious refcount underflow. - -> -> -With that fixed we hit new fun paths - after some errors we get the -> -worrying - not totally sure but looks like a failure on an error cleanup. -> -I'll chase down the error source, but even then this is probably triggerable -> -by -> -hardware problem or similar. Some bonus prints in here from me chasing -> -error paths, but it's otherwise just cxl/next + the fix I posted earlier -> -today. -> -> -[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) -> -[ 69.920108] nd_region_probe -> -[ 69.920623] ------------[ cut here ]------------ -> -[ 69.920675] refcount_t: addition on 0; use-after-free. 
-> -[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 -> -refcount_warn_saturate+0xa0/0x144 -> -[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi -> -cxl_core -> -[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 -> -[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 -> -[ 69.931482] Workqueue: events_unbound async_run_entry_fn -> -[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) -> -[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 -> -[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 -> -[ 69.936541] sp : ffff80000890b960 -> -[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: -> -0000000000000000 -> -[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: -> -0000000000000000 -> -[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: -> -ffff0000c5254800 -> -[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: -> -ffffffffffffffff -> -[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: -> -0000000000000000 -> -[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: -> -657466612d657375 -> -[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : -> -ffffa54a8f63d288 -> -[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : -> -00000000fffff31e -> -[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : -> -ffff5ab66e5ef000 -> -root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : -> -0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 -> -[ 69.957098] Call trace: -> -[ 69.957959] refcount_warn_saturate+0xa0/0x144 -> -[ 69.958773] get_ndd+0x5c/0x80 -> -[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 -> -[ 69.960253] nd_region_probe+0x100/0x290 -> -[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 -> -[ 69.962087] really_probe+0x19c/0x3f0 -> -[ 69.962620] __driver_probe_device+0x11c/0x190 -> -[ 69.963258] driver_probe_device+0x44/0xf4 -> -[ 69.963773] 
__device_attach_driver+0xa4/0x140 -> -[ 69.964471] bus_for_each_drv+0x84/0xe0 -> -[ 69.965068] __device_attach+0xb0/0x1f0 -> -[ 69.966101] device_initial_probe+0x20/0x30 -> -[ 69.967142] bus_probe_device+0xa4/0xb0 -> -[ 69.968104] device_add+0x3e8/0x910 -> -[ 69.969111] nd_async_device_register+0x24/0x74 -> -[ 69.969928] async_run_entry_fn+0x40/0x150 -> -[ 69.970725] process_one_work+0x1dc/0x450 -> -[ 69.971796] worker_thread+0x154/0x450 -> -[ 69.972700] kthread+0x118/0x120 -> -[ 69.974141] ret_from_fork+0x10/0x20 -> -[ 69.975141] ---[ end trace 0000000000000000 ]--- -> -[ 70.117887] Into nd_namespace_pmem_set_resource() -> -> -> cxl_cmd->payload = cxl_dstate->mbox_reg_state + -> -> A_CXL_DEV_CMD_PAYLOAD; -> -> ret = (*h)(cxl_cmd, cxl_dstate, &len); -> -> -> -> -> -> This lets the nvdimm/region probe fine, but I'm getting some issues with -> -> namespace capacity so I'll look at what is causing that next. -> -> Unfortunately I'm not that familiar with the driver/nvdimm side of things -> -> so it's take a while to figure out what kicks off what! -> -> -> -> Jonathan -> -> -> -> > -> -> > Jonathan -> -> > -> -> > -> -> > > -> -> > > > -> -> > > > And x4 region still failed with same errors, using latest cxl/preview -> -> > > > branch don't work. -> -> > > > I have picked "Two CXL emulation fixes" patches in qemu, still not -> -> > > > working. -> -> > > > -> -> > > > Bob -> -> > -> -> > -> -> -> - -Jonathan Cameron wrote: -> -On Fri, 12 Aug 2022 16:44:03 +0100 -> -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> On Thu, 11 Aug 2022 18:08:57 +0100 -> -> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> -> > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > -> -> > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > -> -> > > > Hi Jonathan -> -> > > > -> -> > > > Thanks for your reply! 
-> -> > > > -> -> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > -> -> > > > > Probably not related to your problem, but there is a disconnect in -> -> > > > > QEMU / -> -> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB -> -> > > > > only -> -> > > > > has a single root port. Spec allows it to be provided or not as an -> -> > > > > implementation choice. -> -> > > > > Kernel assumes it isn't provide. Qemu assumes it is. -> -> > > > > -> -> > > > > The temporary solution is to throw in a second root port on the HB -> -> > > > > and not -> -> > > > > connect anything to it. Longer term I may special case this so -> -> > > > > that the particular -> -> > > > > decoder defaults to pass through settings in QEMU if there is only -> -> > > > > one root port. -> -> > > > > -> -> > > > -> -> > > > You are right! After adding an extra HB in qemu, I can create a x1 -> -> > > > region successfully. -> -> > > > But have some errors in Nvdimm: -> -> > > > -> -> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, -> -> > > > assuming node 0 -> -> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe -> -> > > > -> -> > > -> -> > > Ah. I've seen this one, but not chased it down yet. Was on my todo -> -> > > list to chase -> -> > > down. Once I reach this state I can verify the HDM Decode is correct -> -> > > which is what -> -> > > I've been using to test (Which wasn't true until earlier this week). -> -> > > I'm currently testing via devmem, more for historical reasons than -> -> > > because it makes -> -> > > that much sense anymore. -> -> > -> -> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. -> -> > I'd forgotten that was still on the todo list. 
I don't think it will -> -> > be particularly hard to do and will take a look in next few days. -> -> > -> -> > Very very indirectly this error is causing a driver probe fail that means -> -> > that -> -> > we hit a code path that has a rather odd looking check on NDD_LABELING. -> -> > Should not have gotten near that path though - hence the problem is -> -> > actually -> -> > when we call cxl_pmem_get_config_data() and it returns an error because -> -> > we haven't fully connected up the command in QEMU. -> -> -> -> So a least one bug in QEMU. We were not supporting variable length payloads -> -> on mailbox -> -> inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> writes. -> -> We just need to relax condition on the supplied length. -> -> -> -> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> index c352a935c4..fdda9529fe 100644 -> -> --- a/hw/cxl/cxl-mailbox-utils.c -> -> +++ b/hw/cxl/cxl-mailbox-utils.c -> -> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> h = cxl_cmd->handler; -> -> if (h) { -> -> - if (len == cxl_cmd->in) { -> -> + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. -> -> -With that fixed we hit new fun paths - after some errors we get the -> -worrying - not totally sure but looks like a failure on an error cleanup. -> -I'll chase down the error source, but even then this is probably triggerable -> -by -> -hardware problem or similar. Some bonus prints in here from me chasing -> -error paths, but it's otherwise just cxl/next + the fix I posted earlier -> -today. -One of the scenarios that I cannot rule out is nvdimm_probe() racing -nd_region_probe(), but given all the work it takes to create a region I -suspect all the nvdimm_probe() work to have completed... - -It is at least one potentially wrong hypothesis that needs to be chased -down. 
- -> -> -[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) -> -[ 69.920108] nd_region_probe -> -[ 69.920623] ------------[ cut here ]------------ -> -[ 69.920675] refcount_t: addition on 0; use-after-free. -> -[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 -> -refcount_warn_saturate+0xa0/0x144 -> -[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi -> -cxl_core -> -[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 -> -[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 -> -[ 69.931482] Workqueue: events_unbound async_run_entry_fn -> -[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) -> -[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 -> -[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 -> -[ 69.936541] sp : ffff80000890b960 -> -[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: -> -0000000000000000 -> -[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: -> -0000000000000000 -> -[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: -> -ffff0000c5254800 -> -[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: -> -ffffffffffffffff -> -[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: -> -0000000000000000 -> -[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: -> -657466612d657375 -> -[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : -> -ffffa54a8f63d288 -> -[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : -> -00000000fffff31e -> -[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : -> -ffff5ab66e5ef000 -> -root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : -> -0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 -> -[ 69.957098] Call trace: -> -[ 69.957959] refcount_warn_saturate+0xa0/0x144 -> -[ 69.958773] get_ndd+0x5c/0x80 -> -[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 -> -[ 69.960253] 
nd_region_probe+0x100/0x290 -> -[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 -> -[ 69.962087] really_probe+0x19c/0x3f0 -> -[ 69.962620] __driver_probe_device+0x11c/0x190 -> -[ 69.963258] driver_probe_device+0x44/0xf4 -> -[ 69.963773] __device_attach_driver+0xa4/0x140 -> -[ 69.964471] bus_for_each_drv+0x84/0xe0 -> -[ 69.965068] __device_attach+0xb0/0x1f0 -> -[ 69.966101] device_initial_probe+0x20/0x30 -> -[ 69.967142] bus_probe_device+0xa4/0xb0 -> -[ 69.968104] device_add+0x3e8/0x910 -> -[ 69.969111] nd_async_device_register+0x24/0x74 -> -[ 69.969928] async_run_entry_fn+0x40/0x150 -> -[ 69.970725] process_one_work+0x1dc/0x450 -> -[ 69.971796] worker_thread+0x154/0x450 -> -[ 69.972700] kthread+0x118/0x120 -> -[ 69.974141] ret_from_fork+0x10/0x20 -> -[ 69.975141] ---[ end trace 0000000000000000 ]--- -> -[ 70.117887] Into nd_namespace_pmem_set_resource() - -On Mon, 15 Aug 2022 15:55:15 -0700 -Dan Williams <dan.j.williams@intel.com> wrote: - -> -Jonathan Cameron wrote: -> -> On Fri, 12 Aug 2022 16:44:03 +0100 -> -> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> -> > On Thu, 11 Aug 2022 18:08:57 +0100 -> -> > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> > -> -> > > On Tue, 9 Aug 2022 17:08:25 +0100 -> -> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> > > -> -> > > > On Tue, 9 Aug 2022 21:07:06 +0800 -> -> > > > Bobo WL <lmw.bobo@gmail.com> wrote: -> -> > > > -> -> > > > > Hi Jonathan -> -> > > > > -> -> > > > > Thanks for your reply! -> -> > > > > -> -> > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron -> -> > > > > <Jonathan.Cameron@huawei.com> wrote: -> -> > > > > > -> -> > > > > > Probably not related to your problem, but there is a disconnect -> -> > > > > > in QEMU / -> -> > > > > > kernel assumptionsaround the presence of an HDM decoder when a HB -> -> > > > > > only -> -> > > > > > has a single root port. Spec allows it to be provided or not as -> -> > > > > > an implementation choice. 
-> 
-> > > > > > Kernel assumes it isn't provided. Qemu assumes it is. 
-> 
-> > > > > > 
-> 
-> > > > > > The temporary solution is to throw in a second root port on the 
-> 
-> > > > > > HB and not 
-> 
-> > > > > > connect anything to it. Longer term I may special case this so 
-> 
-> > > > > > that the particular 
-> 
-> > > > > > decoder defaults to pass through settings in QEMU if there is 
-> 
-> > > > > > only one root port. 
-> 
-> > > > > > 
-> 
-> > > > > 
-> 
-> > > > > You are right! After adding an extra HB in qemu, I can create a x1 
-> 
-> > > > > region successfully. 
-> 
-> > > > > But have some errors in Nvdimm: 
-> 
-> > > > > 
-> 
-> > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, 
-> 
-> > > > > assuming node 0 
-> 
-> > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, 
-> 
-> > > > > assuming node 0 
-> 
-> > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe 
-> 
-> > > > > 
-> 
-> > > > 
-> 
-> > > > Ah. I've seen this one, but not chased it down yet. Was on my todo 
-> 
-> > > > list to chase 
-> 
-> > > > down. Once I reach this state I can verify the HDM Decode is correct 
-> 
-> > > > which is what 
-> 
-> > > > I've been using to test (Which wasn't true until earlier this week). 
-> 
-> > > > I'm currently testing via devmem, more for historical reasons than 
-> 
-> > > > because it makes 
-> 
-> > > > that much sense anymore. 
-> 
-> > > 
-> 
-> > > *embarrassed cough*. We haven't fully hooked the LSA up in qemu yet. 
-> 
-> > > I'd forgotten that was still on the todo list. I don't think it will 
-> 
-> > > be particularly hard to do and will take a look in next few days. 
-> 
-> > > 
-> 
-> > > Very very indirectly this error is causing a driver probe fail that 
-> 
-> > > means that 
-> 
-> > > we hit a code path that has a rather odd looking check on NDD_LABELING. 
-> -> > > Should not have gotten near that path though - hence the problem is -> -> > > actually -> -> > > when we call cxl_pmem_get_config_data() and it returns an error because -> -> > > we haven't fully connected up the command in QEMU. -> -> > -> -> > So a least one bug in QEMU. We were not supporting variable length -> -> > payloads on mailbox -> -> > inputs (but were on outputs). That hasn't mattered until we get to LSA -> -> > writes. -> -> > We just need to relax condition on the supplied length. -> -> > -> -> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c -> -> > index c352a935c4..fdda9529fe 100644 -> -> > --- a/hw/cxl/cxl-mailbox-utils.c -> -> > +++ b/hw/cxl/cxl-mailbox-utils.c -> -> > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) -> -> > cxl_cmd = &cxl_cmd_set[set][cmd]; -> -> > h = cxl_cmd->handler; -> -> > if (h) { -> -> > - if (len == cxl_cmd->in) { -> -> > + if (len == cxl_cmd->in || !cxl_cmd->in) { -> -> Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. -> -> -> -> With that fixed we hit new fun paths - after some errors we get the -> -> worrying - not totally sure but looks like a failure on an error cleanup. -> -> I'll chase down the error source, but even then this is probably -> -> triggerable by -> -> hardware problem or similar. Some bonus prints in here from me chasing -> -> error paths, but it's otherwise just cxl/next + the fix I posted earlier -> -> today. -> -> -One of the scenarios that I cannot rule out is nvdimm_probe() racing -> -nd_region_probe(), but given all the work it takes to create a region I -> -suspect all the nvdimm_probe() work to have completed... -> -> -It is at least one potentially wrong hypothesis that needs to be chased -> -down. -Maybe there should be a special award for the non-intuitive -ndctl create-namespace command (modifies existing namespace and might create -a different empty one...) 
I'm sure there is some interesting history behind 
-that one :) 
- 
-Upshot is I just threw a filesystem on fsdax and wrote some text files on it 
-to allow easy grepping. The right data ends up in the memory and a plausible 
-namespace description is stored in the LSA. 
- 
-So to some degree at least it's 'working' on an 8 way direct connected 
-set of emulated devices. 
- 
-One snag is that serial number support isn't yet upstream in QEMU. 
-(I have had it in my tree for a while but not posted it yet because of 
- QEMU feature freeze) 
-https://gitlab.com/jic23/qemu/-/commit/144c783ea8a5fbe169f46ea1ba92940157f42733 
-That's needed for meaningful cookie generation. Otherwise you can build the 
-namespace once, but it won't work on next probe as the cookie is 0 and you 
-hit some error paths. 
- 
-Maybe sensible to add a sanity check and fail namespace creation if 
-cookie is 0? (Silly side question, but is there a theoretical risk of 
-a serial number / other data combination leading to a fletcher64() 
-checksum that happens to be 0 - that would give a very odd bug report!) 
- 
-So to make it work the following is needed: 
- 
-1) The kernel fix for mailbox buffer overflow. 
-2) Qemu fix for size of arguments for get_lsa 
-3) Qemu fix to allow variable size input arguments (for set_lsa) 
-4) Serial number patch above + command lines to qemu to set appropriate 
- serial numbers. 
- 
-I'll send out the QEMU fixes shortly and post the Serial number patch, 
-though that almost certainly won't go in until next QEMU development 
-cycle starts in a few weeks. 
- 
-Next up, run through same tests on some other topologies. 
- 
-Jonathan 
- 
-> 
-> 
-> 
-> [ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) 
-> 
-> [ 69.920108] nd_region_probe 
-> 
-> [ 69.920623] ------------[ cut here ]------------ 
-> 
-> [ 69.920675] refcount_t: addition on 0; use-after-free. 
-> -> [ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 -> -> refcount_warn_saturate+0xa0/0x144 -> -> [ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port -> -> cxl_acpi cxl_core -> -> [ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ -> -> #399 -> -> [ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 -> -> 02/06/2015 -> -> [ 69.931482] Workqueue: events_unbound async_run_entry_fn -> -> [ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS -> -> BTYPE=--) -> -> [ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 -> -> [ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 -> -> [ 69.936541] sp : ffff80000890b960 -> -> [ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: -> -> 0000000000000000 -> -> [ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: -> -> 0000000000000000 -> -> [ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: -> -> ffff0000c5254800 -> -> [ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: -> -> ffffffffffffffff -> -> [ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: -> -> 0000000000000000 -> -> [ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: -> -> 657466612d657375 -> -> [ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : -> -> ffffa54a8f63d288 -> -> [ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : -> -> 00000000fffff31e -> -> [ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : -> -> ffff5ab66e5ef000 -> -> root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : -> -> 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 -> -> [ 69.957098] Call trace: -> -> [ 69.957959] refcount_warn_saturate+0xa0/0x144 -> -> [ 69.958773] get_ndd+0x5c/0x80 -> -> [ 69.959294] nd_region_register_namespaces+0xe4/0xe90 -> -> [ 69.960253] nd_region_probe+0x100/0x290 -> -> [ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 -> -> [ 69.962087] really_probe+0x19c/0x3f0 -> -> [ 69.962620] 
__driver_probe_device+0x11c/0x190 -> -> [ 69.963258] driver_probe_device+0x44/0xf4 -> -> [ 69.963773] __device_attach_driver+0xa4/0x140 -> -> [ 69.964471] bus_for_each_drv+0x84/0xe0 -> -> [ 69.965068] __device_attach+0xb0/0x1f0 -> -> [ 69.966101] device_initial_probe+0x20/0x30 -> -> [ 69.967142] bus_probe_device+0xa4/0xb0 -> -> [ 69.968104] device_add+0x3e8/0x910 -> -> [ 69.969111] nd_async_device_register+0x24/0x74 -> -> [ 69.969928] async_run_entry_fn+0x40/0x150 -> -> [ 69.970725] process_one_work+0x1dc/0x450 -> -> [ 69.971796] worker_thread+0x154/0x450 -> -> [ 69.972700] kthread+0x118/0x120 -> -> [ 69.974141] ret_from_fork+0x10/0x20 -> -> [ 69.975141] ---[ end trace 0000000000000000 ]--- -> -> [ 70.117887] Into nd_namespace_pmem_set_resource() - -Bobo WL wrote: -> -Hi list -> -> -I want to test cxl functions in arm64, and found some problems I can't -> -figure out. -> -> -My test environment: -> -> -1. build latest bios from -https://github.com/tianocore/edk2.git -master -> -branch(cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2) -> -2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git -> -master branch(846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm -> -support patch: -> -https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/ -> -3. build Linux kernel from -> -https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git -preview -> -branch(65fc1c3d26b96002a5aa1f4012fae4dc98fd5683) -> -4. 
build latest ndctl tools from -https://github.com/pmem/ndctl -> -create_region branch(8558b394e449779e3a4f3ae90fae77ede0bca159) -> -> -And my qemu test commands: -> -sudo $QEMU_BIN -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 \ -> --cpu max -smp 8 -nographic -no-reboot \ -> --kernel $KERNEL -bios $BIOS_BIN \ -> --drive if=none,file=$ROOTFS,format=qcow2,id=hd \ -> --device virtio-blk-pci,drive=hd -append 'root=/dev/vda1 -> -nokaslr dyndbg="module cxl* +p"' \ -> --object memory-backend-ram,size=4G,id=mem0 \ -> --numa node,nodeid=0,cpus=0-7,memdev=mem0 \ -> --net nic -net user,hostfwd=tcp::2222-:22 -enable-kvm \ -> --object -> -memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M -> -\ -> --object -> -memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M -> -\ -> --device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -> --device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -> --device cxl-upstream,bus=root_port0,id=us0 \ -> --device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -> --device -> -cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \ -> --device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -> --device -> -cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \ -> --device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -> --device -> 
-cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \ -> --device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -> --device -> -cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \ -> --M -> -cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k -> -> -And I have got two problems. -> -1. When I want to create x1 region with command: "cxl create-region -d -> -decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer -> -reference. Crash log: -> -> -[ 534.697324] cxl_region region0: config state: 0 -> -[ 534.697346] cxl_region region0: probe: -6 -> -[ 534.697368] cxl_acpi ACPI0017:00: decoder0.0: created region0 -> -[ 534.699115] cxl region0: mem0:endpoint3 decoder3.0 add: -> -mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 -> -[ 534.699149] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -> -mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 -> -[ 534.699167] cxl region0: ACPI0016:00:port1 decoder1.0 add: -> -mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 -> -[ 534.699176] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 -> -[ 534.699182] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 -> -for mem0:decoder3.0 @ 0 -> -[ 534.699189] cxl region0: 0000:0d:00.0:port2 iw: 1 ig: 256 -> -[ 534.699193] cxl region0: 0000:0d:00.0:port2 target[0] = -> -0000:0e:00.0 for mem0:decoder3.0 @ 0 -> -[ 534.699405] Unable to handle kernel NULL pointer dereference at -> -virtual address 0000000000000000 -> -[ 534.701474] Mem abort info: -> -[ 534.701994] ESR = 0x0000000086000004 -> -[ 534.702653] EC = 0x21: IABT (current EL), IL = 32 bits -> -[ 534.703616] SET = 0, FnV = 0 -> -[ 534.704174] EA = 0, S1PTW = 0 -> -[ 534.704803] FSC = 0x04: level 0 translation fault -> -[ 534.705694] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010144a000 -> -[ 534.706875] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 -> -[ 534.709855] Internal error: Oops: 86000004 [#1] PREEMPT SMP -> -[ 
534.710301] Modules linked in: -> -[ 534.710546] CPU: 7 PID: 331 Comm: cxl Not tainted -> -5.19.0-rc3-00064-g65fc1c3d26b9-dirty #11 -> -[ 534.715393] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 -> -[ 534.717179] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) -> -[ 534.719190] pc : 0x0 -> -[ 534.719928] lr : commit_store+0x118/0x2cc -> -[ 534.721007] sp : ffff80000aec3c30 -> -[ 534.721793] x29: ffff80000aec3c30 x28: ffff0000da62e740 x27: -> -ffff0000c0c06b30 -> -[ 534.723875] x26: 0000000000000000 x25: ffff0000c0a2a400 x24: -> -ffff0000c0a29400 -> -[ 534.725440] x23: 0000000000000003 x22: 0000000000000000 x21: -> -ffff0000c0c06800 -> -[ 534.727312] x20: 0000000000000000 x19: ffff0000c1559800 x18: -> -0000000000000000 -> -[ 534.729138] x17: 0000000000000000 x16: 0000000000000000 x15: -> -0000ffffd41fe838 -> -[ 534.731046] x14: 0000000000000000 x13: 0000000000000000 x12: -> -0000000000000000 -> -[ 534.732402] x11: 0000000000000000 x10: 0000000000000000 x9 : -> -0000000000000000 -> -[ 534.734432] x8 : 0000000000000000 x7 : 0000000000000000 x6 : -> -ffff0000c0906e80 -> -[ 534.735921] x5 : 0000000000000000 x4 : 0000000000000000 x3 : -> -ffff80000aec3bf0 -> -[ 534.737437] x2 : 0000000000000000 x1 : 0000000000000000 x0 : -> -ffff0000c155a000 -> -[ 534.738878] Call trace: -> -[ 534.739368] 0x0 -> -[ 534.739713] dev_attr_store+0x1c/0x30 -> -[ 534.740186] sysfs_kf_write+0x48/0x58 -> -[ 534.740961] kernfs_fop_write_iter+0x128/0x184 -> -[ 534.741872] new_sync_write+0xdc/0x158 -> -[ 534.742706] vfs_write+0x1ac/0x2a8 -> -[ 534.743440] ksys_write+0x68/0xf0 -> -[ 534.744328] __arm64_sys_write+0x1c/0x28 -> -[ 534.745180] invoke_syscall+0x44/0xf0 -> -[ 534.745989] el0_svc_common+0x4c/0xfc -> -[ 534.746661] do_el0_svc+0x60/0xa8 -> -[ 534.747378] el0_svc+0x2c/0x78 -> -[ 534.748066] el0t_64_sync_handler+0xb8/0x12c -> -[ 534.748919] el0t_64_sync+0x18c/0x190 -> -[ 534.749629] Code: bad PC value -> -[ 534.750169] ---[ end trace 0000000000000000 ]--- 
-What was the top kernel commit when you ran this test? What is the line -number of "commit_store+0x118"? - -> -2. When I want to create x4 region with command: "cxl create-region -d -> -decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". I got below errors: -> -> -cxl region: create_region: region0: failed to set target3 to mem3 -> -cxl region: cmd_create_region: created 0 regions -> -> -And kernel log as below: -> -[ 60.536663] cxl_region region0: config state: 0 -> -[ 60.536675] cxl_region region0: probe: -6 -> -[ 60.536696] cxl_acpi ACPI0017:00: decoder0.0: created region0 -> -[ 60.538251] cxl region0: mem0:endpoint3 decoder3.0 add: -> -mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 -> -[ 60.538278] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -> -mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 -> -[ 60.538295] cxl region0: ACPI0016:00:port1 decoder1.0 add: -> -mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 -> -[ 60.538647] cxl region0: mem1:endpoint4 decoder4.0 add: -> -mem1:decoder4.0 @ 1 next: none nr_eps: 1 nr_targets: 1 -> -[ 60.538663] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -> -mem1:decoder4.0 @ 1 next: mem1 nr_eps: 2 nr_targets: 2 -> -[ 60.538675] cxl region0: ACPI0016:00:port1 decoder1.0 add: -> -mem1:decoder4.0 @ 1 next: 0000:0d:00.0 nr_eps: 2 nr_targets: 1 -> -[ 60.539311] cxl region0: mem2:endpoint5 decoder5.0 add: -> -mem2:decoder5.0 @ 2 next: none nr_eps: 1 nr_targets: 1 -> -[ 60.539332] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -> -mem2:decoder5.0 @ 2 next: mem2 nr_eps: 3 nr_targets: 3 -> -[ 60.539343] cxl region0: ACPI0016:00:port1 decoder1.0 add: -> -mem2:decoder5.0 @ 2 next: 0000:0d:00.0 nr_eps: 3 nr_targets: 1 -> -[ 60.539711] cxl region0: mem3:endpoint6 decoder6.0 add: -> -mem3:decoder6.0 @ 3 next: none nr_eps: 1 nr_targets: 1 -> -[ 60.539723] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: -> -mem3:decoder6.0 @ 3 next: mem3 nr_eps: 4 nr_targets: 4 -> -[ 60.539735] cxl region0: ACPI0016:00:port1 
decoder1.0 add: -> -mem3:decoder6.0 @ 3 next: 0000:0d:00.0 nr_eps: 4 nr_targets: 1 -> -[ 60.539742] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 -> -[ 60.539747] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 -> -for mem0:decoder3.0 @ 0 -> -[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512 -> -[ 60.539758] cxl region0: 0000:0d:00.0:port2 target[0] = -> -0000:0e:00.0 for mem0:decoder3.0 @ 0 -> -[ 60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at -> -1 -> -> -I have tried to write sysfs node manually, got same errors. -> -> -Hope I can get some helps here. -What is the output of: - - cxl list -MDTu -d decoder0.0 - -...? It might be the case that mem1 cannot be mapped by decoder0.0, or -at least not in the specified order, or that validation check is broken. - -Hi Dan, - -Thanks for your reply! - -On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> wrote: -> -> -What is the output of: -> -> -cxl list -MDTu -d decoder0.0 -> -> -...? It might be the case that mem1 cannot be mapped by decoder0.0, or -> -at least not in the specified order, or that validation check is broken. 
-Command "cxl list -MDTu -d decoder0.0" output: - -[ - { - "memdevs":[ - { - "memdev":"mem2", - "pmem_size":"256.00 MiB (268.44 MB)", - "ram_size":0, - "serial":"0", - "host":"0000:11:00.0" - }, - { - "memdev":"mem1", - "pmem_size":"256.00 MiB (268.44 MB)", - "ram_size":0, - "serial":"0", - "host":"0000:10:00.0" - }, - { - "memdev":"mem0", - "pmem_size":"256.00 MiB (268.44 MB)", - "ram_size":0, - "serial":"0", - "host":"0000:0f:00.0" - }, - { - "memdev":"mem3", - "pmem_size":"256.00 MiB (268.44 MB)", - "ram_size":0, - "serial":"0", - "host":"0000:12:00.0" - } - ] - }, - { - "root decoders":[ - { - "decoder":"decoder0.0", - "resource":"0x10000000000", - "size":"4.00 GiB (4.29 GB)", - "pmem_capable":true, - "volatile_capable":true, - "accelmem_capable":true, - "nr_targets":1, - "targets":[ - { - "target":"ACPI0016:01", - "alias":"pci0000:0c", - "position":0, - "id":"0xc" - } - ] - } - ] - } -] - -Bobo WL wrote: -> -Hi Dan, -> -> -Thanks for your reply! -> -> -On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> wrote: -> -> -> -> What is the output of: -> -> -> -> cxl list -MDTu -d decoder0.0 -> -> -> -> ...? It might be the case that mem1 cannot be mapped by decoder0.0, or -> -> at least not in the specified order, or that validation check is broken. -> -> -Command "cxl list -MDTu -d decoder0.0" output: -Thanks for this, I think I know the problem, but will try some -experiments with cxl_test first. - -Did the commit_store() crash stop reproducing with latest cxl/preview -branch? - -On Tue, Aug 9, 2022 at 11:17 PM Dan Williams <dan.j.williams@intel.com> wrote: -> -> -Bobo WL wrote: -> -> Hi Dan, -> -> -> -> Thanks for your reply! -> -> -> -> On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> -> -> wrote: -> -> > -> -> > What is the output of: -> -> > -> -> > cxl list -MDTu -d decoder0.0 -> -> > -> -> > ...? 
It might be the case that mem1 cannot be mapped by decoder0.0, or -> -> > at least not in the specified order, or that validation check is broken. -> -> -> -> Command "cxl list -MDTu -d decoder0.0" output: -> -> -Thanks for this, I think I know the problem, but will try some -> -experiments with cxl_test first. -> -> -Did the commit_store() crash stop reproducing with latest cxl/preview -> -branch? -No, still hitting this bug if don't add extra HB device in qemu - -Dan Williams wrote: -> -Bobo WL wrote: -> -> Hi Dan, -> -> -> -> Thanks for your reply! -> -> -> -> On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> -> -> wrote: -> -> > -> -> > What is the output of: -> -> > -> -> > cxl list -MDTu -d decoder0.0 -> -> > -> -> > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or -> -> > at least not in the specified order, or that validation check is broken. -> -> -> -> Command "cxl list -MDTu -d decoder0.0" output: -> -> -Thanks for this, I think I know the problem, but will try some -> -experiments with cxl_test first. -Hmm, so my cxl_test experiment unfortunately passed so I'm not -reproducing the failure mode. This is the result of creating x4 region -with devices directly attached to a single host-bridge: - -# cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s $((1<<30)) -{ - "region":"region8", - "resource":"0xf1f0000000", - "size":"1024.00 MiB (1073.74 MB)", - "interleave_ways":4, - "interleave_granularity":256, - "decode_state":"commit", - "mappings":[ - { - "position":3, - "memdev":"mem11", - "decoder":"decoder21.0" - }, - { - "position":2, - "memdev":"mem9", - "decoder":"decoder19.0" - }, - { - "position":1, - "memdev":"mem10", - "decoder":"decoder20.0" - }, - { - "position":0, - "memdev":"mem12", - "decoder":"decoder22.0" - } - ] -} -cxl region: cmd_create_region: created 1 region - -> -Did the commit_store() crash stop reproducing with latest cxl/preview -> -branch? 
-I missed the answer to this question. - -All of these changes are now in Linus' tree perhaps give that a try and -post the debug log again? - -On Thu, 11 Aug 2022 17:46:55 -0700 -Dan Williams <dan.j.williams@intel.com> wrote: - -> -Dan Williams wrote: -> -> Bobo WL wrote: -> -> > Hi Dan, -> -> > -> -> > Thanks for your reply! -> -> > -> -> > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> -> -> > wrote: -> -> > > -> -> > > What is the output of: -> -> > > -> -> > > cxl list -MDTu -d decoder0.0 -> -> > > -> -> > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or -> -> > > at least not in the specified order, or that validation check is -> -> > > broken. -> -> > -> -> > Command "cxl list -MDTu -d decoder0.0" output: -> -> -> -> Thanks for this, I think I know the problem, but will try some -> -> experiments with cxl_test first. -> -> -Hmm, so my cxl_test experiment unfortunately passed so I'm not -> -reproducing the failure mode. This is the result of creating x4 region -> -with devices directly attached to a single host-bridge: -> -> -# cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s $((1<<30)) -> -{ -> -"region":"region8", -> -"resource":"0xf1f0000000", -> -"size":"1024.00 MiB (1073.74 MB)", -> -"interleave_ways":4, -> -"interleave_granularity":256, -> -"decode_state":"commit", -> -"mappings":[ -> -{ -> -"position":3, -> -"memdev":"mem11", -> -"decoder":"decoder21.0" -> -}, -> -{ -> -"position":2, -> -"memdev":"mem9", -> -"decoder":"decoder19.0" -> -}, -> -{ -> -"position":1, -> -"memdev":"mem10", -> -"decoder":"decoder20.0" -> -}, -> -{ -> -"position":0, -> -"memdev":"mem12", -> -"decoder":"decoder22.0" -> -} -> -] -> -} -> -cxl region: cmd_create_region: created 1 region -> -> -> Did the commit_store() crash stop reproducing with latest cxl/preview -> -> branch? -> -> -I missed the answer to this question. 
-> 
-> -All of these changes are now in Linus' tree perhaps give that a try and 
-> -post the debug log again? 
-Hi Dan, 
- 
-I've moved on to looking at this one. 
-1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy that 
-up 
-at some stage), 1 switch, 4 downstream switch ports each with a type 3 
- 
-I'm not getting a crash, but can't successfully set up a region. 
-Upon adding the final target 
-It's failing in check_last_peer() as pos < distance. 
-Seems distance is 4 which makes me think it's using the wrong level of the 
-hierarchy for 
-some reason or that distance check is wrong. 
-Wasn't a good idea to just skip that step though as it goes boom - though 
-stack trace is not useful. 
- 
-Jonathan 
- 
-On Wed, 17 Aug 2022 17:16:19 +0100 
-Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: 
- 
-> 
-On Thu, 11 Aug 2022 17:46:55 -0700 
-> 
-Dan Williams <dan.j.williams@intel.com> wrote: 
-> 
-> 
-> Dan Williams wrote: 
-> 
-> > Bobo WL wrote: 
-> 
-> > > Hi Dan, 
-> 
-> > > 
-> 
-> > > Thanks for your reply! 
-> 
-> > > 
-> 
-> > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> 
-> 
-> > > wrote: 
-> 
-> > > > 
-> 
-> > > > What is the output of: 
-> 
-> > > > 
-> 
-> > > > cxl list -MDTu -d decoder0.0 
-> 
-> > > > 
-> 
-> > > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or 
-> 
-> > > > at least not in the specified order, or that validation check is 
-> 
-> > > > broken. 
-> 
-> > > 
-> 
-> > > Command "cxl list -MDTu -d decoder0.0" output: 
-> 
-> > 
-> 
-> > Thanks for this, I think I know the problem, but will try some 
-> 
-> > experiments with cxl_test first. 
-> 
-> 
-> 
-> Hmm, so my cxl_test experiment unfortunately passed so I'm not 
-> 
-> reproducing the failure mode. 
This is the result of creating x4 region -> -> with devices directly attached to a single host-bridge: -> -> -> -> # cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s -> -> $((1<<30)) -> -> { -> -> "region":"region8", -> -> "resource":"0xf1f0000000", -> -> "size":"1024.00 MiB (1073.74 MB)", -> -> "interleave_ways":4, -> -> "interleave_granularity":256, -> -> "decode_state":"commit", -> -> "mappings":[ -> -> { -> -> "position":3, -> -> "memdev":"mem11", -> -> "decoder":"decoder21.0" -> -> }, -> -> { -> -> "position":2, -> -> "memdev":"mem9", -> -> "decoder":"decoder19.0" -> -> }, -> -> { -> -> "position":1, -> -> "memdev":"mem10", -> -> "decoder":"decoder20.0" -> -> }, -> -> { -> -> "position":0, -> -> "memdev":"mem12", -> -> "decoder":"decoder22.0" -> -> } -> -> ] -> -> } -> -> cxl region: cmd_create_region: created 1 region -> -> -> -> > Did the commit_store() crash stop reproducing with latest cxl/preview -> -> > branch? -> -> -> -> I missed the answer to this question. -> -> -> -> All of these changes are now in Linus' tree perhaps give that a try and -> -> post the debug log again? -> -> -Hi Dan, -> -> -I've moved onto looking at this one. -> -1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy -> -that up -> -at some stage), 1 switch, 4 downstream switch ports each with a type 3 -> -> -I'm not getting a crash, but can't successfully setup a region. -> -Upon adding the final target -> -It's failing in check_last_peer() as pos < distance. -> -Seems distance is 4 which makes me think it's using the wrong level of the -> -heirarchy for -> -some reason or that distance check is wrong. -> -Wasn't a good idea to just skip that step though as it goes boom - though -> -stack trace is not useful. -Turns out really weird corruption happens if you accidentally back two type3 -devices -with the same memory device. 
Who would have thought it :) - -That aside ignoring the check_last_peer() failure seems to make everything work -for this -topology. I'm not seeing the crash, so my guess is we fixed it somewhere along -the way. - -Now for the fun one. I've replicated the crash if we have - -1HB 1*RP 1SW, 4SW-DSP, 4Type3 - -Now, I'd expect to see it not 'work' because the QEMU HDM decoder won't be -programmed -but the null pointer dereference isn't related to that. - -The bug is straight forward. Not all decoders have commit callbacks... Will -send out -a possible fix shortly. - -Jonathan - - - -> -> -Jonathan -> -> -> -> -> -> - -On Thu, 18 Aug 2022 17:37:40 +0100 -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: - -> -On Wed, 17 Aug 2022 17:16:19 +0100 -> -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> On Thu, 11 Aug 2022 17:46:55 -0700 -> -> Dan Williams <dan.j.williams@intel.com> wrote: -> -> -> -> > Dan Williams wrote: -> -> > > Bobo WL wrote: -> -> > > > Hi Dan, -> -> > > > -> -> > > > Thanks for your reply! -> -> > > > -> -> > > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams -> -> > > > <dan.j.williams@intel.com> wrote: -> -> > > > > -> -> > > > > What is the output of: -> -> > > > > -> -> > > > > cxl list -MDTu -d decoder0.0 -> -> > > > > -> -> > > > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, -> -> > > > > or -> -> > > > > at least not in the specified order, or that validation check is -> -> > > > > broken. -> -> > > > -> -> > > > Command "cxl list -MDTu -d decoder0.0" output: -> -> > > -> -> > > Thanks for this, I think I know the problem, but will try some -> -> > > experiments with cxl_test first. -> -> > -> -> > Hmm, so my cxl_test experiment unfortunately passed so I'm not -> -> > reproducing the failure mode. 
This is the result of creating x4 region -> -> > with devices directly attached to a single host-bridge: -> -> > -> -> > # cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s -> -> > $((1<<30)) -> -> > { -> -> > "region":"region8", -> -> > "resource":"0xf1f0000000", -> -> > "size":"1024.00 MiB (1073.74 MB)", -> -> > "interleave_ways":4, -> -> > "interleave_granularity":256, -> -> > "decode_state":"commit", -> -> > "mappings":[ -> -> > { -> -> > "position":3, -> -> > "memdev":"mem11", -> -> > "decoder":"decoder21.0" -> -> > }, -> -> > { -> -> > "position":2, -> -> > "memdev":"mem9", -> -> > "decoder":"decoder19.0" -> -> > }, -> -> > { -> -> > "position":1, -> -> > "memdev":"mem10", -> -> > "decoder":"decoder20.0" -> -> > }, -> -> > { -> -> > "position":0, -> -> > "memdev":"mem12", -> -> > "decoder":"decoder22.0" -> -> > } -> -> > ] -> -> > } -> -> > cxl region: cmd_create_region: created 1 region -> -> > -> -> > > Did the commit_store() crash stop reproducing with latest cxl/preview -> -> > > branch? -> -> > -> -> > I missed the answer to this question. -> -> > -> -> > All of these changes are now in Linus' tree perhaps give that a try and -> -> > post the debug log again? -> -> -> -> Hi Dan, -> -> -> -> I've moved onto looking at this one. -> -> 1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy -> -> that up -> -> at some stage), 1 switch, 4 downstream switch ports each with a type 3 -> -> -> -> I'm not getting a crash, but can't successfully setup a region. -> -> Upon adding the final target -> -> It's failing in check_last_peer() as pos < distance. -> -> Seems distance is 4 which makes me think it's using the wrong level of the -> -> heirarchy for -> -> some reason or that distance check is wrong. -> -> Wasn't a good idea to just skip that step though as it goes boom - though -> -> stack trace is not useful. 
-> 
-> -Turns out really weird corruption happens if you accidentally back two type3 
-> -devices 
-> -with the same memory device. Who would have thought it :) 
-> 
-> -That aside ignoring the check_last_peer() failure seems to make everything 
-> -work for this 
-> -topology. I'm not seeing the crash, so my guess is we fixed it somewhere 
-> -along the way. 
-> 
-> -Now for the fun one. I've replicated the crash if we have 
-> 
-> -1HB 1*RP 1SW, 4SW-DSP, 4Type3 
-> 
-> -Now, I'd expect to see it not 'work' because the QEMU HDM decoder won't be 
-> -programmed 
-> -but the null pointer dereference isn't related to that. 
-> 
-> -The bug is straight forward. Not all decoders have commit callbacks... Will 
-> -send out 
-> -a possible fix shortly. 
-> 
-For completeness I'm carrying this hack because I haven't gotten my head 
-around the right fix for check_last_peer() failing on this test topology. 
- 
-diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c 
-index c49d9a5f1091..275e143bd748 100644 
---- a/drivers/cxl/core/region.c 
-+++ b/drivers/cxl/core/region.c 
-@@ -978,7 +978,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, 
- rc = check_last_peer(cxled, ep, cxl_rr, 
- distance); 
- if (rc) 
-- return rc; 
-+ // return rc; 
- goto out_target_set; 
- } 
- goto add_target; 
--- 
- 
-I might find more bugs with more testing, but this is all the ones I've 
-seen so far + in Bobo's reports. Qemu fixes are now in upstream so 
-will be there in the release. 
- 
-As a reminder, testing on QEMU has a few corners... 
- 
-Need a patch to add serial number ECAP support. It is on list for review, 
-but will have to wait for after QEMU 7.1 release (which may be next week) 
- 
-QEMU still assumes HDM decoder on the host bridge will be programmed. 
-So if you want anything to work there should be at least 
-2 RP below the HB (no need to plug anything in to one of them). 
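As a sketch of that workaround, a hypothetical command-line fragment (device IDs, bus names and slot numbers follow the reproducer earlier in this thread and are illustrative only) that adds a second, unpopulated root port below the host bridge so the kernel programs the HB HDM decoder:

```shell
# Illustrative only: a second root port (root_port1) is created under the
# pxb-cxl host bridge and left empty; only root_port0 carries a device.
qemu-system-aarch64 ... \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
  -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
  -device cxl-type3,bus=root_port0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0
```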
- -I don't want to add a commandline parameter to hide the decoder in QEMU -and detecting there is only one RP would require moving a bunch of static -stuff into runtime code (I think). - -I still think we should make the kernel check to see if there is a decoder, -but if not I might see how bad a hack it is to have QEMU ignore that decoder -if not committed in this one special case (HB HDM decoder with only one place -it can send stuff). Obviously that would be a break from specification -so less than idea! - -Thanks, - -Jonathan - -On Fri, 19 Aug 2022 09:46:55 +0100 -Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: - -> -On Thu, 18 Aug 2022 17:37:40 +0100 -> -Jonathan Cameron via <qemu-devel@nongnu.org> wrote: -> -> -> On Wed, 17 Aug 2022 17:16:19 +0100 -> -> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: -> -> -> -> > On Thu, 11 Aug 2022 17:46:55 -0700 -> -> > Dan Williams <dan.j.williams@intel.com> wrote: -> -> > -> -> > > Dan Williams wrote: -> -> > > > Bobo WL wrote: -> -> > > > > Hi Dan, -> -> > > > > -> -> > > > > Thanks for your reply! -> -> > > > > -> -> > > > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams -> -> > > > > <dan.j.williams@intel.com> wrote: -> -> > > > > > -> -> > > > > > What is the output of: -> -> > > > > > -> -> > > > > > cxl list -MDTu -d decoder0.0 -> -> > > > > > -> -> > > > > > ...? It might be the case that mem1 cannot be mapped by -> -> > > > > > decoder0.0, or -> -> > > > > > at least not in the specified order, or that validation check is -> -> > > > > > broken. -> -> > > > > -> -> > > > > Command "cxl list -MDTu -d decoder0.0" output: -> -> > > > -> -> > > > Thanks for this, I think I know the problem, but will try some -> -> > > > experiments with cxl_test first. -> -> > > -> -> > > Hmm, so my cxl_test experiment unfortunately passed so I'm not -> -> > > reproducing the failure mode. 
-I'm still carrying this hack and still haven't worked out the right fix.
-
-Suggestions welcome! If not I'll hopefully get some time on this
-towards the end of the week.
- -Jonathan - diff --git a/results/classifier/02/instruction/50773216 b/results/classifier/02/instruction/50773216 deleted file mode 100644 index 4b9731cac..000000000 --- a/results/classifier/02/instruction/50773216 +++ /dev/null @@ -1,111 +0,0 @@ -instruction: 0.768 -other: 0.737 -semantic: 0.669 -mistranslation: 0.652 -boot: 0.637 - -[Qemu-devel] Can I have someone's feedback on [bug 1809075] Concurrency bug on keyboard events: capslock LED messing up keycode streams causes character misses at guest kernel - -Hi everyone. -Can I please have someone's feedback on this bug? -https://bugs.launchpad.net/qemu/+bug/1809075 -Briefly, guest OS loses characters sent to it via vnc. And I spot the -bug in relation to ps2 driver. -I'm thinking of possible fixes and I might want to use a memory barrier. -But I would really like to have some suggestion from a qemu developer -first. For example, can we brutally drop capslock LED key events in ps2 -queue? -It is actually relevant to openQA, an automated QA tool for openSUSE. -And this bug blocks a few test cases for us. -Thank you in advance! - -Kind regards, -Gao Zhiyuan - -Cc'ing Marc-André & Gerd. - -On 12/19/18 10:31 AM, Gao Zhiyuan wrote: -> -Hi everyone. -> -> -Can I please have someone's feedback on this bug? -> -https://bugs.launchpad.net/qemu/+bug/1809075 -> -Briefly, guest OS loses characters sent to it via vnc. And I spot the -> -bug in relation to ps2 driver. -> -> -I'm thinking of possible fixes and I might want to use a memory barrier. -> -But I would really like to have some suggestion from a qemu developer -> -first. For example, can we brutally drop capslock LED key events in ps2 -> -queue? -> -> -It is actually relevant to openQA, an automated QA tool for openSUSE. -> -And this bug blocks a few test cases for us. -> -> -Thank you in advance! -> -> -Kind regards, -> -Gao Zhiyuan -> - -On Thu, Jan 03, 2019 at 12:05:54PM +0100, Philippe Mathieu-Daudé wrote: -> -Cc'ing Marc-André & Gerd. 
-> -> -On 12/19/18 10:31 AM, Gao Zhiyuan wrote: -> -> Hi everyone. -> -> -> -> Can I please have someone's feedback on this bug? -> -> -https://bugs.launchpad.net/qemu/+bug/1809075 -> -> Briefly, guest OS loses characters sent to it via vnc. And I spot the -> -> bug in relation to ps2 driver. -> -> -> -> I'm thinking of possible fixes and I might want to use a memory barrier. -> -> But I would really like to have some suggestion from a qemu developer -> -> first. For example, can we brutally drop capslock LED key events in ps2 -> -> queue? -There is no "capslock LED key event". 0xfa is KBD_REPLY_ACK, and the -device queues it in response to guest port writes. Yes, the ack can -race with actual key events. But IMO that isn't a bug in qemu. - -Probably the linux kernel just throws away everything until it got the -ack for the port write, and that way the key event gets lost. On -physical hardware you will not notice because it is next to impossible -to type fast enough to hit the race window. - -So, go fix the kernel. - -Alternatively fix vncdotool to send uppercase letters properly with -shift key pressed. Then qemu wouldn't generate capslock key events -(that happens because qemu thinks guest and host capslock state is out -of sync) and the guests's capslock led update request wouldn't get into -the way. - -cheers, - Gerd - diff --git a/results/classifier/02/instruction/55961334 b/results/classifier/02/instruction/55961334 deleted file mode 100644 index 4764ef49f..000000000 --- a/results/classifier/02/instruction/55961334 +++ /dev/null @@ -1,40 +0,0 @@ -instruction: 0.803 -semantic: 0.775 -mistranslation: 0.718 -other: 0.715 -boot: 0.569 - -[Bug] "-ht" flag ignored under KVM - guest still reports HT - -Hi Community, -We have observed that the 'ht' feature bit cannot be disabled when QEMU runs -with KVM acceleration. 
-qemu-system-x86_64 \
- --enable-kvm \
- -machine q35 \
- -cpu host,-ht \
- -smp 4 \
- -m 4G \
- -drive file=rootfs.img,format=raw \
- -nographic \
- -append 'console=ttyS0 root=/dev/sda rw'
-Because '-ht' is specified, the guest should expose no HT capability
-(cpuid.1.edx[28] = 0), and /proc/cpuinfo shouldn't show the HT feature, but we still
-saw ht in the linux guest when running 'cat /proc/cpuinfo'.
-XiaoYao mentioned that:
-
-It has been the behavior of QEMU since
-
- commit 400281af34e5ee6aa9f5496b53d8f82c6fef9319
- Author: Andre Przywara <andre.przywara@amd.com>
- Date: Wed Aug 19 15:42:42 2009 +0200
-
- set CPUID bits to present cores and threads topology
-
-that we cannot remove the HT CPUID bit from the guest via "-cpu xxx,-ht" if the
-VM has >= 2 vcpus.
-I'd like to know whether there's a plan to address this issue, or if the current
-behaviour is considered acceptable.
-Best regards,
-Ewan.
-
diff --git a/results/classifier/02/instruction/62179944 b/results/classifier/02/instruction/62179944
deleted file mode 100644
index e4484befe..000000000
--- a/results/classifier/02/instruction/62179944
+++ /dev/null
@@ -1,32 +0,0 @@
-instruction: 0.693
-boot: 0.567
-mistranslation: 0.533
-other: 0.519
-semantic: 0.454
-
-[Qemu-devel] [BUG] network : windows os lost ip address of the network card in some cases
-
-we have seen this problem for a long time. For example, if we have three network
-cards in the virtual xml file, such as "network connection 1" / "network connection
-2" / "network connection 3".
-
-Each network card has its own ip address, such as 192.168.1.1 / 2.1 / 3.1. When
-we delete the first card and reboot the windows virtual os, this problem
-happens!
-
-
-
-
-we found that the second network card will replace the first one, then the
-ip address of "network connection 2" becomes 192.168.1.1.
-
-
-Our third party users began to complain about this bug. All the business of the
-second ip is lost!!!
-
-I mean both windows and linux have this bug. We solved it in linux
-through bonding the netcard's pci and mac address.
-
-There is no good solution on windows os, is there? We implemented a plan for
-resumption of the IP by QGA. Is there a better way?
-
diff --git a/results/classifier/02/instruction/63565653 b/results/classifier/02/instruction/63565653
deleted file mode 100644
index 33f9aaa0a..000000000
--- a/results/classifier/02/instruction/63565653
+++ /dev/null
@@ -1,50 +0,0 @@
-instruction: 0.905
-other: 0.898
-boot: 0.889
-semantic: 0.825
-mistranslation: 0.462
-
-[Qemu-devel] [BUG]pcibus_reset assertion failure on guest reboot
-
-Qemu-2.6.2
-
-Start a vm with vhost-net, do a reboot and hot-unplug the virtio-net nic in a short
-time, and we hit a
-pcibus_reset assertion failure.
-
-Here is qemu log:
-22:29:46.359386+08:00 acpi_pm1_cnt_write -> guest does soft power off
-22:29:46.785310+08:00 qemu_devices_reset
-22:29:46.788093+08:00 virtio_pci_device_unplugged -> virtio net unplugged
-22:29:46.803427+08:00 pcibus_reset: Assertion `bus->irq_count[i] == 0' failed.
-
-Here is stack info:
-(gdb) bt
-#0 0x00007f9a336795d7 in raise () from /usr/lib64/libc.so.6
-#1 0x00007f9a3367acc8 in abort () from /usr/lib64/libc.so.6
-#2 0x00007f9a33672546 in __assert_fail_base () from /usr/lib64/libc.so.6
-#3 0x00007f9a336725f2 in __assert_fail () from /usr/lib64/libc.so.6
-#4 0x0000000000641884 in pcibus_reset (qbus=0x29eee60) at hw/pci/pci.c:283
-#5 0x00000000005bfc30 in qbus_reset_one (bus=0x29eee60, opaque=<optimized
-out>) at hw/core/qdev.c:319
-#6 0x00000000005c1b19 in qdev_walk_children (dev=0x29ed2b0, pre_devfn=0x0,
-pre_busfn=0x0, post_devfn=0x5c2440 ...
-#7 0x00000000005c1c59 in qbus_walk_children (bus=0x2736f80, pre_devfn=0x0,
-pre_busfn=0x0, post_devfn=0x5c2440 ...
-#8 0x00000000005513f5 in qemu_devices_reset () at vl.c:1998 -#9 0x00000000004cab9d in pc_machine_reset () at -/home/abuild/rpmbuild/BUILD/qemu-kvm-2.6.0/hw/i386/pc.c:1976 -#10 0x000000000055148b in qemu_system_reset (address@hidden) at vl.c:2011 -#11 0x000000000055164f in main_loop_should_exit () at vl.c:2169 -#12 0x0000000000551719 in main_loop () at vl.c:2212 -#13 0x000000000041c9a8 in main (argc=<optimized out>, argv=<optimized out>, -envp=<optimized out>) at vl.c:5130 -(gdb) f 4 -... -(gdb) p bus->irq_count[0] -$6 = 1 - -Seems pci_update_irq_disabled doesn't work well - -can anyone help? - diff --git a/results/classifier/02/instruction/70868267 b/results/classifier/02/instruction/70868267 deleted file mode 100644 index 53f7739a3..000000000 --- a/results/classifier/02/instruction/70868267 +++ /dev/null @@ -1,41 +0,0 @@ -instruction: 0.778 -semantic: 0.635 -mistranslation: 0.537 -other: 0.236 -boot: 0.197 - -[Qemu-devel] [BUG] Failed to compile using gcc7.1 - -Hi all, - -After upgrading gcc from 6.3.1 to 7.1.1, qemu can't be compiled with gcc. 
-
-The error is:
-
------
-  CC      block/blkdebug.o
-block/blkdebug.c: In function 'blkdebug_refresh_filename':
-block/blkdebug.c:693:31: error: '%s' directive output may be truncated
-writing up to 4095 bytes into a region of size 4086
-[-Werror=format-truncation=]
-"blkdebug:%s:%s", s->config_file ?: "",
- ^~
-In file included from /usr/include/stdio.h:939:0,
- from /home/adam/qemu/include/qemu/osdep.h:68,
- from block/blkdebug.c:25:
-/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk'
-output 11 or more bytes (assuming 4106) into a destination of size 4096
-return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
- ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- __bos (__s), __fmt, __va_arg_pack ());
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-cc1: all warnings being treated as errors
-make: *** [/home/adam/qemu/rules.mak:69: block/blkdebug.o] Error 1
------
-
-It seems that gcc 7 is introducing stricter format checks for printf.
-If using clang, although there are some extra warnings, it can at least
-pass compilation.
-Thanks,
-Qu
-
diff --git a/results/classifier/02/instruction/73660729 b/results/classifier/02/instruction/73660729
deleted file mode 100644
index 3e44ff46c..000000000
--- a/results/classifier/02/instruction/73660729
+++ /dev/null
@@ -1,32 +0,0 @@
-instruction: 0.753
-semantic: 0.698
-mistranslation: 0.633
-other: 0.620
-boot: 0.367
-
-[BUG]The latest qemu crashed when I tested cxl
-
-I tested cxl with the patch: [v11,0/2] arm/virt: CXL support via pxb_cxl.
-https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/
-But qemu crashed, showing an error:
-qemu-system-aarch64: ../hw/arm/virt.c:1735: virt_get_high_memmap_enabled:
- Assertion `ARRAY_SIZE(extended_memmap) - VIRT_LOWMEMMAP_LAST == ARRAY_SIZE(enabled_array)' failed.
-Then I modify the patch to fix the bug: -diff --git a/hw/arm/virt.c b/hw/arm/virt.c -index ea2413a0ba..3d4cee3491 100644 ---- a/hw/arm/virt.c -+++ b/hw/arm/virt.c -@@ -1710,6 +1730,7 @@ static inline bool *virt_get_high_memmap_enabled(VirtMachineState - *vms, -&vms->highmem_redists, -&vms->highmem_ecam, -&vms->highmem_mmio, -+ &vms->cxl_devices_state.is_enabled, -}; -Now qemu works good. -Could you tell me when the patch( -arm/virt: - CXL support via pxb_cxl -) will be merged into upstream? - diff --git a/results/classifier/02/mistranslation/14887122 b/results/classifier/02/mistranslation/14887122 deleted file mode 100644 index 215d2cc8a..000000000 --- a/results/classifier/02/mistranslation/14887122 +++ /dev/null @@ -1,259 +0,0 @@ -mistranslation: 0.930 -semantic: 0.928 -instruction: 0.905 -other: 0.890 -boot: 0.831 - -[BUG][RFC] CPR transfer Issues: Socket permissions and PID files - -Hello, - -While testing CPR transfer I encountered two issues. The first is that the -transfer fails when running with pidfiles due to the destination qemu process -attempting to create the pidfile while it is still locked by the source -process. The second is that the transfer fails when running with the -run-with -user=$USERID parameter. This is because the destination qemu process creates -the UNIX sockets used for the CPR transfer before dropping to the lower -permissioned user, which causes them to be owned by the original user. The -source qemu process then does not have permission to connect to it because it -is already running as the lesser permissioned user. - -Reproducing the first issue: - -Create a source and destination qemu instance associated with the same VM where -both processes have the -pidfile parameter passed on the command line. 
You -should see the following error on the command line of the second process: - -qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource -temporarily unavailable - -Reproducing the second issue: - -Create a source and destination qemu instance associated with the same VM where -both processes have -run-with user=$USERID passed on the command line, where -$USERID is a different user from the one launching the processes. Then attempt -a CPR transfer using UNIX sockets for the main and cpr sockets. You should -receive the following error via QMP: -{"error": {"class": "GenericError", "desc": "Failed to connect to 'cpr.sock': -Permission denied"}} - -I provided a minimal patch that works around the second issue. - -Thank you, -Ben Chaney - ---- -include/system/os-posix.h | 4 ++++ -os-posix.c | 8 -------- -util/qemu-sockets.c | 21 +++++++++++++++++++++ -3 files changed, 25 insertions(+), 8 deletions(-) - -diff --git a/include/system/os-posix.h b/include/system/os-posix.h -index ce5b3bccf8..2a414a914a 100644 ---- a/include/system/os-posix.h -+++ b/include/system/os-posix.h -@@ -55,6 +55,10 @@ void os_setup_limits(void); -void os_setup_post(void); -int os_mlock(bool on_fault); - -+extern struct passwd *user_pwd; -+extern uid_t user_uid; -+extern gid_t user_gid; -+ -/** -* qemu_alloc_stack: -* @sz: pointer to a size_t holding the requested usable stack size -diff --git a/os-posix.c b/os-posix.c -index 52925c23d3..9369b312a0 100644 ---- a/os-posix.c -+++ b/os-posix.c -@@ -86,14 +86,6 @@ void os_set_proc_name(const char *s) -} - - --/* -- * Must set all three of these at once. -- * Legal combinations are unset by name by uid -- */ --static struct passwd *user_pwd; /* NULL non-NULL NULL */ --static uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ --static gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ -- -/* -* Prepare to change user ID. 
user_id can be one of 3 forms: -* - a username, in which case user ID will be changed to its uid, -diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c -index 77477c1cd5..987977ead9 100644 ---- a/util/qemu-sockets.c -+++ b/util/qemu-sockets.c -@@ -871,6 +871,14 @@ static bool saddr_is_tight(UnixSocketAddress *saddr) -#endif -} - -+/* -+ * Must set all three of these at once. -+ * Legal combinations are unset by name by uid -+ */ -+struct passwd *user_pwd; /* NULL non-NULL NULL */ -+uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ -+gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ -+ -static int unix_listen_saddr(UnixSocketAddress *saddr, -int num, -Error **errp) -@@ -947,6 +955,19 @@ static int unix_listen_saddr(UnixSocketAddress *saddr, -error_setg_errno(errp, errno, "Failed to bind socket to %s", path); -goto err; -} -+ if (user_pwd) { -+ if (chown(un.sun_path, user_pwd->pw_uid, user_pwd->pw_gid) < 0) { -+ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", -path); -+ goto err; -+ } -+ } -+ else if (user_uid != -1 && user_gid != -1) { -+ if (chown(un.sun_path, user_uid, user_gid) < 0) { -+ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", -path); -+ goto err; -+ } -+ } -+ -if (listen(sock, num) < 0) { -error_setg_errno(errp, errno, "Failed to listen on socket"); -goto err; --- -2.40.1 - -Thank you Ben. I appreciate you testing CPR and shaking out the bugs. -I will study these and propose patches. - -My initial reaction to the pidfile issue is that the orchestration layer must -pass a different filename when starting the destination qemu instance. When -using live update without containers, these types of resource conflicts in the -global namespaces are a known issue. - -- Steve - -On 3/14/2025 2:33 PM, Chaney, Ben wrote: -Hello, - -While testing CPR transfer I encountered two issues. 
diff --git a/results/classifier/02/mistranslation/22219210 b/results/classifier/02/mistranslation/22219210
deleted file mode 100644
index 80919f9f9..000000000
--- a/results/classifier/02/mistranslation/22219210
+++ /dev/null
@@ -1,44 +0,0 @@
-mistranslation: 0.472
-semantic: 0.387
-other: 0.345
-instruction: 0.261
-boot: 0.070
-
-[BUG][CPU hot-plug]CPU hot-plugs cause the qemu process to coredump
-
-Hello,Recently, when I was developing CPU hot-plugs under the loongarch
-architecture,
-I found that there was a problem with qemu cpu hot-plugs under x86
-architecture,
-which caused the qemu process coredump when repeatedly inserting and
-unplugging
-the CPU when the TCG was accelerated.
- - -The specific operation process is as follows: - -1.Use the following command to start the virtual machine - -qemu-system-x86_64 \ --machine q35 \ --cpu Broadwell-IBRS \ --smp 1,maxcpus=4,sockets=4,cores=1,threads=1 \ --m 4G \ --drive file=~/anolis-8.8.qcow2 \ --serial stdio  \ --monitor telnet:localhost:4498,server,nowait - - -2.Enter QEMU Monitor via telnet for repeated CPU insertion and unplugging - -telnet 127.0.0.1 4498 -(qemu) device_add -Broadwell-IBRS-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,id=cpu1 -(qemu) device_del cpu1 -(qemu) device_add -Broadwell-IBRS-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,id=cpu1 -3.You will notice that the QEMU process has a coredump - -# malloc(): unsorted double linked list corrupted -Aborted (core dumped) - diff --git a/results/classifier/02/mistranslation/23270873 b/results/classifier/02/mistranslation/23270873 deleted file mode 100644 index 9b75b011a..000000000 --- a/results/classifier/02/mistranslation/23270873 +++ /dev/null @@ -1,693 +0,0 @@ -mistranslation: 0.881 -other: 0.839 -boot: 0.830 -instruction: 0.755 -semantic: 0.752 - -[Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed - -Hi, - -I am seeing some strange QEMU assertion failures for qemu on s390x, -which prevents a guest from starting. - -Git bisecting points to the following commit as the source of the error. - -commit ed6e2161715c527330f936d44af4c547f25f687e -Author: Nishanth Aravamudan <address@hidden> -Date: Fri Jun 22 12:37:00 2018 -0700 - - linux-aio: properly bubble up errors from initialization - - laio_init() can fail for a couple of reasons, which will lead to a NULL - pointer dereference in laio_attach_aio_context(). - - To solve this, add a aio_setup_linux_aio() function which is called - early in raw_open_common. If this fails, propagate the error up. The - signature of aio_get_linux_aio() was not modified, because it seems - preferable to return the actual errno from the possible failing - initialization calls. 
- - Additionally, when the AioContext changes, we need to associate a - LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context - callback and call the new aio_setup_linux_aio(), which will allocate a -new AioContext if needed, and return errors on failures. If it -fails for -any reason, fallback to threaded AIO with an error message, as the - device is already in-use by the guest. - - Add an assert that aio_get_linux_aio() cannot return NULL. - - Signed-off-by: Nishanth Aravamudan <address@hidden> - Message-id: address@hidden - Signed-off-by: Stefan Hajnoczi <address@hidden> -Not sure what is causing this assertion to fail. Here is the qemu -command line of the guest, from qemu log, which throws this error: -LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin -QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-s390x -name -guest=rt_vm1,debug-threads=on -S -object -secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-rt_vm1/master-key.aes --machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off -m -1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object -iothread,id=iothread1 -uuid 0cde16cd-091d-41bd-9ac2-5243df5c9a0d --display none -no-user-config -nodefaults -chardev -socket,id=charmonitor,fd=28,server,nowait -mon -chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown --boot strict=on -drive -file=/dev/mapper/360050763998b0883980000002a000031,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native --device -virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on --netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device -virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:3a:c8:67:95:84,devno=fe.0.0000 --netdev tap,fd=32,id=hostnet1,vhost=on,vhostfd=33 -device -virtio-net-ccw,netdev=hostnet1,id=net1,mac=52:54:00:2a:e5:08,devno=fe.0.0002 --chardev pty,id=charconsole0 -device 
-sclpconsole,chardev=charconsole0,id=console0 -device -virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox -on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny --msg timestamp=on -2018-07-17 15:48:42.252+0000: Domain id=21 is tainted: high-privileges -2018-07-17T15:48:42.279380Z qemu-system-s390x: -chardev -pty,id=charconsole0: char device redirected to /dev/pts/3 (label -charconsole0) -qemu-system-s390x: util/async.c:339: aio_get_linux_aio: Assertion -`ctx->linux_aio' failed. -2018-07-17 15:48:43.309+0000: shutting down, reason=failed - - -Any help debugging this would be greatly appreciated. - -Thank you -Farhan - -On 17.07.2018 [13:25:53 -0400], Farhan Ali wrote: -> -Hi, -> -> -I am seeing some strange QEMU assertion failures for qemu on s390x, -> -which prevents a guest from starting. -> -> -Git bisecting points to the following commit as the source of the error. -> -> -commit ed6e2161715c527330f936d44af4c547f25f687e -> -Author: Nishanth Aravamudan <address@hidden> -> -Date: Fri Jun 22 12:37:00 2018 -0700 -> -> -linux-aio: properly bubble up errors from initialization -> -> -laio_init() can fail for a couple of reasons, which will lead to a NULL -> -pointer dereference in laio_attach_aio_context(). -> -> -To solve this, add a aio_setup_linux_aio() function which is called -> -early in raw_open_common. If this fails, propagate the error up. The -> -signature of aio_get_linux_aio() was not modified, because it seems -> -preferable to return the actual errno from the possible failing -> -initialization calls. -> -> -Additionally, when the AioContext changes, we need to associate a -> -LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context -> -callback and call the new aio_setup_linux_aio(), which will allocate a -> -new AioContext if needed, and return errors on failures. If it fails for -> -any reason, fallback to threaded AIO with an error message, as the -> -device is already in-use by the guest. 
-> -> -Add an assert that aio_get_linux_aio() cannot return NULL. -> -> -Signed-off-by: Nishanth Aravamudan <address@hidden> -> -Message-id: address@hidden -> -Signed-off-by: Stefan Hajnoczi <address@hidden> -> -> -> -Not sure what is causing this assertion to fail. Here is the qemu command -> -line of the guest, from qemu log, which throws this error: -> -> -> -LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin -> -QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-s390x -name -> -guest=rt_vm1,debug-threads=on -S -object -> -secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-rt_vm1/master-key.aes -> --machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off -m 1024 -> --realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object -> -iothread,id=iothread1 -uuid 0cde16cd-091d-41bd-9ac2-5243df5c9a0d -display -> -none -no-user-config -nodefaults -chardev -> -socket,id=charmonitor,fd=28,server,nowait -mon -> -chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot -> -strict=on -drive -> -file=/dev/mapper/360050763998b0883980000002a000031,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -> --device -> -virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on -> --netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device -> -virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:3a:c8:67:95:84,devno=fe.0.0000 -> --netdev tap,fd=32,id=hostnet1,vhost=on,vhostfd=33 -device -> -virtio-net-ccw,netdev=hostnet1,id=net1,mac=52:54:00:2a:e5:08,devno=fe.0.0002 -> --chardev pty,id=charconsole0 -device -> -sclpconsole,chardev=charconsole0,id=console0 -device -> -virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -sandbox -> -on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg -> -timestamp=on -> -> -> -> -2018-07-17 15:48:42.252+0000: Domain id=21 is tainted: high-privileges -> -2018-07-17T15:48:42.279380Z qemu-system-s390x: 
-chardev pty,id=charconsole0: -> -char device redirected to /dev/pts/3 (label charconsole0) -> -qemu-system-s390x: util/async.c:339: aio_get_linux_aio: Assertion -> -`ctx->linux_aio' failed. -> -2018-07-17 15:48:43.309+0000: shutting down, reason=failed -> -> -> -Any help debugging this would be greatly appreciated. -iiuc, this possibly implies AIO was not actually used previously on this -guest (it might have silently been falling back to threaded IO?). I -don't have access to s390x, but would it be possible to run qemu under -gdb and see if aio_setup_linux_aio is being called at all (I think it -might not be, but I'm not sure why), and if so, if it's for the context -in question? - -If it's not being called first, could you see what callpath is calling -aio_get_linux_aio when this assertion trips? - -Thanks! --Nish - -On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote: -iiuc, this possibly implies AIO was not actually used previously on this -guest (it might have silently been falling back to threaded IO?). I -don't have access to s390x, but would it be possible to run qemu under -gdb and see if aio_setup_linux_aio is being called at all (I think it -might not be, but I'm not sure why), and if so, if it's for the context -in question? - -If it's not being called first, could you see what callpath is calling -aio_get_linux_aio when this assertion trips? - -Thanks! 
--Nish -Hi Nishant, -From the coredump of the guest this is the call trace that calls -aio_get_linux_aio: -Stack trace of thread 145158: -#0 0x000003ff94dbe274 raise (libc.so.6) -#1 0x000003ff94da39a8 abort (libc.so.6) -#2 0x000003ff94db62ce __assert_fail_base (libc.so.6) -#3 0x000003ff94db634c __assert_fail (libc.so.6) -#4 0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x) -#5 0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x) -#6 0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x) -#7 0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) -#8 0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) -#9 0x000002aa20db3c34 aio_poll (qemu-system-s390x) -#10 0x000002aa20be32a2 iothread_run (qemu-system-s390x) -#11 0x000003ff94f879a8 start_thread (libpthread.so.0) -#12 0x000003ff94e797ee thread_start (libc.so.6) - - -Thanks for taking a look and responding. - -Thanks -Farhan - -On 07/18/2018 09:42 AM, Farhan Ali wrote: -On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote: -iiuc, this possibly implies AIO was not actually used previously on this -guest (it might have silently been falling back to threaded IO?). I -don't have access to s390x, but would it be possible to run qemu under -gdb and see if aio_setup_linux_aio is being called at all (I think it -might not be, but I'm not sure why), and if so, if it's for the context -in question? - -If it's not being called first, could you see what callpath is calling -aio_get_linux_aio when this assertion trips? - -Thanks! 
--Nish -Hi Nishant, -From the coredump of the guest this is the call trace that calls -aio_get_linux_aio: -Stack trace of thread 145158: -#0 0x000003ff94dbe274 raise (libc.so.6) -#1 0x000003ff94da39a8 abort (libc.so.6) -#2 0x000003ff94db62ce __assert_fail_base (libc.so.6) -#3 0x000003ff94db634c __assert_fail (libc.so.6) -#4 0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x) -#5 0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x) -#6 0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x) -#7 0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) -#8 0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) -#9 0x000002aa20db3c34 aio_poll (qemu-system-s390x) -#10 0x000002aa20be32a2 iothread_run (qemu-system-s390x) -#11 0x000003ff94f879a8 start_thread (libpthread.so.0) -#12 0x000003ff94e797ee thread_start (libc.so.6) - - -Thanks for taking a look and responding. - -Thanks -Farhan -Trying to debug a little further, the block device in this case is a -"host device". And looking at your commit carefully you use the -bdrv_attach_aio_context callback to setup a Linux AioContext. -For some reason the "host device" struct (BlockDriver bdrv_host_device -in block/file-posix.c) does not have a bdrv_attach_aio_context defined. -So a simple change of adding the callback to the struct solves the issue -and the guest starts fine. -diff --git a/block/file-posix.c b/block/file-posix.c -index 28824aa..b8d59fb 100644 ---- a/block/file-posix.c -+++ b/block/file-posix.c -@@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = { - .bdrv_refresh_limits = raw_refresh_limits, - .bdrv_io_plug = raw_aio_plug, - .bdrv_io_unplug = raw_aio_unplug, -+ .bdrv_attach_aio_context = raw_aio_attach_aio_context, - - .bdrv_co_truncate = raw_co_truncate, - .bdrv_getlength = raw_getlength, -I am not too familiar with block device code in QEMU, so not sure if -this is the right fix or if there are some underlying problems. 
-Thanks -Farhan - -On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote: -> -> -> -On 07/18/2018 09:42 AM, Farhan Ali wrote: -> -> -> -> -> -> On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote: -> -> > iiuc, this possibly implies AIO was not actually used previously on this -> -> > guest (it might have silently been falling back to threaded IO?). I -> -> > don't have access to s390x, but would it be possible to run qemu under -> -> > gdb and see if aio_setup_linux_aio is being called at all (I think it -> -> > might not be, but I'm not sure why), and if so, if it's for the context -> -> > in question? -> -> > -> -> > If it's not being called first, could you see what callpath is calling -> -> > aio_get_linux_aio when this assertion trips? -> -> > -> -> > Thanks! -> -> > -Nish -> -> -> -> -> -> Hi Nishant, -> -> -> -> From the coredump of the guest this is the call trace that calls -> -> aio_get_linux_aio: -> -> -> -> -> -> Stack trace of thread 145158: -> -> #0 0x000003ff94dbe274 raise (libc.so.6) -> -> #1 0x000003ff94da39a8 abort (libc.so.6) -> -> #2 0x000003ff94db62ce __assert_fail_base (libc.so.6) -> -> #3 0x000003ff94db634c __assert_fail (libc.so.6) -> -> #4 0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x) -> -> #5 0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x) -> -> #6 0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x) -> -> #7 0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) -> -> #8 0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) -> -> #9 0x000002aa20db3c34 aio_poll (qemu-system-s390x) -> -> #10 0x000002aa20be32a2 iothread_run (qemu-system-s390x) -> -> #11 0x000003ff94f879a8 start_thread (libpthread.so.0) -> -> #12 0x000003ff94e797ee thread_start (libc.so.6) -> -> -> -> -> -> Thanks for taking a look and responding. -> -> -> -> Thanks -> -> Farhan -> -> -> -> -> -> -> -> -Trying to debug a little further, the block device in this case is a "host -> -device". 
And looking at your commit carefully you use the
-> -bdrv_attach_aio_context callback to setup a Linux AioContext.
->
-> -For some reason the "host device" struct (BlockDriver bdrv_host_device in
-> -block/file-posix.c) does not have a bdrv_attach_aio_context defined.
-> -So a simple change of adding the callback to the struct solves the issue and
-> -the guest starts fine.
->
->
-> -diff --git a/block/file-posix.c b/block/file-posix.c
-> -index 28824aa..b8d59fb 100644
-> ---- a/block/file-posix.c
-> -+++ b/block/file-posix.c
-> -@@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
-> -.bdrv_refresh_limits = raw_refresh_limits,
-> -.bdrv_io_plug = raw_aio_plug,
-> -.bdrv_io_unplug = raw_aio_unplug,
-> -+ .bdrv_attach_aio_context = raw_aio_attach_aio_context,
->
-> -.bdrv_co_truncate = raw_co_truncate,
-> -.bdrv_getlength = raw_getlength,
->
->
->
-> -I am not too familiar with block device code in QEMU, so not sure if
-> -this is the right fix or if there are some underlying problems.
-Oh this is quite embarrassing! I only added the bdrv_attach_aio_context
-callback for the file-backed device. Your fix is definitely correct for
-the host device. Let me make sure there weren't any others missed and I will
-send out a properly formatted patch. Thank you for the quick testing and
-turnaround!
-
--Nish
-
-On 07/18/2018 08:52 PM, Nishanth Aravamudan wrote:
-> -On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote:
-> ->>
-> ->>
-> ->> On 07/18/2018 09:42 AM, Farhan Ali wrote:
-> ->>>
-> ->>>
-> ->>> On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:
-> ->>> iiuc, this possibly implies AIO was not actually used previously on this
-> ->>> guest (it might have silently been falling back to threaded IO?). I
-> ->>> don't have access to s390x, but would it be possible to run qemu under
-> ->>> gdb and see if aio_setup_linux_aio is being called at all (I think it
-> ->>> might not be, but I'm not sure why), and if so, if it's for the context
-> ->>> -> ->>> If it's not being called first, could you see what callpath is calling -> ->>> aio_get_linux_aio when this assertion trips? -> ->>> -> ->>> Thanks! -> ->>> -Nish -> ->> -> ->> -> ->> Hi Nishant, -> ->> -> ->> From the coredump of the guest this is the call trace that calls -> ->> aio_get_linux_aio: -> ->> -> ->> -> ->> Stack trace of thread 145158: -> ->> #0 0x000003ff94dbe274 raise (libc.so.6) -> ->> #1 0x000003ff94da39a8 abort (libc.so.6) -> ->> #2 0x000003ff94db62ce __assert_fail_base (libc.so.6) -> ->> #3 0x000003ff94db634c __assert_fail (libc.so.6) -> ->> #4 0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x) -> ->> #5 0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x) -> ->> #6 0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x) -> ->> #7 0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x) -> ->> #8 0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x) -> ->> #9 0x000002aa20db3c34 aio_poll (qemu-system-s390x) -> ->> #10 0x000002aa20be32a2 iothread_run (qemu-system-s390x) -> ->> #11 0x000003ff94f879a8 start_thread (libpthread.so.0) -> ->> #12 0x000003ff94e797ee thread_start (libc.so.6) -> ->> -> ->> -> ->> Thanks for taking a look and responding. -> ->> -> ->> Thanks -> ->> Farhan -> ->> -> ->> -> ->> -> -> -> -> Trying to debug a little further, the block device in this case is a "host -> -> device". And looking at your commit carefully you use the -> -> bdrv_attach_aio_context callback to setup a Linux AioContext. -> -> -> -> For some reason the "host device" struct (BlockDriver bdrv_host_device in -> -> block/file-posix.c) does not have a bdrv_attach_aio_context defined. -> -> So a simple change of adding the callback to the struct solves the issue and -> -> the guest starts fine. 
-> -> -> -> -> -> diff --git a/block/file-posix.c b/block/file-posix.c -> -> index 28824aa..b8d59fb 100644 -> -> --- a/block/file-posix.c -> -> +++ b/block/file-posix.c -> -> @@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = { -> -> .bdrv_refresh_limits = raw_refresh_limits, -> -> .bdrv_io_plug = raw_aio_plug, -> -> .bdrv_io_unplug = raw_aio_unplug, -> -> + .bdrv_attach_aio_context = raw_aio_attach_aio_context, -> -> -> -> .bdrv_co_truncate = raw_co_truncate, -> -> .bdrv_getlength = raw_getlength, -> -> -> -> -> -> -> -> I am not too familiar with block device code in QEMU, so not sure if -> -> this is the right fix or if there are some underlying problems. -> -> -Oh this is quite embarassing! I only added the bdrv_attach_aio_context -> -callback for the file-backed device. Your fix is definitely corect for -> -host device. Let me make sure there weren't any others missed and I will -> -send out a properly formatted patch. Thank you for the quick testing and -> -turnaround! -Farhan, can you respin your patch with proper sign-off and patch description? -Adding qemu-block. - -Hi Christian, - -On 19.07.2018 [08:55:20 +0200], Christian Borntraeger wrote: -> -> -> -On 07/18/2018 08:52 PM, Nishanth Aravamudan wrote: -> -> On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote: -> ->> -> ->> -> ->> On 07/18/2018 09:42 AM, Farhan Ali wrote: -<snip> - -> ->> I am not too familiar with block device code in QEMU, so not sure if -> ->> this is the right fix or if there are some underlying problems. -> -> -> -> Oh this is quite embarassing! I only added the bdrv_attach_aio_context -> -> callback for the file-backed device. Your fix is definitely corect for -> -> host device. Let me make sure there weren't any others missed and I will -> -> send out a properly formatted patch. Thank you for the quick testing and -> -> turnaround! -> -> -Farhan, can you respin your patch with proper sign-off and patch description? -> -Adding qemu-block. 
-I sent it yesterday, sorry I didn't cc everyone from this e-mail:
-http://lists.nongnu.org/archive/html/qemu-block/2018-07/msg00516.html
-Thanks,
-Nish
-

diff --git a/results/classifier/02/mistranslation/24930826 b/results/classifier/02/mistranslation/24930826
deleted file mode 100644
index 034af2c4a..000000000
--- a/results/classifier/02/mistranslation/24930826
+++ /dev/null
@@ -1,34 +0,0 @@
-mistranslation: 0.637
-instruction: 0.555
-other: 0.535
-semantic: 0.487
-boot: 0.218
-
-[Qemu-devel] [BUG] vhost-user: hot-unplug vhost-user nic for windows guest OS will fail with 100% reproduce rate
-
-Hi, guys
-
-I met a problem when hot-unplugging a vhost-user nic for Windows 2008 rc2 sp1 64
-(Guest OS)
-
-The xml of the nic is as follows:
-<interface type='vhostuser'>
- <mac address='52:54:00:3b:83:aa'/>
- <source type='unix' path='/var/run/vhost-user/port1' mode='client'/>
- <target dev='port1'/>
- <model type='virtio'/>
- <driver queues='4'/>
- <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
-</interface>
-
-Firstly, I use virsh attach-device win2008 vif.xml to hot-plug a nic for the Guest
-OS. This operation returns success.
-After the guest OS discovers the nic successfully, I use virsh detach-device win2008
-vif.xml to hot-unplug it. This operation will fail with a 100% reproduce rate.
-
-However, if I hot-plug and hot-unplug a virtio-net nic, it will not fail.
-
-I have analyzed the process of qmp_device_del, and I found that qemu injects an
-interrupt via acpi to notify the guest OS to remove the nic.
-I guess there is something wrong in Windows when handling the interrupt.
-

diff --git a/results/classifier/02/mistranslation/25842545 b/results/classifier/02/mistranslation/25842545
deleted file mode 100644
index 3264cc962..000000000
--- a/results/classifier/02/mistranslation/25842545
+++ /dev/null
@@ -1,203 +0,0 @@
-mistranslation: 0.928
-other: 0.912
-instruction: 0.835
-semantic: 0.829
-boot: 0.824
-
-[Qemu-devel] [Bug?]
Guest pause because VMPTRLD failed in KVM - -Hello, - - We encountered a problem that a guest paused because the KMOD report VMPTRLD -failed. - -The related information is as follows: - -1) Qemu command: - /usr/bin/qemu-kvm -name omu1 -S -machine pc-i440fx-2.3,accel=kvm,usb=off -cpu -host -m 15625 -realtime mlock=off -smp 8,sockets=1,cores=8,threads=1 -uuid -a2aacfff-6583-48b4-b6a4-e6830e519931 -no-user-config -nodefaults -chardev -socket,id=charmonitor,path=/var/lib/libvirt/qemu/omu1.monitor,server,nowait --mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown --boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device -virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive -file=/home/env/guest1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native - -device -virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 - -drive -file=/home/env/guest_300G.img,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native - -device -virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 - -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device -virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:80:05:00:00,bus=pci.0,addr=0x3 --netdev tap,fd=27,id=hostnet1,vhost=on,vhostfd=28 -device -virtio-net-pci,netdev=hostnet1,id=net1,mac=00:00:80:05:00:01,bus=pci.0,addr=0x4 --chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 --device usb-tablet,id=input0 -vnc 0.0.0.0:0 -device -cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2 -device -virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on - - 2) Qemu log: - KVM: entry failed, hardware error 0x4 - RAX=00000000ffffffed RBX=ffff8803fa2d7fd8 RCX=0100000000000000 -RDX=0000000000000000 - RSI=0000000000000000 RDI=0000000000000046 RBP=ffff8803fa2d7e90 -RSP=ffff8803fa2efe90 - R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 -R11=000000000000b69a - 
R12=0000000000000001 R13=ffffffff81a25b40 R14=0000000000000000 -R15=ffff8803fa2d7fd8 - RIP=ffffffff81053e16 RFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 - ES =0000 0000000000000000 ffffffff 00c00000 - CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA] - SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] - DS =0000 0000000000000000 ffffffff 00c00000 - FS =0000 0000000000000000 ffffffff 00c00000 - GS =0000 ffff88040f540000 ffffffff 00c00000 - LDT=0000 0000000000000000 ffffffff 00c00000 - TR =0040 ffff88040f550a40 00002087 00008b00 DPL=0 TSS64-busy - GDT= ffff88040f549000 0000007f - IDT= ffffffffff529000 00000fff - CR0=80050033 CR2=00007f81ca0c5000 CR3=00000003f5081000 CR4=000407e0 - DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 -DR3=0000000000000000 - DR6=00000000ffff0ff0 DR7=0000000000000400 - EFER=0000000000000d01 - Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? <??> ?? ?? -?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 
- - 3) Demsg - [347315.028339] kvm: vmptrld ffff8817ec5f0000/17ec5f0000 failed - klogd 1.4.1, ---------- state change ---------- - [347315.039506] kvm: vmptrld ffff8817ec5f0000/17ec5f0000 failed - [347315.051728] kvm: vmptrld ffff8817ec5f0000/17ec5f0000 failed - [347315.057472] vmwrite error: reg 6c0a value ffff88307e66e480 (err -2120672384) - [347315.064567] Pid: 69523, comm: qemu-kvm Tainted: GF X -3.0.93-0.8-default #1 - [347315.064569] Call Trace: - [347315.064587] [<ffffffff810049d5>] dump_trace+0x75/0x300 - [347315.064595] [<ffffffff8145e3e3>] dump_stack+0x69/0x6f - [347315.064617] [<ffffffffa03738de>] vmx_vcpu_load+0x11e/0x1d0 [kvm_intel] - [347315.064647] [<ffffffffa029a204>] kvm_arch_vcpu_load+0x44/0x1d0 [kvm] - [347315.064669] [<ffffffff81054ee1>] finish_task_switch+0x81/0xe0 - [347315.064676] [<ffffffff8145f0b4>] thread_return+0x3b/0x2a7 - [347315.064687] [<ffffffffa028d9b5>] kvm_vcpu_block+0x65/0xa0 [kvm] - [347315.064703] [<ffffffffa02a16d1>] __vcpu_run+0xd1/0x260 [kvm] - [347315.064732] [<ffffffffa02a2418>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 -[kvm] - [347315.064759] [<ffffffffa028ecee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] - [347315.064771] [<ffffffff8116bdfb>] do_vfs_ioctl+0x8b/0x3b0 - [347315.064776] [<ffffffff8116c1c1>] sys_ioctl+0xa1/0xb0 - [347315.064783] [<ffffffff81469272>] system_call_fastpath+0x16/0x1b - [347315.064797] [<00007fee51969ce7>] 0x7fee51969ce6 - [347315.064799] vmwrite error: reg 6c0c value ffff88307e664000 (err -2120630272) - [347315.064802] Pid: 69523, comm: qemu-kvm Tainted: GF X -3.0.93-0.8-default #1 - [347315.064803] Call Trace: - [347315.064807] [<ffffffff810049d5>] dump_trace+0x75/0x300 - [347315.064811] [<ffffffff8145e3e3>] dump_stack+0x69/0x6f - [347315.064817] [<ffffffffa03738ec>] vmx_vcpu_load+0x12c/0x1d0 [kvm_intel] - [347315.064832] [<ffffffffa029a204>] kvm_arch_vcpu_load+0x44/0x1d0 [kvm] - [347315.064851] [<ffffffff81054ee1>] finish_task_switch+0x81/0xe0 - [347315.064855] [<ffffffff8145f0b4>] thread_return+0x3b/0x2a7 - 
[347315.064865] [<ffffffffa028d9b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
- [347315.064880] [<ffffffffa02a16d1>] __vcpu_run+0xd1/0x260 [kvm]
- [347315.064907] [<ffffffffa02a2418>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0
-[kvm]
- [347315.064933] [<ffffffffa028ecee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
- [347315.064943] [<ffffffff8116bdfb>] do_vfs_ioctl+0x8b/0x3b0
- [347315.064947] [<ffffffff8116c1c1>] sys_ioctl+0xa1/0xb0
- [347315.064951] [<ffffffff81469272>] system_call_fastpath+0x16/0x1b
- [347315.064957] [<00007fee51969ce7>] 0x7fee51969ce6
- [347315.064959] vmwrite error: reg 6c10 value 0 (err 0)
-
- 4) The issue can't be reproduced. I searched the Intel VMX spec for the reasons
-for vmptrld failure:
- The instruction fails if its operand is not properly aligned, sets
-unsupported physical-address bits, or is equal to the VMXON
- pointer. In addition, the instruction fails if the 32 bits in memory
-referenced by the operand do not match the VMCS
- revision identifier supported by this processor.
-
- But I can't find any clues in the KVM source code. It seems each
- error condition is impossible in theory. :(
-
-Any suggestions will be appreciated! Paolo?
-
---
-Regards,
--Gonglei
-
-On 10/11/2016 15:10, gong lei wrote:
->
-4) The issue can't be reproduced. I searched the Intel VMX spec for the
->
-reasons
->
-for vmptrld failure:
->
-The instruction fails if its operand is not properly aligned, sets
->
-unsupported physical-address bits, or is equal to the VMXON
->
-pointer. In addition, the instruction fails if the 32 bits in memory
->
-referenced by the operand do not match the VMCS
->
-revision identifier supported by this processor.
->
->
-But I can't find any clues in the KVM source code. It seems each
->
-error condition is impossible in theory. :(
-Yes, it should not happen. :(
-
-If it's not reproducible, it's really hard to say what it was, except a
-random memory corruption elsewhere or even a bit flip (!).
- -Paolo - -On 2016/11/17 20:39, Paolo Bonzini wrote: -> -> -On 10/11/2016 15:10, gong lei wrote: -> -> 4) The isssue can't be reporduced. I search the Intel VMX sepc about -> -> reaseons -> -> of vmptrld failure: -> -> The instruction fails if its operand is not properly aligned, sets -> -> unsupported physical-address bits, or is equal to the VMXON -> -> pointer. In addition, the instruction fails if the 32 bits in memory -> -> referenced by the operand do not match the VMCS -> -> revision identifier supported by this processor. -> -> -> -> But I can't find any cues from the KVM source code. It seems each -> -> error conditions is impossible in theory. :( -> -Yes, it should not happen. :( -> -> -If it's not reproducible, it's really hard to say what it was, except a -> -random memory corruption elsewhere or even a bit flip (!). -> -> -Paolo -Thanks for your reply, Paolo :) - --- -Regards, --Gonglei - diff --git a/results/classifier/02/mistranslation/26430026 b/results/classifier/02/mistranslation/26430026 deleted file mode 100644 index 70de0d82a..000000000 --- a/results/classifier/02/mistranslation/26430026 +++ /dev/null @@ -1,166 +0,0 @@ -mistranslation: 0.915 -semantic: 0.904 -instruction: 0.888 -boot: 0.841 -other: 0.813 - -[BUG] cxl,i386: e820 mappings may not be correct for cxl - -Context included below from prior discussion - - `cxl create-region` would fail on inability to allocate memory - - traced this down to the memory region being marked RESERVED - - E820 map marks the CXL fixed memory window as RESERVED - - -Re: x86 errors, I found that region worked with this patch. (I also -added the SRAT patches the Davidlohr posted, but I do not think they are -relevant). - -I don't think this is correct, and setting this to E820_RAM causes the -system to fail to boot at all, but with this change `cxl create-region` -succeeds, which suggests our e820 mappings in the i386 machine are -incorrect. 
- -Anyone who can help or have an idea as to what e820 should actually be -doing with this region, or if this is correct and something else is -failing, please help! - - -diff --git a/hw/i386/pc.c b/hw/i386/pc.c -index 566accf7e6..a5e688a742 100644 ---- a/hw/i386/pc.c -+++ b/hw/i386/pc.c -@@ -1077,7 +1077,7 @@ void pc_memory_init(PCMachineState *pcms, - memory_region_init_io(&fw->mr, OBJECT(machine), &cfmws_ops, fw, - "cxl-fixed-memory-region", fw->size); - memory_region_add_subregion(system_memory, fw->base, &fw->mr); -- e820_add_entry(fw->base, fw->size, E820_RESERVED); -+ e820_add_entry(fw->base, fw->size, E820_NVS); - cxl_fmw_base += fw->size; - cxl_resv_end = cxl_fmw_base; - } - - -On Mon, Oct 10, 2022 at 05:32:42PM +0100, Jonathan Cameron wrote: -> -> -> > but i'm not sure of what to do with this info. We have some proof -> -> > that real hardware works with this no problem, and the only difference -> -> > is that the EFI/bios/firmware is setting the memory regions as `usable` -> -> > or `soft reserved`, which would imply the EDK2 is the blocker here -> -> > regardless of the OS driver status. -> -> > -> -> > But I'd seen elsewhere you had gotten some of this working, and I'm -> -> > failing to get anything working at the moment. If you have any input i -> -> > would greatly appreciate the help. 
-> -> > -> -> > QEMU config: -> -> > -> -> > /opt/qemu-cxl2/bin/qemu-system-x86_64 \ -> -> > -drive -> -> > file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=d\ -> -> > -m 2G,slots=4,maxmem=4G \ -> -> > -smp 4 \ -> -> > -machine type=q35,accel=kvm,cxl=on \ -> -> > -enable-kvm \ -> -> > -nographic \ -> -> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \ -> -> > -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \ -> -> > -object memory-backend-file,id=cxl-mem0,mem-path=/tmp/cxl-mem0,size=256M \ -> -> > -object memory-backend-file,id=lsa0,mem-path=/tmp/cxl-lsa0,size=256M \ -> -> > -device cxl-type3,bus=rp0,pmem=true,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 -> -> > \ -> -> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=256M -> -> > -> -> > I'd seen on the lists that you had seen issues with single-rp setups, -> -> > but no combination of configuration I've tried (including all the ones -> -> > in the docs and tests) lead to a successful region creation with -> -> > `cxl create-region` -> -> -> -> Hmm. Let me have a play. I've not run x86 tests for a while so -> -> perhaps something is missing there. -> -> -> -> I'm carrying a patch to override check_last_peer() in -> -> cxl_port_setup_targets() as that is wrong for some combinations, -> -> but that doesn't look like it's related to what you are seeing. -> -> -I'm not sure if it's relevant, but turned out I'd forgotten I'm carrying 3 -> -patches that aren't upstream (and one is a horrible hack). -> -> -Hack: -https://lore.kernel.org/linux-cxl/20220819094655.000005ed@huawei.com/ -> -Shouldn't affect a simple case like this... 
-> -> -https://lore.kernel.org/linux-cxl/20220819093133.00006c22@huawei.com/T/#t -> -(Dan's version) -> -> -https://lore.kernel.org/linux-cxl/20220815154044.24733-1-Jonathan.Cameron@huawei.com/T/#t -> -> -For writes to work you will currently need two rps (nothing on the second is -> -fine) -> -as we still haven't resolved if the kernel should support an HDM decoder on -> -a host bridge with one port. I think it should (Spec allows it), others -> -unconvinced. -> -> -Note I haven't shifted over to x86 yet so may still be something different -> -from -> -arm64. -> -> -Jonathan -> -> - diff --git a/results/classifier/02/mistranslation/36568044 b/results/classifier/02/mistranslation/36568044 deleted file mode 100644 index 995215323..000000000 --- a/results/classifier/02/mistranslation/36568044 +++ /dev/null @@ -1,4582 +0,0 @@ -mistranslation: 0.962 -instruction: 0.930 -other: 0.930 -semantic: 0.923 -boot: 0.895 - -[BUG, RFC] cpr-transfer: qxl guest driver crashes after migration - -Hi all, - -We've been experimenting with cpr-transfer migration mode recently and -have discovered the following issue with the guest QXL driver: - -Run migration source: -> -EMULATOR=/path/to/emulator -> -ROOTFS=/path/to/image -> -QMPSOCK=/var/run/alma8qmp-src.sock -> -> -$EMULATOR -enable-kvm \ -> --machine q35 \ -> --cpu host -smp 2 -m 2G \ -> --object -> -memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\ -> --machine memory-backend=ram0 \ -> --machine aux-ram-share=on \ -> --drive file=$ROOTFS,media=disk,if=virtio \ -> --qmp unix:$QMPSOCK,server=on,wait=off \ -> --nographic \ -> --device qxl-vga -Run migration target: -> -EMULATOR=/path/to/emulator -> -ROOTFS=/path/to/image -> -QMPSOCK=/var/run/alma8qmp-dst.sock -> -> -> -> -$EMULATOR -enable-kvm \ -> --machine q35 \ -> --cpu host -smp 2 -m 2G \ -> --object -> -memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\ -> --machine memory-backend=ram0 \ -> --machine aux-ram-share=on \ -> --drive 
file=$ROOTFS,media=disk,if=virtio \ -> --qmp unix:$QMPSOCK,server=on,wait=off \ -> --nographic \ -> --device qxl-vga \ -> --incoming tcp:0:44444 \ -> --incoming '{"channel-type": "cpr", "addr": { "transport": "socket", -> -"type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: -> -QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -> -QMPSOCK=/var/run/alma8qmp-src.sock -> -> -$QMPSHELL -p $QMPSOCK <<EOF -> -migrate-set-parameters mode=cpr-transfer -> -migrate -> -channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}] -> -EOF -Then, after a while, QXL guest driver on target crashes spewing the -following messages: -> -[ 73.962002] [TTM] Buffer eviction failed -> -[ 73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001) -> -[ 73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate -> -VRAM BO -That seems to be a known kernel QXL driver bug: -https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/ -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -(the latter discussion contains that reproduce script which speeds up -the crash in the guest): -> -#!/bin/bash -> -> -chvt 3 -> -> -for j in $(seq 80); do -> -echo "$(date) starting round $j" -> -if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" -> -]; then -> -echo "bug was reproduced after $j tries" -> -exit 1 -> -fi -> -for i in $(seq 100); do -> -dmesg > /dev/tty3 -> -done -> -done -> -> -echo "bug could not be reproduced" -> -exit 0 -The bug itself seems to remain unfixed, as I was able to reproduce that -with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -cpr-transfer code also seems to be buggy as it triggers the crash - -without the cpr-transfer migration the above reproduce doesn't lead to -crash on the source VM. 
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM. However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this? Any
-suggestions would be appreciated. Thanks!
-
-Andrey
-
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-> ...snip...
-
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-
-- Steve
-
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-> ...snip...
-
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr. A message is printed at migration start time.
-1740667681-257312-1-git-send-email-steven.sistare@oracle.com -/">https://lore.kernel.org/qemu-devel/ -1740667681-257312-1-git-send-email-steven.sistare@oracle.com -/ -- Steve - -On 2/28/25 8:20 PM, Steven Sistare wrote: -> -On 2/28/2025 1:13 PM, Steven Sistare wrote: -> -> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -> ->> Hi all, -> ->> -> ->> We've been experimenting with cpr-transfer migration mode recently and -> ->> have discovered the following issue with the guest QXL driver: -> ->> -> ->> Run migration source: -> ->>> EMULATOR=/path/to/emulator -> ->>> ROOTFS=/path/to/image -> ->>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>> -> ->>> $EMULATOR -enable-kvm \ -> ->>>     -machine q35 \ -> ->>>     -cpu host -smp 2 -m 2G \ -> ->>>     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -> ->>> ram0,share=on\ -> ->>>     -machine memory-backend=ram0 \ -> ->>>     -machine aux-ram-share=on \ -> ->>>     -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>     -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>     -nographic \ -> ->>>     -device qxl-vga -> ->> -> ->> Run migration target: -> ->>> EMULATOR=/path/to/emulator -> ->>> ROOTFS=/path/to/image -> ->>> QMPSOCK=/var/run/alma8qmp-dst.sock -> ->>> $EMULATOR -enable-kvm \ -> ->>>     -machine q35 \ -> ->>>     -cpu host -smp 2 -m 2G \ -> ->>>     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -> ->>> ram0,share=on\ -> ->>>     -machine memory-backend=ram0 \ -> ->>>     -machine aux-ram-share=on \ -> ->>>     -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>     -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>     -nographic \ -> ->>>     -device qxl-vga \ -> ->>>     -incoming tcp:0:44444 \ -> ->>>     -incoming '{"channel-type": "cpr", "addr": { "transport": -> ->>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -> ->> -> ->> -> ->> Launch the migration: -> ->>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -> ->>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>> 
-> ->>> $QMPSHELL -p $QMPSOCK <<EOF -> ->>>     migrate-set-parameters mode=cpr-transfer -> ->>>     migrate channels=[{"channel-type":"main","addr": -> ->>> {"transport":"socket","type":"inet","host":"0","port":"44444"}}, -> ->>> {"channel-type":"cpr","addr": -> ->>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -> ->>> dst.sock"}}] -> ->>> EOF -> ->> -> ->> Then, after a while, QXL guest driver on target crashes spewing the -> ->> following messages: -> ->>> [  73.962002] [TTM] Buffer eviction failed -> ->>> [  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -> ->>> 0x00000001) -> ->>> [  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -> ->>> allocate VRAM BO -> ->> -> ->> That seems to be a known kernel QXL driver bug: -> ->> -> ->> -https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/ -> ->> -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -> ->> -> ->> (the latter discussion contains that reproduce script which speeds up -> ->> the crash in the guest): -> ->>> #!/bin/bash -> ->>> -> ->>> chvt 3 -> ->>> -> ->>> for j in $(seq 80); do -> ->>>         echo "$(date) starting round $j" -> ->>>         if [ "$(journalctl --boot | grep "failed to allocate VRAM -> ->>> BO")" != "" ]; then -> ->>>                 echo "bug was reproduced after $j tries" -> ->>>                 exit 1 -> ->>>         fi -> ->>>         for i in $(seq 100); do -> ->>>                 dmesg > /dev/tty3 -> ->>>         done -> ->>> done -> ->>> -> ->>> echo "bug could not be reproduced" -> ->>> exit 0 -> ->> -> ->> The bug itself seems to remain unfixed, as I was able to reproduce that -> ->> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -> ->> cpr-transfer code also seems to be buggy as it triggers the crash - -> ->> without the cpr-transfer migration the above reproduce doesn't lead to -> ->> crash on the source VM. 
-> ->> -> ->> I suspect that, as cpr-transfer doesn't migrate the guest memory, but -> ->> rather passes it through the memory backend object, our code might -> ->> somehow corrupt the VRAM. However, I wasn't able to trace the -> ->> corruption so far. -> ->> -> ->> Could somebody help the investigation and take a look into this? Any -> ->> suggestions would be appreciated. Thanks! -> -> -> -> Possibly some memory region created by qxl is not being preserved. -> -> Try adding these traces to see what is preserved: -> -> -> -> -trace enable='*cpr*' -> -> -trace enable='*ram_alloc*' -> -> -Also try adding this patch to see if it flags any ram blocks as not -> -compatible with cpr. A message is printed at migration start time. -> - -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email- -> -steven.sistare@oracle.com/ -> -> -- Steve -> -With the traces enabled + the "migration: ram block cpr blockers" patch -applied: - -Source: -> -cpr_find_fd pc.bios, id 0 returns -1 -> -cpr_save_fd pc.bios, id 0, fd 22 -> -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -> -0x7fec18e00000 -> -cpr_find_fd pc.rom, id 0 returns -1 -> -cpr_save_fd pc.rom, id 0, fd 23 -> -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -> -0x7fec18c00000 -> -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -> -cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -> -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd -> -24 host 0x7fec18a00000 -> -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -> -cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -> -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 -> -fd 25 host 0x7feb77e00000 -> -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -> -cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -> -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27 -> -host 0x7fec18800000 -> -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -> 
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -> -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 -> -fd 28 host 0x7feb73c00000 -> -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -> -cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -> -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34 -> -host 0x7fec18600000 -> -cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -> -cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -> -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 35 -> -host 0x7fec18200000 -> -cpr_find_fd /rom@etc/table-loader, id 0 returns -1 -> -cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -> -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36 -> -host 0x7feb8b600000 -> -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -> -cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -> -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host -> -0x7feb8b400000 -> -> -cpr_state_save cpr-transfer mode -> -cpr_transfer_output /var/run/alma8cpr-dst.sock -Target: -> -cpr_transfer_input /var/run/alma8cpr-dst.sock -> -cpr_state_load cpr-transfer mode -> -cpr_find_fd pc.bios, id 0 returns 20 -> -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -> -0x7fcdc9800000 -> -cpr_find_fd pc.rom, id 0 returns 19 -> -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -> -0x7fcdc9600000 -> -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -> -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd -> -18 host 0x7fcdc9400000 -> -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -> -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 -> -fd 17 host 0x7fcd27e00000 -> -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -> -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16 -> -host 0x7fcdc9200000 -> -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -> 
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 -> -fd 15 host 0x7fcd23c00000 -> -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -> -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14 -> -host 0x7fcdc8800000 -> -cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -> -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 13 -> -host 0x7fcdc8400000 -> -cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -> -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11 -> -host 0x7fcdc8200000 -> -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -> -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host -> -0x7fcd3be00000 -Looks like both vga.vram and qxl.vram are being preserved (with the same -addresses), and no incompatible ram blocks are found during migration. - -Andrey - -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -> -On 2/28/25 8:20 PM, Steven Sistare wrote: -> -> On 2/28/2025 1:13 PM, Steven Sistare wrote: -> ->> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -> ->>> Hi all, -> ->>> -> ->>> We've been experimenting with cpr-transfer migration mode recently and -> ->>> have discovered the following issue with the guest QXL driver: -> ->>> -> ->>> Run migration source: -> ->>>> EMULATOR=/path/to/emulator -> ->>>> ROOTFS=/path/to/image -> ->>>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>>> -> ->>>> $EMULATOR -enable-kvm \ -> ->>>>     -machine q35 \ -> ->>>>     -cpu host -smp 2 -m 2G \ -> ->>>>     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -> ->>>> ram0,share=on\ -> ->>>>     -machine memory-backend=ram0 \ -> ->>>>     -machine aux-ram-share=on \ -> ->>>>     -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>>     -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>     -nographic \ -> ->>>>     -device qxl-vga -> ->>> -> ->>> Run migration target: -> ->>>> EMULATOR=/path/to/emulator -> ->>>> ROOTFS=/path/to/image -> ->>>> 
QMPSOCK=/var/run/alma8qmp-dst.sock -> ->>>> $EMULATOR -enable-kvm \ -> ->>>>     -machine q35 \ -> ->>>>     -cpu host -smp 2 -m 2G \ -> ->>>>     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -> ->>>> ram0,share=on\ -> ->>>>     -machine memory-backend=ram0 \ -> ->>>>     -machine aux-ram-share=on \ -> ->>>>     -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>>     -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>     -nographic \ -> ->>>>     -device qxl-vga \ -> ->>>>     -incoming tcp:0:44444 \ -> ->>>>     -incoming '{"channel-type": "cpr", "addr": { "transport": -> ->>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -> ->>> -> ->>> -> ->>> Launch the migration: -> ->>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -> ->>>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>>> -> ->>>> $QMPSHELL -p $QMPSOCK <<EOF -> ->>>>     migrate-set-parameters mode=cpr-transfer -> ->>>>     migrate channels=[{"channel-type":"main","addr": -> ->>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}}, -> ->>>> {"channel-type":"cpr","addr": -> ->>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -> ->>>> dst.sock"}}] -> ->>>> EOF -> ->>> -> ->>> Then, after a while, QXL guest driver on target crashes spewing the -> ->>> following messages: -> ->>>> [  73.962002] [TTM] Buffer eviction failed -> ->>>> [  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -> ->>>> 0x00000001) -> ->>>> [  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -> ->>>> allocate VRAM BO -> ->>> -> ->>> That seems to be a known kernel QXL driver bug: -> ->>> -> ->>> -https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/ -> ->>> -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -> ->>> -> ->>> (the latter discussion contains that reproduce script which speeds up -> ->>> the crash in the guest): -> ->>>> #!/bin/bash -> ->>>> -> ->>>> chvt 3 -> ->>>> -> ->>>> for j in $(seq 80); do -> ->>>>    
     echo "$(date) starting round $j" -> ->>>>         if [ "$(journalctl --boot | grep "failed to allocate VRAM -> ->>>> BO")" != "" ]; then -> ->>>>                 echo "bug was reproduced after $j tries" -> ->>>>                 exit 1 -> ->>>>         fi -> ->>>>         for i in $(seq 100); do -> ->>>>                 dmesg > /dev/tty3 -> ->>>>         done -> ->>>> done -> ->>>> -> ->>>> echo "bug could not be reproduced" -> ->>>> exit 0 -> ->>> -> ->>> The bug itself seems to remain unfixed, as I was able to reproduce that -> ->>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -> ->>> cpr-transfer code also seems to be buggy as it triggers the crash - -> ->>> without the cpr-transfer migration the above reproduce doesn't lead to -> ->>> crash on the source VM. -> ->>> -> ->>> I suspect that, as cpr-transfer doesn't migrate the guest memory, but -> ->>> rather passes it through the memory backend object, our code might -> ->>> somehow corrupt the VRAM. However, I wasn't able to trace the -> ->>> corruption so far. -> ->>> -> ->>> Could somebody help the investigation and take a look into this? Any -> ->>> suggestions would be appreciated. Thanks! -> ->> -> ->> Possibly some memory region created by qxl is not being preserved. -> ->> Try adding these traces to see what is preserved: -> ->> -> ->> -trace enable='*cpr*' -> ->> -trace enable='*ram_alloc*' -> -> -> -> Also try adding this patch to see if it flags any ram blocks as not -> -> compatible with cpr. A message is printed at migration start time. 
-> ->  -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email- -> -> steven.sistare@oracle.com/ -> -> -> -> - Steve -> -> -> -> -With the traces enabled + the "migration: ram block cpr blockers" patch -> -applied: -> -> -Source: -> -> cpr_find_fd pc.bios, id 0 returns -1 -> -> cpr_save_fd pc.bios, id 0, fd 22 -> -> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -> -> 0x7fec18e00000 -> -> cpr_find_fd pc.rom, id 0 returns -1 -> -> cpr_save_fd pc.rom, id 0, fd 23 -> -> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -> -> 0x7fec18c00000 -> -> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -> -> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -> -> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd -> -> 24 host 0x7fec18a00000 -> -> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -> -> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -> -> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 -> -> fd 25 host 0x7feb77e00000 -> -> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -> -> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27 -> -> host 0x7fec18800000 -> -> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -> -> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 -> -> fd 28 host 0x7feb73c00000 -> -> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -> -> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34 -> -> host 0x7fec18600000 -> -> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -> -> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -> -> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd -> -> 35 host 0x7fec18200000 -> -> cpr_find_fd /rom@etc/table-loader, id 0 returns -1 -> -> cpr_save_fd 
/rom@etc/table-loader, id 0, fd 36 -> -> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36 -> -> host 0x7feb8b600000 -> -> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -> -> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -> -> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host -> -> 0x7feb8b400000 -> -> -> -> cpr_state_save cpr-transfer mode -> -> cpr_transfer_output /var/run/alma8cpr-dst.sock -> -> -Target: -> -> cpr_transfer_input /var/run/alma8cpr-dst.sock -> -> cpr_state_load cpr-transfer mode -> -> cpr_find_fd pc.bios, id 0 returns 20 -> -> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -> -> 0x7fcdc9800000 -> -> cpr_find_fd pc.rom, id 0 returns 19 -> -> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -> -> 0x7fcdc9600000 -> -> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -> -> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd -> -> 18 host 0x7fcdc9400000 -> -> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -> -> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 -> -> fd 17 host 0x7fcd27e00000 -> -> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16 -> -> host 0x7fcdc9200000 -> -> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 -> -> fd 15 host 0x7fcd23c00000 -> -> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -> -> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14 -> -> host 0x7fcdc8800000 -> -> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -> -> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd -> -> 13 host 0x7fcdc8400000 -> -> cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -> -> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11 -> -> host 0x7fcdc8200000 
-> -> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -> -> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host -> -> 0x7fcd3be00000 -> -> -Looks like both vga.vram and qxl.vram are being preserved (with the same -> -addresses), and no incompatible ram blocks are found during migration. -> -Sorry, addressed are not the same, of course. However corresponding ram -blocks do seem to be preserved and initialized. - -On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -On 2/28/25 8:20 PM, Steven Sistare wrote: -On 2/28/2025 1:13 PM, Steven Sistare wrote: -On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -Hi all, - -We've been experimenting with cpr-transfer migration mode recently and -have discovered the following issue with the guest QXL driver: - -Run migration source: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-src.sock - -$EMULATOR -enable-kvm \ -     -machine q35 \ -     -cpu host -smp 2 -m 2G \ -     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -     -machine memory-backend=ram0 \ -     -machine aux-ram-share=on \ -     -drive file=$ROOTFS,media=disk,if=virtio \ -     -qmp unix:$QMPSOCK,server=on,wait=off \ -     -nographic \ -     -device qxl-vga -Run migration target: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-dst.sock -$EMULATOR -enable-kvm \ -     -machine q35 \ -     -cpu host -smp 2 -m 2G \ -     -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -     -machine memory-backend=ram0 \ -     -machine aux-ram-share=on \ -     -drive file=$ROOTFS,media=disk,if=virtio \ -     -qmp unix:$QMPSOCK,server=on,wait=off \ -     -nographic \ -     -device qxl-vga \ -     -incoming tcp:0:44444 \ -     -incoming '{"channel-type": "cpr", "addr": { "transport": -"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: 
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -QMPSOCK=/var/run/alma8qmp-src.sock - -$QMPSHELL -p $QMPSOCK <<EOF -     migrate-set-parameters mode=cpr-transfer -     migrate channels=[{"channel-type":"main","addr": -{"transport":"socket","type":"inet","host":"0","port":"44444"}}, -{"channel-type":"cpr","addr": -{"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -dst.sock"}}] -EOF -Then, after a while, QXL guest driver on target crashes spewing the -following messages: -[  73.962002] [TTM] Buffer eviction failed -[  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -0x00000001) -[  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -allocate VRAM BO -That seems to be a known kernel QXL driver bug: -https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/ -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -(the latter discussion contains that reproduce script which speeds up -the crash in the guest): -#!/bin/bash - -chvt 3 - -for j in $(seq 80); do -         echo "$(date) starting round $j" -         if [ "$(journalctl --boot | grep "failed to allocate VRAM -BO")" != "" ]; then -                 echo "bug was reproduced after $j tries" -                 exit 1 -         fi -         for i in $(seq 100); do -                 dmesg > /dev/tty3 -         done -done - -echo "bug could not be reproduced" -exit 0 -The bug itself seems to remain unfixed, as I was able to reproduce that -with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -cpr-transfer code also seems to be buggy as it triggers the crash - -without the cpr-transfer migration the above reproduce doesn't lead to -crash on the source VM. - -I suspect that, as cpr-transfer doesn't migrate the guest memory, but -rather passes it through the memory backend object, our code might -somehow corrupt the VRAM. However, I wasn't able to trace the -corruption so far. - -Could somebody help the investigation and take a look into this? 
Any -suggestions would be appreciated. Thanks! -Possibly some memory region created by qxl is not being preserved. -Try adding these traces to see what is preserved: - --trace enable='*cpr*' --trace enable='*ram_alloc*' -Also try adding this patch to see if it flags any ram blocks as not -compatible with cpr. A message is printed at migration start time. -  -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email- -steven.sistare@oracle.com/ - -- Steve -With the traces enabled + the "migration: ram block cpr blockers" patch -applied: - -Source: -cpr_find_fd pc.bios, id 0 returns -1 -cpr_save_fd pc.bios, id 0, fd 22 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -0x7fec18e00000 -cpr_find_fd pc.rom, id 0 returns -1 -cpr_save_fd pc.rom, id 0, fd 23 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -0x7fec18c00000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 24 -host 0x7fec18a00000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd -25 host 0x7feb77e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27 host -0x7fec18800000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd -28 host 0x7feb73c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34 host -0x7fec18600000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 35 -host 0x7fec18200000 -cpr_find_fd /rom@etc/table-loader, id 0 returns -1 -cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36 host -0x7feb8b600000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host -0x7feb8b400000 - -cpr_state_save cpr-transfer mode -cpr_transfer_output /var/run/alma8cpr-dst.sock -Target: -cpr_transfer_input /var/run/alma8cpr-dst.sock -cpr_state_load cpr-transfer mode -cpr_find_fd pc.bios, id 0 returns 20 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -0x7fcdc9800000 -cpr_find_fd pc.rom, id 0 returns 19 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -0x7fcdc9600000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 18 -host 0x7fcdc9400000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd -17 host 0x7fcd27e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16 host -0x7fcdc9200000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd -15 host 0x7fcd23c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14 host -0x7fcdc8800000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 13 -host 0x7fcdc8400000 -cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11 host -0x7fcdc8200000 -cpr_find_fd 
/rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host
-0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course. However the corresponding ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
- qemu_ram_alloc_internal()
-   if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-       ram_flags |= RAM_READONLY;
-   new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-0001-hw-qxl-cpr-support-preliminary.patch
-Description:
-Text document
-
-On 3/4/25 9:05 PM, Steven Sistare wrote:
->
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
->
-> On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
->
->> On 2/28/25 8:20 PM, Steven Sistare wrote:
->
->>> On 2/28/2025 1:13 PM, Steven Sistare wrote:
->
->>>> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
->
->>>>> Hi all,
->
->>>>>
->
->>>>> We've been experimenting with cpr-transfer migration mode recently
->
->>>>> and
->
->>>>> have discovered the following issue with the guest QXL driver:
->
->>>>>
->
->>>>> Run migration source:
->
->>>>>> EMULATOR=/path/to/emulator
->
->>>>>> ROOTFS=/path/to/image
->
->>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>>>
->
->>>>>> $EMULATOR -enable-kvm \
->
->>>>>>      -machine q35 \
->
->>>>>>      -cpu host -smp 2 -m 2G \
->
->>>>>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>>>>> ram0,share=on\
->
->>>>>>      -machine memory-backend=ram0 \
->
->>>>>>      -machine aux-ram-share=on \
->
->>>>>>      -drive 
file=$ROOTFS,media=disk,if=virtio \ -> ->>>>>>      -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>>>      -nographic \ -> ->>>>>>      -device qxl-vga -> ->>>>> -> ->>>>> Run migration target: -> ->>>>>> EMULATOR=/path/to/emulator -> ->>>>>> ROOTFS=/path/to/image -> ->>>>>> QMPSOCK=/var/run/alma8qmp-dst.sock -> ->>>>>> $EMULATOR -enable-kvm \ -> ->>>>>>      -machine q35 \ -> ->>>>>>      -cpu host -smp 2 -m 2G \ -> ->>>>>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -> ->>>>>> ram0,share=on\ -> ->>>>>>      -machine memory-backend=ram0 \ -> ->>>>>>      -machine aux-ram-share=on \ -> ->>>>>>      -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>>>>      -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>>>      -nographic \ -> ->>>>>>      -device qxl-vga \ -> ->>>>>>      -incoming tcp:0:44444 \ -> ->>>>>>      -incoming '{"channel-type": "cpr", "addr": { "transport": -> ->>>>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -> ->>>>> -> ->>>>> -> ->>>>> Launch the migration: -> ->>>>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -> ->>>>>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>>>>> -> ->>>>>> $QMPSHELL -p $QMPSOCK <<EOF -> ->>>>>>      migrate-set-parameters mode=cpr-transfer -> ->>>>>>      migrate channels=[{"channel-type":"main","addr": -> ->>>>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}}, -> ->>>>>> {"channel-type":"cpr","addr": -> ->>>>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -> ->>>>>> dst.sock"}}] -> ->>>>>> EOF -> ->>>>> -> ->>>>> Then, after a while, QXL guest driver on target crashes spewing the -> ->>>>> following messages: -> ->>>>>> [  73.962002] [TTM] Buffer eviction failed -> ->>>>>> [  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -> ->>>>>> 0x00000001) -> ->>>>>> [  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -> ->>>>>> allocate VRAM BO -> ->>>>> -> ->>>>> That seems to be a known kernel QXL driver bug: -> 
->>>>> -> ->>>>> -https://lore.kernel.org/all/20220907094423.93581-1- -> ->>>>> min_halo@163.com/T/ -> ->>>>> -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -> ->>>>> -> ->>>>> (the latter discussion contains that reproduce script which speeds up -> ->>>>> the crash in the guest): -> ->>>>>> #!/bin/bash -> ->>>>>> -> ->>>>>> chvt 3 -> ->>>>>> -> ->>>>>> for j in $(seq 80); do -> ->>>>>>          echo "$(date) starting round $j" -> ->>>>>>          if [ "$(journalctl --boot | grep "failed to allocate VRAM -> ->>>>>> BO")" != "" ]; then -> ->>>>>>                  echo "bug was reproduced after $j tries" -> ->>>>>>                  exit 1 -> ->>>>>>          fi -> ->>>>>>          for i in $(seq 100); do -> ->>>>>>                  dmesg > /dev/tty3 -> ->>>>>>          done -> ->>>>>> done -> ->>>>>> -> ->>>>>> echo "bug could not be reproduced" -> ->>>>>> exit 0 -> ->>>>> -> ->>>>> The bug itself seems to remain unfixed, as I was able to reproduce -> ->>>>> that -> ->>>>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -> ->>>>> cpr-transfer code also seems to be buggy as it triggers the crash - -> ->>>>> without the cpr-transfer migration the above reproduce doesn't -> ->>>>> lead to -> ->>>>> crash on the source VM. -> ->>>>> -> ->>>>> I suspect that, as cpr-transfer doesn't migrate the guest memory, but -> ->>>>> rather passes it through the memory backend object, our code might -> ->>>>> somehow corrupt the VRAM. However, I wasn't able to trace the -> ->>>>> corruption so far. -> ->>>>> -> ->>>>> Could somebody help the investigation and take a look into this? Any -> ->>>>> suggestions would be appreciated. Thanks! -> ->>>> -> ->>>> Possibly some memory region created by qxl is not being preserved. 
-> ->>>> Try adding these traces to see what is preserved: -> ->>>> -> ->>>> -trace enable='*cpr*' -> ->>>> -trace enable='*ram_alloc*' -> ->>> -> ->>> Also try adding this patch to see if it flags any ram blocks as not -> ->>> compatible with cpr. A message is printed at migration start time. -> ->>>   -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send- -> ->>> email- -> ->>> steven.sistare@oracle.com/ -> ->>> -> ->>> - Steve -> ->>> -> ->> -> ->> With the traces enabled + the "migration: ram block cpr blockers" patch -> ->> applied: -> ->> -> ->> Source: -> ->>> cpr_find_fd pc.bios, id 0 returns -1 -> ->>> cpr_save_fd pc.bios, id 0, fd 22 -> ->>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -> ->>> 0x7fec18e00000 -> ->>> cpr_find_fd pc.rom, id 0 returns -1 -> ->>> cpr_save_fd pc.rom, id 0, fd 23 -> ->>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -> ->>> 0x7fec18c00000 -> ->>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -> ->>> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -> ->>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -> ->>> 262144 fd 24 host 0x7fec18a00000 -> ->>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -> ->>> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -> ->>> 67108864 fd 25 host 0x7feb77e00000 -> ->>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -> ->>> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -> ->>> fd 27 host 0x7fec18800000 -> ->>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -> ->>> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -> ->>> 67108864 fd 28 host 0x7feb73c00000 -> ->>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -> ->>> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -> ->>> 
qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -> ->>> fd 34 host 0x7fec18600000 -> ->>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -> ->>> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -> ->>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -> ->>> 2097152 fd 35 host 0x7fec18200000 -> ->>> cpr_find_fd /rom@etc/table-loader, id 0 returns -1 -> ->>> cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -> ->>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -> ->>> fd 36 host 0x7feb8b600000 -> ->>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -> ->>> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -> ->>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -> ->>> 37 host 0x7feb8b400000 -> ->>> -> ->>> cpr_state_save cpr-transfer mode -> ->>> cpr_transfer_output /var/run/alma8cpr-dst.sock -> ->> -> ->> Target: -> ->>> cpr_transfer_input /var/run/alma8cpr-dst.sock -> ->>> cpr_state_load cpr-transfer mode -> ->>> cpr_find_fd pc.bios, id 0 returns 20 -> ->>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -> ->>> 0x7fcdc9800000 -> ->>> cpr_find_fd pc.rom, id 0 returns 19 -> ->>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -> ->>> 0x7fcdc9600000 -> ->>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -> ->>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -> ->>> 262144 fd 18 host 0x7fcdc9400000 -> ->>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -> ->>> 67108864 fd 17 host 0x7fcd27e00000 -> ->>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -> ->>> fd 16 host 0x7fcdc9200000 -> ->>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -> ->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -> ->>> 67108864 fd 15 host 0x7fcd23c00000 -> ->>> cpr_find_fd 
0000:00:02.0/qxl.rom, id 0 returns 14
-> ->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-> ->>> fd 14 host 0x7fcdc8800000
-> ->>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-> ->>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-> ->>> 2097152 fd 13 host 0x7fcdc8400000
-> ->>> cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-> ->>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-> ->>> fd 11 host 0x7fcdc8200000
-> ->>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-> ->>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-> ->>> 10 host 0x7fcd3be00000
-> ->>
-> ->> Looks like both vga.vram and qxl.vram are being preserved (with the same
-> ->> addresses), and no incompatible ram blocks are found during migration.
-> ->
-> -> Sorry, addressed are not the same, of course. However corresponding ram
-> -> blocks do seem to be preserved and initialized.
->
-> -So far, I have not reproduced the guest driver failure.
->
-> -However, I have isolated places where new QEMU improperly writes to
-> -the qxl memory regions prior to starting the guest, by mmap'ing them
-> -readonly after cpr:
->
-> - qemu_ram_alloc_internal()
-> -   if (reused && (strstr(name, "qxl") || strstr("name", "vga")))
-> -       ram_flags |= RAM_READONLY;
-> -   new_block = qemu_ram_alloc_from_fd(...)
->
-> -I have attached a draft fix; try it and let me know.
-> -My console window looks fine before and after cpr, using
-> --vnc $hostip:0 -vga qxl
->
-> -- Steve
-Regarding the reproducer: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while. Could that
-happen on your setup as well? Could you try launching the VM with
-"-nographic -device qxl-vga"? That way the VM's serial console appears
-directly in the shell, so when the qxl driver crashes you're still able to
-inspect the kernel messages.
- -As for your patch, I can report that it doesn't resolve the issue as it -is. But I was able to track down another possible memory corruption -using your approach with readonly mmap'ing: - -> -Program terminated with signal SIGSEGV, Segmentation fault. -> -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -> -412 d->ram->magic = cpu_to_le32(QXL_RAM_MAGIC); -> -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -> -(gdb) bt -> -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -> -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -> -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -> -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -> -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -> -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -> -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -> -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -> -value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -> -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -> -v=0x5638996f3770, name=0x56389759b141 "realized", opaque=0x5638987893d0, -> -errp=0x7ffd3c2b84e0) -> -at ../qom/object.c:2374 -> -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -> -name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -> -at ../qom/object.c:1449 -> -#7 0x00005638970f8586 in object_property_set_qobject (obj=0x5638996e0e70, -> -name=0x56389759b141 "realized", value=0x5638996df900, errp=0x7ffd3c2b84e0) -> -at ../qom/qom-qobject.c:28 -> -#8 0x00005638970f3d8d in object_property_set_bool (obj=0x5638996e0e70, -> -name=0x56389759b141 "realized", value=true, errp=0x7ffd3c2b84e0) -> -at ../qom/object.c:1519 -> -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -> -bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -> -#10 0x0000563896dba675 in qdev_device_add_from_qdict (opts=0x5638996dfe50, -> -from_json=false, errp=0x7ffd3c2b84e0) at 
../system/qdev-monitor.c:714
-> -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
-> -errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
-> -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, opts=0x563898786150,
-> -errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:1207
-> -#13 0x000056389737a6cc in qemu_opts_foreach
-> -(list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-> -<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-> -at ../util/qemu-option.c:1135
-> -#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/vl.c:2745
-> -#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-> -<error_fatal>) at ../system/vl.c:2806
-> -#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) at
-> -../system/vl.c:3838
-> -#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at
-> -../system/main.c:72
-So the attached adjusted version of your patch does seem to help. At
-least I can't reproduce the crash on my setup.
-
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-again after the migration is done? That way we would segfault
-early on instead of debugging tricky memory corruptions.
- -Andrey -0001-hw-qxl-cpr-support-preliminary.patch -Description: -Text Data - -On 3/5/2025 11:50 AM, Andrey Drobyshev wrote: -On 3/4/25 9:05 PM, Steven Sistare wrote: -On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -On 2/28/25 8:20 PM, Steven Sistare wrote: -On 2/28/2025 1:13 PM, Steven Sistare wrote: -On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -Hi all, - -We've been experimenting with cpr-transfer migration mode recently -and -have discovered the following issue with the guest QXL driver: - -Run migration source: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-src.sock - -$EMULATOR -enable-kvm \ -      -machine q35 \ -      -cpu host -smp 2 -m 2G \ -      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -      -machine memory-backend=ram0 \ -      -machine aux-ram-share=on \ -      -drive file=$ROOTFS,media=disk,if=virtio \ -      -qmp unix:$QMPSOCK,server=on,wait=off \ -      -nographic \ -      -device qxl-vga -Run migration target: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-dst.sock -$EMULATOR -enable-kvm \ -      -machine q35 \ -      -cpu host -smp 2 -m 2G \ -      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -      -machine memory-backend=ram0 \ -      -machine aux-ram-share=on \ -      -drive file=$ROOTFS,media=disk,if=virtio \ -      -qmp unix:$QMPSOCK,server=on,wait=off \ -      -nographic \ -      -device qxl-vga \ -      -incoming tcp:0:44444 \ -      -incoming '{"channel-type": "cpr", "addr": { "transport": -"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: -QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -QMPSOCK=/var/run/alma8qmp-src.sock - -$QMPSHELL -p $QMPSOCK <<EOF -      migrate-set-parameters mode=cpr-transfer -      migrate channels=[{"channel-type":"main","addr": 
-{"transport":"socket","type":"inet","host":"0","port":"44444"}},
-{"channel-type":"cpr","addr":
-{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
-dst.sock"}}]
-EOF
-[...snip: quoted crash messages, reproducer script, and trace output,
-identical to the earlier messages in this thread...]
-Looks like both vga.vram and qxl.vram are being preserved (with the same -addresses), and no incompatible ram blocks are found during migration. -Sorry, addressed are not the same, of course. However corresponding ram -blocks do seem to be preserved and initialized. -So far, I have not reproduced the guest driver failure. - -However, I have isolated places where new QEMU improperly writes to -the qxl memory regions prior to starting the guest, by mmap'ing them -readonly after cpr: - -  qemu_ram_alloc_internal() -    if (reused && (strstr(name, "qxl") || strstr("name", "vga"))) -        ram_flags |= RAM_READONLY; -    new_block = qemu_ram_alloc_from_fd(...) - -I have attached a draft fix; try it and let me know. -My console window looks fine before and after cpr, using --vnc $hostip:0 -vga qxl - -- Steve -Regarding the reproduce: when I launch the buggy version with the same -options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer, -my VNC client silently hangs on the target after a while. Could it -happen on your stand as well? -cpr does not preserve the vnc connection and session. To test, I specify -port 0 for the source VM and port 1 for the dest. When the src vnc goes -dormant the dest vnc becomes active. -Could you try launching VM with -"-nographic -device qxl-vga"? That way VM's serial console is given you -directly in the shell, so when qxl driver crashes you're still able to -inspect the kernel messages. -I have been running like that, but have not reproduced the qxl driver crash, -and I suspect my guest image+kernel is too old. However, once I realized the -issue was post-cpr modification of qxl memory, I switched my attention to the -fix. -As for your patch, I can report that it doesn't resolve the issue as it -is. But I was able to track down another possible memory corruption -using your approach with readonly mmap'ing: -Program terminated with signal SIGSEGV, Segmentation fault. 
-#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -412 d->ram->magic = cpu_to_le32(QXL_RAM_MAGIC); -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -(gdb) bt -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, value=true, -errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, v=0x5638996f3770, -name=0x56389759b141 "realized", opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) - at ../qom/object.c:2374 -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, name=0x56389759b141 -"realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) - at ../qom/object.c:1449 -#7 0x00005638970f8586 in object_property_set_qobject (obj=0x5638996e0e70, -name=0x56389759b141 "realized", value=0x5638996df900, errp=0x7ffd3c2b84e0) - at ../qom/qom-qobject.c:28 -#8 0x00005638970f3d8d in object_property_set_bool (obj=0x5638996e0e70, -name=0x56389759b141 "realized", value=true, errp=0x7ffd3c2b84e0) - at ../qom/object.c:1519 -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, bus=0x563898cf3c20, -errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -#10 0x0000563896dba675 in qdev_device_add_from_qdict (opts=0x5638996dfe50, -from_json=false, errp=0x7ffd3c2b84e0) at ../system/qdev-monitor.c:714 -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, errp=0x56389855dc40 -<error_fatal>) at ../system/qdev-monitor.c:733 -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, opts=0x563898786150, -errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:1207 -#13 0x000056389737a6cc in qemu_opts_foreach 
- (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca <device_init_func>, -opaque=0x0, errp=0x56389855dc40 <error_fatal>) - at ../util/qemu-option.c:1135 -#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/vl.c:2745 -#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 -<error_fatal>) at ../system/vl.c:2806 -#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) at -../system/vl.c:3838 -#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at -../system/main.c:72 -So the attached adjusted version of your patch does seem to help. At -least I can't reproduce the crash on my stand. -Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram are -definitely harmful. Try V2 of the patch, attached, which skips the lines -of init_qxl_ram that modify guest memory. -I'm wondering, could it be useful to explicitly mark all the reused -memory regions readonly upon cpr-transfer, and then make them writable -back again after the migration is done? That way we will be segfaulting -early on instead of debugging tricky memory corruptions. -It's a useful debugging technique, but changing protection on a large memory -region -can be too expensive for production due to TLB shootdowns. - -Also, there are cases where writes are performed but the value is guaranteed to -be the same: - qxl_post_load() - qxl_set_mode() - d->rom->mode = cpu_to_le32(modenr); -The value is the same because mode and shadow_rom.mode were passed in vmstate -from old qemu. 
- -- Steve -0001-hw-qxl-cpr-support-preliminary-V2.patch -Description: -Text document - -On 3/5/25 22:19, Steven Sistare wrote: -On 3/5/2025 11:50 AM, Andrey Drobyshev wrote: -On 3/4/25 9:05 PM, Steven Sistare wrote: -On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -On 2/28/25 8:20 PM, Steven Sistare wrote: -On 2/28/2025 1:13 PM, Steven Sistare wrote: -On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -Hi all, - -We've been experimenting with cpr-transfer migration mode recently -and -have discovered the following issue with the guest QXL driver: - -Run migration source: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-src.sock - -$EMULATOR -enable-kvm \ -      -machine q35 \ -      -cpu host -smp 2 -m 2G \ -      -object -memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -      -machine memory-backend=ram0 \ -      -machine aux-ram-share=on \ -      -drive file=$ROOTFS,media=disk,if=virtio \ -      -qmp unix:$QMPSOCK,server=on,wait=off \ -      -nographic \ -      -device qxl-vga -Run migration target: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-dst.sock -$EMULATOR -enable-kvm \ -      -machine q35 \ -      -cpu host -smp 2 -m 2G \ -      -object -memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ -ram0,share=on\ -      -machine memory-backend=ram0 \ -      -machine aux-ram-share=on \ -      -drive file=$ROOTFS,media=disk,if=virtio \ -      -qmp unix:$QMPSOCK,server=on,wait=off \ -      -nographic \ -      -device qxl-vga \ -      -incoming tcp:0:44444 \ -      -incoming '{"channel-type": "cpr", "addr": { "transport": -"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: -QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -QMPSOCK=/var/run/alma8qmp-src.sock - -$QMPSHELL -p $QMPSOCK <<EOF -      migrate-set-parameters mode=cpr-transfer -      migrate channels=[{"channel-type":"main","addr": 
-{"transport":"socket","type":"inet","host":"0","port":"44444"}}, -{"channel-type":"cpr","addr": -{"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -dst.sock"}}] -EOF -Then, after a while, QXL guest driver on target crashes spewing -the -following messages: -[  73.962002] [TTM] Buffer eviction failed -[  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -0x00000001) -[  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* -failed to -allocate VRAM BO -That seems to be a known kernel QXL driver bug: -https://lore.kernel.org/all/20220907094423.93581-1- -min_halo@163.com/T/ -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -(the latter discussion contains that reproduce script which -speeds up -the crash in the guest): -#!/bin/bash - -chvt 3 - -for j in $(seq 80); do -          echo "$(date) starting round $j" -          if [ "$(journalctl --boot | grep "failed to -allocate VRAM -BO")" != "" ]; then -                  echo "bug was reproduced after $j tries" -                  exit 1 -          fi -          for i in $(seq 100); do -                  dmesg > /dev/tty3 -          done -done - -echo "bug could not be reproduced" -exit 0 -The bug itself seems to remain unfixed, as I was able to reproduce -that -with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -cpr-transfer code also seems to be buggy as it triggers the -crash - -without the cpr-transfer migration the above reproduce doesn't -lead to -crash on the source VM. -I suspect that, as cpr-transfer doesn't migrate the guest -memory, but -rather passes it through the memory backend object, our code might -somehow corrupt the VRAM. However, I wasn't able to trace the -corruption so far. -Could somebody help the investigation and take a look into -this? Any -suggestions would be appreciated. Thanks! -Possibly some memory region created by qxl is not being preserved. 
-Try adding these traces to see what is preserved: - --trace enable='*cpr*' --trace enable='*ram_alloc*' -Also try adding this patch to see if it flags any ram blocks as not -compatible with cpr. A message is printed at migration start time. -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send- -email- -steven.sistare@oracle.com/ - -- Steve -With the traces enabled + the "migration: ram block cpr blockers" -patch -applied: - -Source: -cpr_find_fd pc.bios, id 0 returns -1 -cpr_save_fd pc.bios, id 0, fd 22 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -0x7fec18e00000 -cpr_find_fd pc.rom, id 0 returns -1 -cpr_save_fd pc.rom, id 0, fd 23 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -0x7fec18c00000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 24 host 0x7fec18a00000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 25 host 0x7feb77e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 27 host 0x7fec18800000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 28 host 0x7feb73c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 34 host 0x7fec18600000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 35 host 0x7fec18200000 -cpr_find_fd 
/rom@etc/table-loader, id 0 returns -1 -cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 36 host 0x7feb8b600000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -37 host 0x7feb8b400000 - -cpr_state_save cpr-transfer mode -cpr_transfer_output /var/run/alma8cpr-dst.sock -Target: -cpr_transfer_input /var/run/alma8cpr-dst.sock -cpr_state_load cpr-transfer mode -cpr_find_fd pc.bios, id 0 returns 20 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -0x7fcdc9800000 -cpr_find_fd pc.rom, id 0 returns 19 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -0x7fcdc9600000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 18 host 0x7fcdc9400000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 17 host 0x7fcd27e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 16 host 0x7fcdc9200000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 15 host 0x7fcd23c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 14 host 0x7fcdc8800000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 13 host 0x7fcdc8400000 -cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 11 host 0x7fcdc8200000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -10 host 0x7fcd3be00000 
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, the addresses are not the same, of course. However, the corresponding
-ram blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated the places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-  qemu_ram_alloc_internal()
-    if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-        ram_flags |= RAM_READONLY;
-    new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproducer: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while. Does that
-happen on your setup as well?
-cpr does not preserve the vnc connection and session. To test, I specify
-port 0 for the source VM and port 1 for the dest. When the src vnc goes
-dormant, the dest vnc becomes active.
-Could you try launching the VM with
-"-nographic -device qxl-vga"? That way the VM's serial console is given to you
-directly in the shell, so when the qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver crash,
-and I suspect my guest image+kernel is too old. However, once I realized the
-issue was post-cpr modification of qxl memory, I switched my attention to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is. But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -412        d->ram->magic      = cpu_to_le32(QXL_RAM_MAGIC); -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -(gdb) bt -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -v=0x5638996f3770, name=0x56389759b141 "realized", -opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) -    at ../qom/object.c:2374 -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -    at ../qom/object.c:1449 -#7 0x00005638970f8586 in object_property_set_qobject -(obj=0x5638996e0e70, name=0x56389759b141 "realized", -value=0x5638996df900, errp=0x7ffd3c2b84e0) -    at ../qom/qom-qobject.c:28 -#8 0x00005638970f3d8d in object_property_set_bool -(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, -errp=0x7ffd3c2b84e0) -    at ../qom/object.c:1519 -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -#10 0x0000563896dba675 in qdev_device_add_from_qdict -(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at -../system/qdev-monitor.c:714 -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, -errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733 -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, -opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at -../system/vl.c:1207 -#13 
0x000056389737a6cc in qemu_opts_foreach
-    (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-    at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at
-../system/vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
-at ../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at
-../system/main.c:72
-So the attached adjusted version of your patch does seem to help. At
-least I can't reproduce the crash on my setup.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram are
-definitely harmful. Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-again after the migration is done? That way we would segfault
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large
-memory region can be too expensive for production due to TLB shootdowns.
-Good point. Though we could move this code under a non-default option to
-avoid re-writing.
- -Den - -On 3/5/25 11:19 PM, Steven Sistare wrote: -> -On 3/5/2025 11:50 AM, Andrey Drobyshev wrote: -> -> On 3/4/25 9:05 PM, Steven Sistare wrote: -> ->> On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -> ->>> On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -> ->>>> On 2/28/25 8:20 PM, Steven Sistare wrote: -> ->>>>> On 2/28/2025 1:13 PM, Steven Sistare wrote: -> ->>>>>> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -> ->>>>>>> Hi all, -> ->>>>>>> -> ->>>>>>> We've been experimenting with cpr-transfer migration mode recently -> ->>>>>>> and -> ->>>>>>> have discovered the following issue with the guest QXL driver: -> ->>>>>>> -> ->>>>>>> Run migration source: -> ->>>>>>>> EMULATOR=/path/to/emulator -> ->>>>>>>> ROOTFS=/path/to/image -> ->>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>>>>>>> -> ->>>>>>>> $EMULATOR -enable-kvm \ -> ->>>>>>>>       -machine q35 \ -> ->>>>>>>>       -cpu host -smp 2 -m 2G \ -> ->>>>>>>>       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -> ->>>>>>>> dev/shm/ -> ->>>>>>>> ram0,share=on\ -> ->>>>>>>>       -machine memory-backend=ram0 \ -> ->>>>>>>>       -machine aux-ram-share=on \ -> ->>>>>>>>       -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>>>>>>       -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>>>>>       -nographic \ -> ->>>>>>>>       -device qxl-vga -> ->>>>>>> -> ->>>>>>> Run migration target: -> ->>>>>>>> EMULATOR=/path/to/emulator -> ->>>>>>>> ROOTFS=/path/to/image -> ->>>>>>>> QMPSOCK=/var/run/alma8qmp-dst.sock -> ->>>>>>>> $EMULATOR -enable-kvm \ -> ->>>>>>>>       -machine q35 \ -> ->>>>>>>>       -cpu host -smp 2 -m 2G \ -> ->>>>>>>>       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -> ->>>>>>>> dev/shm/ -> ->>>>>>>> ram0,share=on\ -> ->>>>>>>>       -machine memory-backend=ram0 \ -> ->>>>>>>>       -machine aux-ram-share=on \ -> ->>>>>>>>       -drive file=$ROOTFS,media=disk,if=virtio \ -> ->>>>>>>>       -qmp unix:$QMPSOCK,server=on,wait=off \ -> ->>>>>>>>       -nographic \ -> ->>>>>>>>     
  -device qxl-vga \ -> ->>>>>>>>       -incoming tcp:0:44444 \ -> ->>>>>>>>       -incoming '{"channel-type": "cpr", "addr": { "transport": -> ->>>>>>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -> ->>>>>>> -> ->>>>>>> -> ->>>>>>> Launch the migration: -> ->>>>>>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -> ->>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock -> ->>>>>>>> -> ->>>>>>>> $QMPSHELL -p $QMPSOCK <<EOF -> ->>>>>>>>       migrate-set-parameters mode=cpr-transfer -> ->>>>>>>>       migrate channels=[{"channel-type":"main","addr": -> ->>>>>>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}}, -> ->>>>>>>> {"channel-type":"cpr","addr": -> ->>>>>>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -> ->>>>>>>> dst.sock"}}] -> ->>>>>>>> EOF -> ->>>>>>> -> ->>>>>>> Then, after a while, QXL guest driver on target crashes spewing the -> ->>>>>>> following messages: -> ->>>>>>>> [  73.962002] [TTM] Buffer eviction failed -> ->>>>>>>> [  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -> ->>>>>>>> 0x00000001) -> ->>>>>>>> [  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -> ->>>>>>>> allocate VRAM BO -> ->>>>>>> -> ->>>>>>> That seems to be a known kernel QXL driver bug: -> ->>>>>>> -> ->>>>>>> -https://lore.kernel.org/all/20220907094423.93581-1- -> ->>>>>>> min_halo@163.com/T/ -> ->>>>>>> -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -> ->>>>>>> -> ->>>>>>> (the latter discussion contains that reproduce script which -> ->>>>>>> speeds up -> ->>>>>>> the crash in the guest): -> ->>>>>>>> #!/bin/bash -> ->>>>>>>> -> ->>>>>>>> chvt 3 -> ->>>>>>>> -> ->>>>>>>> for j in $(seq 80); do -> ->>>>>>>>           echo "$(date) starting round $j" -> ->>>>>>>>           if [ "$(journalctl --boot | grep "failed to allocate -> ->>>>>>>> VRAM -> ->>>>>>>> BO")" != "" ]; then -> ->>>>>>>>                   echo "bug was reproduced after $j tries" -> ->>>>>>>>                   exit 
1 -> ->>>>>>>>           fi -> ->>>>>>>>           for i in $(seq 100); do -> ->>>>>>>>                   dmesg > /dev/tty3 -> ->>>>>>>>           done -> ->>>>>>>> done -> ->>>>>>>> -> ->>>>>>>> echo "bug could not be reproduced" -> ->>>>>>>> exit 0 -> ->>>>>>> -> ->>>>>>> The bug itself seems to remain unfixed, as I was able to reproduce -> ->>>>>>> that -> ->>>>>>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -> ->>>>>>> cpr-transfer code also seems to be buggy as it triggers the crash - -> ->>>>>>> without the cpr-transfer migration the above reproduce doesn't -> ->>>>>>> lead to -> ->>>>>>> crash on the source VM. -> ->>>>>>> -> ->>>>>>> I suspect that, as cpr-transfer doesn't migrate the guest -> ->>>>>>> memory, but -> ->>>>>>> rather passes it through the memory backend object, our code might -> ->>>>>>> somehow corrupt the VRAM. However, I wasn't able to trace the -> ->>>>>>> corruption so far. -> ->>>>>>> -> ->>>>>>> Could somebody help the investigation and take a look into -> ->>>>>>> this? Any -> ->>>>>>> suggestions would be appreciated. Thanks! -> ->>>>>> -> ->>>>>> Possibly some memory region created by qxl is not being preserved. -> ->>>>>> Try adding these traces to see what is preserved: -> ->>>>>> -> ->>>>>> -trace enable='*cpr*' -> ->>>>>> -trace enable='*ram_alloc*' -> ->>>>> -> ->>>>> Also try adding this patch to see if it flags any ram blocks as not -> ->>>>> compatible with cpr. A message is printed at migration start time. 
-> ->>>>>    -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send- -> ->>>>> email- -> ->>>>> steven.sistare@oracle.com/ -> ->>>>> -> ->>>>> - Steve -> ->>>>> -> ->>>> -> ->>>> With the traces enabled + the "migration: ram block cpr blockers" -> ->>>> patch -> ->>>> applied: -> ->>>> -> ->>>> Source: -> ->>>>> cpr_find_fd pc.bios, id 0 returns -1 -> ->>>>> cpr_save_fd pc.bios, id 0, fd 22 -> ->>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -> ->>>>> 0x7fec18e00000 -> ->>>>> cpr_find_fd pc.rom, id 0 returns -1 -> ->>>>> cpr_save_fd pc.rom, id 0, fd 23 -> ->>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -> ->>>>> 0x7fec18c00000 -> ->>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -> ->>>>> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -> ->>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -> ->>>>> 262144 fd 24 host 0x7fec18a00000 -> ->>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -> ->>>>> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -> ->>>>> 67108864 fd 25 host 0x7feb77e00000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -> ->>>>> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -> ->>>>> fd 27 host 0x7fec18800000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -> ->>>>> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -> ->>>>> 67108864 fd 28 host 0x7feb73c00000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -> ->>>>> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -> ->>>>> fd 34 host 0x7fec18600000 -> ->>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -> ->>>>> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 
-> ->>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -> ->>>>> 2097152 fd 35 host 0x7fec18200000 -> ->>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns -1 -> ->>>>> cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -> ->>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -> ->>>>> fd 36 host 0x7feb8b600000 -> ->>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -> ->>>>> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -> ->>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -> ->>>>> 37 host 0x7feb8b400000 -> ->>>>> -> ->>>>> cpr_state_save cpr-transfer mode -> ->>>>> cpr_transfer_output /var/run/alma8cpr-dst.sock -> ->>>> -> ->>>> Target: -> ->>>>> cpr_transfer_input /var/run/alma8cpr-dst.sock -> ->>>>> cpr_state_load cpr-transfer mode -> ->>>>> cpr_find_fd pc.bios, id 0 returns 20 -> ->>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -> ->>>>> 0x7fcdc9800000 -> ->>>>> cpr_find_fd pc.rom, id 0 returns 19 -> ->>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -> ->>>>> 0x7fcdc9600000 -> ->>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -> ->>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -> ->>>>> 262144 fd 18 host 0x7fcdc9400000 -> ->>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -> ->>>>> 67108864 fd 17 host 0x7fcd27e00000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -> ->>>>> fd 16 host 0x7fcdc9200000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -> ->>>>> 67108864 fd 15 host 0x7fcd23c00000 -> ->>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -> ->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -> ->>>>> fd 14 host 
0x7fcdc8800000 -> ->>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -> ->>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -> ->>>>> 2097152 fd 13 host 0x7fcdc8400000 -> ->>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -> ->>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -> ->>>>> fd 11 host 0x7fcdc8200000 -> ->>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -> ->>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -> ->>>>> 10 host 0x7fcd3be00000 -> ->>>> -> ->>>> Looks like both vga.vram and qxl.vram are being preserved (with the -> ->>>> same -> ->>>> addresses), and no incompatible ram blocks are found during migration. -> ->>> -> ->>> Sorry, addressed are not the same, of course. However corresponding -> ->>> ram -> ->>> blocks do seem to be preserved and initialized. -> ->> -> ->> So far, I have not reproduced the guest driver failure. -> ->> -> ->> However, I have isolated places where new QEMU improperly writes to -> ->> the qxl memory regions prior to starting the guest, by mmap'ing them -> ->> readonly after cpr: -> ->> -> ->>   qemu_ram_alloc_internal() -> ->>     if (reused && (strstr(name, "qxl") || strstr("name", "vga"))) -> ->>         ram_flags |= RAM_READONLY; -> ->>     new_block = qemu_ram_alloc_from_fd(...) -> ->> -> ->> I have attached a draft fix; try it and let me know. -> ->> My console window looks fine before and after cpr, using -> ->> -vnc $hostip:0 -vga qxl -> ->> -> ->> - Steve -> -> -> -> Regarding the reproduce: when I launch the buggy version with the same -> -> options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer, -> -> my VNC client silently hangs on the target after a while. Could it -> -> happen on your stand as well? -> -> -cpr does not preserve the vnc connection and session. To test, I specify -> -port 0 for the source VM and port 1 for the dest. When the src vnc goes -> -dormant the dest vnc becomes active. 
-> -Sure, I meant that VNC on the dest (on the port 1) works for a while -after the migration and then hangs, apparently after the guest QXL crash. - -> -> Could you try launching VM with -> -> "-nographic -device qxl-vga"? That way VM's serial console is given you -> -> directly in the shell, so when qxl driver crashes you're still able to -> -> inspect the kernel messages. -> -> -I have been running like that, but have not reproduced the qxl driver -> -crash, -> -and I suspect my guest image+kernel is too old. -Yes, that's probably the case. But the crash occurs on my Fedora 41 -guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to -be buggy. - - -> -However, once I realized the -> -issue was post-cpr modification of qxl memory, I switched my attention -> -to the -> -fix. -> -> -> As for your patch, I can report that it doesn't resolve the issue as it -> -> is. But I was able to track down another possible memory corruption -> -> using your approach with readonly mmap'ing: -> -> -> ->> Program terminated with signal SIGSEGV, Segmentation fault. 
-> ->> #0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -> ->> 412        d->ram->magic      = cpu_to_le32(QXL_RAM_MAGIC); -> ->> [Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -> ->> (gdb) bt -> ->> #0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -> ->> #1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -> ->> errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -> ->> #2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -> ->> errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -> ->> #3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -> ->> errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -> ->> #4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -> ->> value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -> ->> #5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -> ->> v=0x5638996f3770, name=0x56389759b141 "realized", -> ->> opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) -> ->>     at ../qom/object.c:2374 -> ->> #6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -> ->> name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -> ->>     at ../qom/object.c:1449 -> ->> #7 0x00005638970f8586 in object_property_set_qobject -> ->> (obj=0x5638996e0e70, name=0x56389759b141 "realized", -> ->> value=0x5638996df900, errp=0x7ffd3c2b84e0) -> ->>     at ../qom/qom-qobject.c:28 -> ->> #8 0x00005638970f3d8d in object_property_set_bool -> ->> (obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, -> ->> errp=0x7ffd3c2b84e0) -> ->>     at ../qom/object.c:1519 -> ->> #9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -> ->> bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -> ->> #10 0x0000563896dba675 in qdev_device_add_from_qdict -> ->> (opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../ -> ->> system/qdev-monitor.c:714 -> ->> #11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, -> ->> 
errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733 -> ->> #12 0x0000563896dc48f1 in device_init_func (opaque=0x0, -> ->> opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/ -> ->> vl.c:1207 -> ->> #13 0x000056389737a6cc in qemu_opts_foreach -> ->>     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca -> ->> <device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>) -> ->>     at ../util/qemu-option.c:1135 -> ->> #14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/ -> ->> vl.c:2745 -> ->> #15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 -> ->> <error_fatal>) at ../system/vl.c:2806 -> ->> #16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) -> ->> at ../system/vl.c:3838 -> ->> #17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../ -> ->> system/main.c:72 -> -> -> -> So the attached adjusted version of your patch does seem to help. At -> -> least I can't reproduce the crash on my stand. -> -> -Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram -> -are -> -definitely harmful. Try V2 of the patch, attached, which skips the lines -> -of init_qxl_ram that modify guest memory. -> -Thanks, your v2 patch does seem to prevent the crash. Would you re-send -it to the list as a proper fix? - -> -> I'm wondering, could it be useful to explicitly mark all the reused -> -> memory regions readonly upon cpr-transfer, and then make them writable -> -> back again after the migration is done? That way we will be segfaulting -> -> early on instead of debugging tricky memory corruptions. -> -> -It's a useful debugging technique, but changing protection on a large -> -memory region -> -can be too expensive for production due to TLB shootdowns. 
-> -> -Also, there are cases where writes are performed but the value is -> -guaranteed to -> -be the same: -> - qxl_post_load() -> -   qxl_set_mode() -> -     d->rom->mode = cpu_to_le32(modenr); -> -The value is the same because mode and shadow_rom.mode were passed in -> -vmstate -> -from old qemu. -> -There're also cases where devices' ROM might be re-initialized. E.g. -this segfault occures upon further exploration of RO mapped RAM blocks: - -> -Program terminated with signal SIGSEGV, Segmentation fault. -> -#0 __memmove_avx_unaligned_erms () at -> -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -> -664 rep movsb -> -[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))] -> -(gdb) bt -> -#0 __memmove_avx_unaligned_erms () at -> -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -> -#1 0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, -> -owner=0x55aa2019ac10, name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true) -> -at ../hw/core/loader.c:1032 -> -#2 0x000055aa1d031577 in rom_add_blob -> -(name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, -> -max_len=2097152, addr=18446744073709551615, fw_file_name=0x55aa1da51f13 -> -"etc/acpi/tables", fw_callback=0x55aa1d441f59 <acpi_build_update>, -> -callback_opaque=0x55aa20ff0010, as=0x0, read_only=true) at -> -../hw/core/loader.c:1147 -> -#3 0x000055aa1cfd788d in acpi_add_rom_blob -> -(update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, -> -blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at -> -../hw/acpi/utils.c:46 -> -#4 0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720 -> -#5 0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) -> -at ../hw/i386/pc.c:638 -> -#6 0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 -> -<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39 -> -#7 0x000055aa1d039ee5 in qdev_machine_creation_done () at -> -../hw/core/machine.c:1749 -> -#8 
0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40
-> -<error_fatal>) at ../system/vl.c:2779
-> -#9 0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40
-> -<error_fatal>) at ../system/vl.c:2807
-> -#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at
-> -../system/vl.c:3838
-> -#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at
-> -../system/main.c:72
-I'm not sure whether the ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where a ROM is re-read from the file
-system upon initialization. That is undesirable, as the guest kernel
-certainly won't be too happy about a sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory-related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make the ram blocks RO upon cpr
-(and remap them as RW later, after migration is done, if needed), and
-catch any unwanted memory violations. As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or a
-"migrate" command flag specific to cpr-transfer), which would only be
-used in testing.
-
-Andrey
-
-On 3/6/25 16:16, Andrey Drobyshev wrote:
-...snip...
-Looks like both vga.vram and qxl.vram are being preserved (with the -same -addresses), and no incompatible ram blocks are found during migration. -Sorry, addresses are not the same, of course. However, corresponding -ram -blocks do seem to be preserved and initialized. -So far, I have not reproduced the guest driver failure. - -However, I have isolated places where new QEMU improperly writes to -the qxl memory regions prior to starting the guest, by mmap'ing them -readonly after cpr: - -   qemu_ram_alloc_internal() -     if (reused && (strstr(name, "qxl") || strstr(name, "vga"))) -         ram_flags |= RAM_READONLY; -     new_block = qemu_ram_alloc_from_fd(...) - -I have attached a draft fix; try it and let me know. -My console window looks fine before and after cpr, using --vnc $hostip:0 -vga qxl - -- Steve -Regarding the reproducer: when I launch the buggy version with the same -options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer, -my VNC client silently hangs on the target after a while. Could it -happen on your setup as well? -cpr does not preserve the vnc connection and session. To test, I specify -port 0 for the source VM and port 1 for the dest. When the src vnc goes -dormant the dest vnc becomes active. -Sure, I meant that VNC on the dest (on port 1) works for a while -after the migration and then hangs, apparently after the guest QXL crash. -Could you try launching the VM with -"-nographic -device qxl-vga"? That way the VM's serial console is given to you -directly in the shell, so when the qxl driver crashes you're still able to -inspect the kernel messages. -I have been running like that, but have not reproduced the qxl driver -crash, -and I suspect my guest image+kernel is too old. -Yes, that's probably the case. But the crash occurs on my Fedora 41 -guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to -be buggy.
-However, once I realized the -issue was post-cpr modification of qxl memory, I switched my attention -to the -fix. -As for your patch, I can report that it doesn't resolve the issue as it -is. But I was able to track down another possible memory corruption -using your approach with readonly mmap'ing: -Program terminated with signal SIGSEGV, Segmentation fault. -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -412        d->ram->magic      = cpu_to_le32(QXL_RAM_MAGIC); -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -(gdb) bt -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -v=0x5638996f3770, name=0x56389759b141 "realized", -opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:2374 -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1449 -#7 0x00005638970f8586 in object_property_set_qobject -(obj=0x5638996e0e70, name=0x56389759b141 "realized", -value=0x5638996df900, errp=0x7ffd3c2b84e0) -     at ../qom/qom-qobject.c:28 -#8 0x00005638970f3d8d in object_property_set_bool -(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, -errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1519 -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -#10 0x0000563896dba675 in qdev_device_add_from_qdict 
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../ -system/qdev-monitor.c:714 -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, -errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733 -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, -opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/ -vl.c:1207 -#13 0x000056389737a6cc in qemu_opts_foreach -     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca -<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>) -     at ../util/qemu-option.c:1135 -#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/ -vl.c:2745 -#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 -<error_fatal>) at ../system/vl.c:2806 -#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) -at ../system/vl.c:3838 -#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../ -system/main.c:72 -So the attached adjusted version of your patch does seem to help. At -least I can't reproduce the crash on my setup. -Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram -are -definitely harmful. Try V2 of the patch, attached, which skips the lines -of init_qxl_ram that modify guest memory. -Thanks, your v2 patch does seem to prevent the crash. Would you re-send -it to the list as a proper fix? -I'm wondering, could it be useful to explicitly mark all the reused -memory regions readonly upon cpr-transfer, and then make them writable -back again after the migration is done? That way we will be segfaulting -early on instead of debugging tricky memory corruptions. -It's a useful debugging technique, but changing protection on a large -memory region -can be too expensive for production due to TLB shootdowns.
- -Also, there are cases where writes are performed but the value is -guaranteed to -be the same: -  qxl_post_load() -    qxl_set_mode() -      d->rom->mode = cpu_to_le32(modenr); -The value is the same because mode and shadow_rom.mode were passed in -vmstate -from old qemu. -There are also cases where a device's ROM might be re-initialized. E.g. -this segfault occurs upon further exploration of RO-mapped RAM blocks: -Program terminated with signal SIGSEGV, Segmentation fault. -#0 __memmove_avx_unaligned_erms () at -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -664 rep movsb -[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))] -(gdb) bt -#0 __memmove_avx_unaligned_erms () at -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -#1 0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, -name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true) - at ../hw/core/loader.c:1032 -#2 0x000055aa1d031577 in rom_add_blob - (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152, -addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables", -fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0, -read_only=true) at ../hw/core/loader.c:1147 -#3 0x000055aa1cfd788d in acpi_add_rom_blob - (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, -blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46 -#4 0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720 -#5 0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) -at ../hw/i386/pc.c:638 -#6 0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 -<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39 -#7 0x000055aa1d039ee5 in qdev_machine_creation_done () at -../hw/core/machine.c:1749 -#8 0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 -<error_fatal>) at ../system/vl.c:2779 -#9 0x000055aa1d2c7c7d
in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 -<error_fatal>) at ../system/vl.c:2807 -#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at -../system/vl.c:3838 -#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at -../system/main.c:72 -I'm not sure whether the ACPI tables ROM in particular is rewritten with the -same content, but there might be cases where a ROM is read from the file -system upon initialization. That is undesirable as the guest kernel -certainly won't be too happy about a sudden change of the device's ROM -content. - -So the issue we're dealing with here is any unwanted memory-related -device initialization upon cpr. - -For now the only thing that comes to my mind is to make a test where we -put as many devices as we can into a VM, make ram blocks RO upon cpr -(and remap them as RW later after migration is done, if needed), and -catch any unwanted memory violations. As Den suggested, we might -consider adding that behaviour as a separate non-default option (or -a "migrate" command flag specific to cpr-transfer), which would only be -used in testing. - -Andrey -No way. ACPI with the source must be used in the same way as BIOSes -and optional ROMs. - -Den - -On 3/6/2025 10:52 AM, Denis V.
Lunev wrote: -On 3/6/25 16:16, Andrey Drobyshev wrote: -On 3/5/25 11:19 PM, Steven Sistare wrote: -On 3/5/2025 11:50 AM, Andrey Drobyshev wrote: -On 3/4/25 9:05 PM, Steven Sistare wrote: -On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -On 2/28/25 8:20 PM, Steven Sistare wrote: -On 2/28/2025 1:13 PM, Steven Sistare wrote: -On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -Hi all, - -We've been experimenting with cpr-transfer migration mode recently -and -have discovered the following issue with the guest QXL driver: - -Run migration source: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-src.sock - -$EMULATOR -enable-kvm \ -       -machine q35 \ -       -cpu host -smp 2 -m 2G \ -       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -dev/shm/ -ram0,share=on\ -       -machine memory-backend=ram0 \ -       -machine aux-ram-share=on \ -       -drive file=$ROOTFS,media=disk,if=virtio \ -       -qmp unix:$QMPSOCK,server=on,wait=off \ -       -nographic \ -       -device qxl-vga -Run migration target: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-dst.sock -$EMULATOR -enable-kvm \ -       -machine q35 \ -       -cpu host -smp 2 -m 2G \ -       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -dev/shm/ -ram0,share=on\ -       -machine memory-backend=ram0 \ -       -machine aux-ram-share=on \ -       -drive file=$ROOTFS,media=disk,if=virtio \ -       -qmp unix:$QMPSOCK,server=on,wait=off \ -       -nographic \ -       -device qxl-vga \ -       -incoming tcp:0:44444 \ -       -incoming '{"channel-type": "cpr", "addr": { "transport": -"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: -QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -QMPSOCK=/var/run/alma8qmp-src.sock - -$QMPSHELL -p $QMPSOCK <<EOF -       migrate-set-parameters mode=cpr-transfer -       migrate channels=[{"channel-type":"main","addr": 
-{"transport":"socket","type":"inet","host":"0","port":"44444"}}, -{"channel-type":"cpr","addr": -{"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -dst.sock"}}] -EOF -Then, after a while, QXL guest driver on target crashes spewing the -following messages: -[  73.962002] [TTM] Buffer eviction failed -[  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -0x00000001) -[  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -allocate VRAM BO -That seems to be a known kernel QXL driver bug: -https://lore.kernel.org/all/20220907094423.93581-1- -min_halo@163.com/T/ -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -(the latter discussion contains that reproduce script which -speeds up -the crash in the guest): -#!/bin/bash - -chvt 3 - -for j in $(seq 80); do -           echo "$(date) starting round $j" -           if [ "$(journalctl --boot | grep "failed to allocate -VRAM -BO")" != "" ]; then -                   echo "bug was reproduced after $j tries" -                   exit 1 -           fi -           for i in $(seq 100); do -                   dmesg > /dev/tty3 -           done -done - -echo "bug could not be reproduced" -exit 0 -The bug itself seems to remain unfixed, as I was able to reproduce -that -with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -cpr-transfer code also seems to be buggy as it triggers the crash - -without the cpr-transfer migration the above reproduce doesn't -lead to -crash on the source VM. - -I suspect that, as cpr-transfer doesn't migrate the guest -memory, but -rather passes it through the memory backend object, our code might -somehow corrupt the VRAM. However, I wasn't able to trace the -corruption so far. - -Could somebody help the investigation and take a look into -this? Any -suggestions would be appreciated. Thanks! -Possibly some memory region created by qxl is not being preserved. 
-Try adding these traces to see what is preserved: - --trace enable='*cpr*' --trace enable='*ram_alloc*' -Also try adding this patch to see if it flags any ram blocks as not -compatible with cpr. A message is printed at migration start time. -    -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send- -email- -steven.sistare@oracle.com/ - -- Steve -With the traces enabled + the "migration: ram block cpr blockers" -patch -applied: - -Source: -cpr_find_fd pc.bios, id 0 returns -1 -cpr_save_fd pc.bios, id 0, fd 22 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -0x7fec18e00000 -cpr_find_fd pc.rom, id 0 returns -1 -cpr_save_fd pc.rom, id 0, fd 23 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -0x7fec18c00000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 24 host 0x7fec18a00000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 25 host 0x7feb77e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 27 host 0x7fec18800000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 28 host 0x7feb73c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 34 host 0x7fec18600000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 35 host 0x7fec18200000 -cpr_find_fd 
/rom@etc/table-loader, id 0 returns -1 -cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 36 host 0x7feb8b600000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -37 host 0x7feb8b400000 - -cpr_state_save cpr-transfer mode -cpr_transfer_output /var/run/alma8cpr-dst.sock -Target: -cpr_transfer_input /var/run/alma8cpr-dst.sock -cpr_state_load cpr-transfer mode -cpr_find_fd pc.bios, id 0 returns 20 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -0x7fcdc9800000 -cpr_find_fd pc.rom, id 0 returns 19 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -0x7fcdc9600000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 18 host 0x7fcdc9400000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 17 host 0x7fcd27e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 16 host 0x7fcdc9200000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 15 host 0x7fcd23c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 14 host 0x7fcdc8800000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 13 host 0x7fcdc8400000 -cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 11 host 0x7fcdc8200000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -10 host 0x7fcd3be00000 
-Looks like both vga.vram and qxl.vram are being preserved (with the -same -addresses), and no incompatible ram blocks are found during migration. -Sorry, addresses are not the same, of course. However, corresponding -ram -blocks do seem to be preserved and initialized. -So far, I have not reproduced the guest driver failure. - -However, I have isolated places where new QEMU improperly writes to -the qxl memory regions prior to starting the guest, by mmap'ing them -readonly after cpr: - -   qemu_ram_alloc_internal() -     if (reused && (strstr(name, "qxl") || strstr(name, "vga"))) -         ram_flags |= RAM_READONLY; -     new_block = qemu_ram_alloc_from_fd(...) - -I have attached a draft fix; try it and let me know. -My console window looks fine before and after cpr, using --vnc $hostip:0 -vga qxl - -- Steve -Regarding the reproducer: when I launch the buggy version with the same -options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer, -my VNC client silently hangs on the target after a while. Could it -happen on your setup as well? -cpr does not preserve the vnc connection and session. To test, I specify -port 0 for the source VM and port 1 for the dest. When the src vnc goes -dormant the dest vnc becomes active. -Sure, I meant that VNC on the dest (on port 1) works for a while -after the migration and then hangs, apparently after the guest QXL crash. -Could you try launching the VM with -"-nographic -device qxl-vga"? That way the VM's serial console is given to you -directly in the shell, so when the qxl driver crashes you're still able to -inspect the kernel messages. -I have been running like that, but have not reproduced the qxl driver -crash, -and I suspect my guest image+kernel is too old. -Yes, that's probably the case. But the crash occurs on my Fedora 41 -guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to -be buggy.
-However, once I realized the -issue was post-cpr modification of qxl memory, I switched my attention -to the -fix. -As for your patch, I can report that it doesn't resolve the issue as it -is. But I was able to track down another possible memory corruption -using your approach with readonly mmap'ing: -Program terminated with signal SIGSEGV, Segmentation fault. -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -412        d->ram->magic      = cpu_to_le32(QXL_RAM_MAGIC); -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -(gdb) bt -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -v=0x5638996f3770, name=0x56389759b141 "realized", -opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:2374 -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1449 -#7 0x00005638970f8586 in object_property_set_qobject -(obj=0x5638996e0e70, name=0x56389759b141 "realized", -value=0x5638996df900, errp=0x7ffd3c2b84e0) -     at ../qom/qom-qobject.c:28 -#8 0x00005638970f3d8d in object_property_set_bool -(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, -errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1519 -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -#10 0x0000563896dba675 in qdev_device_add_from_qdict 
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../ -system/qdev-monitor.c:714 -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, -errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733 -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, -opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/ -vl.c:1207 -#13 0x000056389737a6cc in qemu_opts_foreach -     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca -<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>) -     at ../util/qemu-option.c:1135 -#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/ -vl.c:2745 -#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 -<error_fatal>) at ../system/vl.c:2806 -#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) -at ../system/vl.c:3838 -#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../ -system/main.c:72 -So the attached adjusted version of your patch does seem to help. At -least I can't reproduce the crash on my stand. -Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram -are -definitely harmful. Try V2 of the patch, attached, which skips the lines -of init_qxl_ram that modify guest memory. -Thanks, your v2 patch does seem to prevent the crash. Would you re-send -it to the list as a proper fix? -Yes. Was waiting for your confirmation. -I'm wondering, could it be useful to explicitly mark all the reused -memory regions readonly upon cpr-transfer, and then make them writable -back again after the migration is done? That way we will be segfaulting -early on instead of debugging tricky memory corruptions. -It's a useful debugging technique, but changing protection on a large -memory region -can be too expensive for production due to TLB shootdowns. 
- -Also, there are cases where writes are performed but the value is -guaranteed to -be the same: -  qxl_post_load() -    qxl_set_mode() -      d->rom->mode = cpu_to_le32(modenr); -The value is the same because mode and shadow_rom.mode were passed in -vmstate -from old qemu. -There are also cases where a device's ROM might be re-initialized. E.g. -this segfault occurs upon further exploration of RO-mapped RAM blocks: -Program terminated with signal SIGSEGV, Segmentation fault. -#0 __memmove_avx_unaligned_erms () at -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -664            rep    movsb -[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))] -(gdb) bt -#0 __memmove_avx_unaligned_erms () at -../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664 -#1 0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, -name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true) -    at ../hw/core/loader.c:1032 -#2 0x000055aa1d031577 in rom_add_blob -    (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152, -addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables", -fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0, -read_only=true) at ../hw/core/loader.c:1147 -#3 0x000055aa1cfd788d in acpi_add_rom_blob -    (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, -blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46 -#4 0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720 -#5 0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) -at ../hw/i386/pc.c:638 -#6 0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 -<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39 -#7 0x000055aa1d039ee5 in qdev_machine_creation_done () at -../hw/core/machine.c:1749 -#8 0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 -<error_fatal>) at ../system/vl.c:2779
-#9 0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 -<error_fatal>) at ../system/vl.c:2807 -#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at -../system/vl.c:3838 -#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at -../system/main.c:72 -I'm not sure whether the ACPI tables ROM in particular is rewritten with the -same content, but there might be cases where a ROM is read from the file -system upon initialization. That is undesirable as the guest kernel -certainly won't be too happy about a sudden change of the device's ROM -content. - -So the issue we're dealing with here is any unwanted memory-related -device initialization upon cpr. - -For now the only thing that comes to my mind is to make a test where we -put as many devices as we can into a VM, make ram blocks RO upon cpr -(and remap them as RW later after migration is done, if needed), and -catch any unwanted memory violations. As Den suggested, we might -consider adding that behaviour as a separate non-default option (or -a "migrate" command flag specific to cpr-transfer), which would only be -used in testing. -I'll look into adding an option, but there may be too many false positives, -such as the qxl_set_mode case above. And the maintainers may object to me -eliminating the false positives by adding more CPR_IN tests, due to gratuitous -(from their POV) ugliness. - -But I will use the technique to look for more write violations. -Andrey -No way. ACPI with the source must be used in the same way as BIOSes -and optional ROMs. -Yup, it's a bug. Will fix. - -- Steve - -see -https://lore.kernel.org/qemu-devel/1741380954-341079-1-git-send-email-steven.sistare@oracle.com/ -- Steve - -On 3/6/2025 11:13 AM, Steven Sistare wrote: -On 3/6/2025 10:52 AM, Denis V.
Lunev wrote: -On 3/6/25 16:16, Andrey Drobyshev wrote: -On 3/5/25 11:19 PM, Steven Sistare wrote: -On 3/5/2025 11:50 AM, Andrey Drobyshev wrote: -On 3/4/25 9:05 PM, Steven Sistare wrote: -On 2/28/2025 1:37 PM, Andrey Drobyshev wrote: -On 2/28/25 8:35 PM, Andrey Drobyshev wrote: -On 2/28/25 8:20 PM, Steven Sistare wrote: -On 2/28/2025 1:13 PM, Steven Sistare wrote: -On 2/28/2025 12:39 PM, Andrey Drobyshev wrote: -Hi all, - -We've been experimenting with cpr-transfer migration mode recently -and -have discovered the following issue with the guest QXL driver: - -Run migration source: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-src.sock - -$EMULATOR -enable-kvm \ -       -machine q35 \ -       -cpu host -smp 2 -m 2G \ -       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -dev/shm/ -ram0,share=on\ -       -machine memory-backend=ram0 \ -       -machine aux-ram-share=on \ -       -drive file=$ROOTFS,media=disk,if=virtio \ -       -qmp unix:$QMPSOCK,server=on,wait=off \ -       -nographic \ -       -device qxl-vga -Run migration target: -EMULATOR=/path/to/emulator -ROOTFS=/path/to/image -QMPSOCK=/var/run/alma8qmp-dst.sock -$EMULATOR -enable-kvm \ -       -machine q35 \ -       -cpu host -smp 2 -m 2G \ -       -object memory-backend-file,id=ram0,size=2G,mem-path=/ -dev/shm/ -ram0,share=on\ -       -machine memory-backend=ram0 \ -       -machine aux-ram-share=on \ -       -drive file=$ROOTFS,media=disk,if=virtio \ -       -qmp unix:$QMPSOCK,server=on,wait=off \ -       -nographic \ -       -device qxl-vga \ -       -incoming tcp:0:44444 \ -       -incoming '{"channel-type": "cpr", "addr": { "transport": -"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}' -Launch the migration: -QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell -QMPSOCK=/var/run/alma8qmp-src.sock - -$QMPSHELL -p $QMPSOCK <<EOF -       migrate-set-parameters mode=cpr-transfer -       migrate channels=[{"channel-type":"main","addr": 
-{"transport":"socket","type":"inet","host":"0","port":"44444"}}, -{"channel-type":"cpr","addr": -{"transport":"socket","type":"unix","path":"/var/run/alma8cpr- -dst.sock"}}] -EOF -Then, after a while, QXL guest driver on target crashes spewing the -following messages: -[  73.962002] [TTM] Buffer eviction failed -[  73.962072] qxl 0000:00:02.0: object_init failed for (3149824, -0x00000001) -[  73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to -allocate VRAM BO -That seems to be a known kernel QXL driver bug: -https://lore.kernel.org/all/20220907094423.93581-1- -min_halo@163.com/T/ -https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/ -(the latter discussion contains that reproduce script which -speeds up -the crash in the guest): -#!/bin/bash - -chvt 3 - -for j in $(seq 80); do -           echo "$(date) starting round $j" -           if [ "$(journalctl --boot | grep "failed to allocate -VRAM -BO")" != "" ]; then -                   echo "bug was reproduced after $j tries" -                   exit 1 -           fi -           for i in $(seq 100); do -                   dmesg > /dev/tty3 -           done -done - -echo "bug could not be reproduced" -exit 0 -The bug itself seems to remain unfixed, as I was able to reproduce -that -with Fedora 41 guest, as well as AlmaLinux 8 guest. However our -cpr-transfer code also seems to be buggy as it triggers the crash - -without the cpr-transfer migration the above reproduce doesn't -lead to -crash on the source VM. - -I suspect that, as cpr-transfer doesn't migrate the guest -memory, but -rather passes it through the memory backend object, our code might -somehow corrupt the VRAM. However, I wasn't able to trace the -corruption so far. - -Could somebody help the investigation and take a look into -this? Any -suggestions would be appreciated. Thanks! -Possibly some memory region created by qxl is not being preserved. 
-Try adding these traces to see what is preserved: - --trace enable='*cpr*' --trace enable='*ram_alloc*' -Also try adding this patch to see if it flags any ram blocks as not -compatible with cpr. A message is printed at migration start time. -    -https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send- -email- -steven.sistare@oracle.com/ - -- Steve -With the traces enabled + the "migration: ram block cpr blockers" -patch -applied: - -Source: -cpr_find_fd pc.bios, id 0 returns -1 -cpr_save_fd pc.bios, id 0, fd 22 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host -0x7fec18e00000 -cpr_find_fd pc.rom, id 0 returns -1 -cpr_save_fd pc.rom, id 0, fd 23 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host -0x7fec18c00000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1 -cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 24 host 0x7fec18a00000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 25 host 0x7feb77e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 27 host 0x7fec18800000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 28 host 0x7feb73c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1 -cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 34 host 0x7fec18600000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 35 host 0x7fec18200000 -cpr_find_fd 
/rom@etc/table-loader, id 0 returns -1 -cpr_save_fd /rom@etc/table-loader, id 0, fd 36 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 36 host 0x7feb8b600000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1 -cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -37 host 0x7feb8b400000 - -cpr_state_save cpr-transfer mode -cpr_transfer_output /var/run/alma8cpr-dst.sock -Target: -cpr_transfer_input /var/run/alma8cpr-dst.sock -cpr_state_load cpr-transfer mode -cpr_find_fd pc.bios, id 0 returns 20 -qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host -0x7fcdc9800000 -cpr_find_fd pc.rom, id 0 returns 19 -qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host -0x7fcdc9600000 -cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18 -qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size -262144 fd 18 host 0x7fcdc9400000 -cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17 -qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size -67108864 fd 17 host 0x7fcd27e00000 -cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 -fd 16 host 0x7fcdc9200000 -cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15 -qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size -67108864 fd 15 host 0x7fcd23c00000 -cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14 -qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 -fd 14 host 0x7fcdc8800000 -cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13 -qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size -2097152 fd 13 host 0x7fcdc8400000 -cpr_find_fd /rom@etc/table-loader, id 0 returns 11 -qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 -fd 11 host 0x7fcdc8200000 -cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10 -qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd -10 host 0x7fcd3be00000 
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course. However, the corresponding ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-   qemu_ram_alloc_internal()
-     if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-         ram_flags |= RAM_READONLY;
-     new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while. Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session. To test, I specify
-port 0 for the source VM and port 1 for the dest. When the src vnc goes
-dormant the dest vnc becomes active.
-Sure, I meant that VNC on the dest (on port 1) works for a while
-after the migration and then hangs, apparently after the guest QXL crash.
-Could you try launching the VM with
-"-nographic -device qxl-vga"? That way the VM's serial console is given to you
-directly in the shell, so when the qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver crash,
-and I suspect my guest image+kernel is too old.
-Yes, that's probably the case. But the crash occurs on my Fedora 41
-guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to
-be buggy.
-However, once I realized the -issue was post-cpr modification of qxl memory, I switched my attention -to the -fix. -As for your patch, I can report that it doesn't resolve the issue as it -is. But I was able to track down another possible memory corruption -using your approach with readonly mmap'ing: -Program terminated with signal SIGSEGV, Segmentation fault. -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -412        d->ram->magic      = cpu_to_le32(QXL_RAM_MAGIC); -[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))] -(gdb) bt -#0 init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412 -#1 0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, -errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142 -#2 0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, -errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257 -#3 0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, -errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174 -#4 0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, -value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494 -#5 0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, -v=0x5638996f3770, name=0x56389759b141 "realized", -opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:2374 -#6 0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, -name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1449 -#7 0x00005638970f8586 in object_property_set_qobject -(obj=0x5638996e0e70, name=0x56389759b141 "realized", -value=0x5638996df900, errp=0x7ffd3c2b84e0) -     at ../qom/qom-qobject.c:28 -#8 0x00005638970f3d8d in object_property_set_bool -(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, -errp=0x7ffd3c2b84e0) -     at ../qom/object.c:1519 -#9 0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, -bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276 -#10 0x0000563896dba675 in qdev_device_add_from_qdict 
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../ -system/qdev-monitor.c:714 -#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, -errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733 -#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, -opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/ -vl.c:1207 -#13 0x000056389737a6cc in qemu_opts_foreach -     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca -<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>) -     at ../util/qemu-option.c:1135 -#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/ -vl.c:2745 -#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 -<error_fatal>) at ../system/vl.c:2806 -#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) -at ../system/vl.c:3838 -#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../ -system/main.c:72 -So the attached adjusted version of your patch does seem to help. At -least I can't reproduce the crash on my stand. -Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram -are -definitely harmful. Try V2 of the patch, attached, which skips the lines -of init_qxl_ram that modify guest memory. -Thanks, your v2 patch does seem to prevent the crash. Would you re-send -it to the list as a proper fix? -Yes. Was waiting for your confirmation. -I'm wondering, could it be useful to explicitly mark all the reused -memory regions readonly upon cpr-transfer, and then make them writable -back again after the migration is done? That way we will be segfaulting -early on instead of debugging tricky memory corruptions. -It's a useful debugging technique, but changing protection on a large -memory region -can be too expensive for production due to TLB shootdowns. 
-
-Also, there are cases where writes are performed but the value is guaranteed to
-be the same:
-  qxl_post_load()
-    qxl_set_mode()
-      d->rom->mode = cpu_to_le32(modenr);
-The value is the same because mode and shadow_rom.mode were passed in vmstate
-from old qemu.
-There are also cases where devices' ROM might be re-initialized. E.g.
-this segfault occurs upon further exploration of RO mapped RAM blocks:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-664            rep    movsb
-[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))]
-(gdb) bt
-#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-#1 0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true)
-    at ../hw/core/loader.c:1032
-#2 0x000055aa1d031577 in rom_add_blob
-    (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152,
-addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables",
-fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0,
-read_only=true) at ../hw/core/loader.c:1147
-#3 0x000055aa1cfd788d in acpi_add_rom_blob
-    (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010,
-blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46
-#4 0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720
-#5 0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) at ../hw/i386/pc.c:638
-#6 0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 <machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39
-#7 0x000055aa1d039ee5 in qdev_machine_creation_done () at ../hw/core/machine.c:1749
-#8 0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 <error_fatal>) at ../system/vl.c:2779
-#9 0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 <error_fatal>) at ../system/vl.c:2807
-#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at ../system/vl.c:3838
-#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at ../system/main.c:72
-I'm not sure whether the ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where a ROM can be read from the file
-system upon initialization. That is undesirable, as the guest kernel
-certainly won't be too happy about a sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory-related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make ram blocks RO upon cpr
-(and remap them as RW later after migration is done, if needed), and
-catch any unwanted memory violations. As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or
-a "migrate" command flag specific to cpr-transfer), which would only be
-used in the testing.
-I'll look into adding an option, but there may be too many false positives,
-such as the qxl_set_mode case above. And the maintainers may object to me
-eliminating the false positives by adding more CPR_IN tests, due to gratuitous
-(from their POV) ugliness.
-
-But I will use the technique to look for more write violations.
-Andrey
-No way. ACPI with the source must be used in the same way as BIOSes
-and optional ROMs.
-Yup, it's a bug. Will fix.
-
-- Steve
-
diff --git a/results/classifier/02/mistranslation/64322995 b/results/classifier/02/mistranslation/64322995
deleted file mode 100644
index eb735c4f8..000000000
--- a/results/classifier/02/mistranslation/64322995
+++ /dev/null
@@ -1,55 +0,0 @@
-mistranslation: 0.936
-semantic: 0.906
-other: 0.881
-instruction: 0.864
-boot: 0.780
-
-[Qemu-devel] [BUG] trace: QEMU hangs on initialization with the "simple" backend
-
-While starting the softmmu version of QEMU, the simple backend waits for the
-writeout thread to signal a condition variable when initializing the output file
-path. But since the writeout thread has not been created, it just waits forever.
-
-Thanks,
-  Lluis
-
-On Tue, Feb 09, 2016 at 09:24:04PM +0100, Lluís Vilanova wrote:
-> While starting the softmmu version of QEMU, the simple backend waits for the
-> writeout thread to signal a condition variable when initializing the output file
-> path. But since the writeout thread has not been created, it just waits forever.
-Denis Lunev posted a fix:
-https://patchwork.ozlabs.org/patch/580968/
-Stefan
-
-Stefan Hajnoczi writes:
-> On Tue, Feb 09, 2016 at 09:24:04PM +0100, Lluís Vilanova wrote:
->> While starting the softmmu version of QEMU, the simple backend waits for the
->> writeout thread to signal a condition variable when initializing the output file
->> path. But since the writeout thread has not been created, it just waits forever.
-> Denis Lunev posted a fix:
-> https://patchwork.ozlabs.org/patch/580968/
-Great, thanks.
-
-Lluis
-
diff --git a/results/classifier/02/mistranslation/70294255 b/results/classifier/02/mistranslation/70294255
deleted file mode 100644
index 95774334b..000000000
--- a/results/classifier/02/mistranslation/70294255
+++ /dev/null
@@ -1,1062 +0,0 @@
-mistranslation: 0.862
-semantic: 0.858
-instruction: 0.856
-other: 0.852
-boot: 0.811
-
-[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-hi:
-
-Yes, it is better.
-
-And should we delete
-
-#ifdef WIN32
-    QIO_CHANNEL(cioc)->event = CreateEvent(NULL, FALSE, FALSE, NULL)
-#endif
-
-in qio_channel_socket_accept?
-
-qio_channel_socket_new already has it.
-
-Original mail
-
-From: address@hidden
-To: Wang Guang 10165992
-Cc: address@hidden address@hidden address@hidden address@hidden
-Date: 2017-03-22 15:03
-Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-Hi,
-
-On 2017/3/22 9:42, address@hidden wrote:
-> diff --git a/migration/socket.c b/migration/socket.c
-> index 13966f1..d65a0ea 100644
-> --- a/migration/socket.c
-> +++ b/migration/socket.c
-> @@ -147,8 +147,9 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
->  }
->
->  trace_migration_socket_incoming_accepted()
->
->  qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming")
-> + qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN)
->  migration_channel_process_incoming(migrate_get_current(),
->  QIO_CHANNEL(sioc))
->  object_unref(OBJECT(sioc))
->
-> Is this patch ok?
->
-
-Yes, I think this works, but a better way may be to call qio_channel_set_feature()
-in qio_channel_socket_accept(); we didn't set the SHUTDOWN feature for the
-socket accept fd.
-Or fix it by this:
-
-diff --git a/io/channel-socket.c b/io/channel-socket.c
-index f546c68..ce6894c 100644
---- a/io/channel-socket.c
-+++ b/io/channel-socket.c
-@@ -330,9 +330,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
-                          Error **errp)
- {
-     QIOChannelSocket *cioc
-
--    cioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET))
--    cioc->fd = -1
-+
-+    cioc = qio_channel_socket_new()
-     cioc->remoteAddrLen = sizeof(ioc->remoteAddr)
-     cioc->localAddrLen = sizeof(ioc->localAddr)
-
-
-Thanks,
-Hailiang
-
-> I have tested it. The test could not hang any more.
->
-
-> Original mail
->
-> From: address@hidden
-> To: address@hidden address@hidden
-> Cc: address@hidden address@hidden address@hidden
-> Date: 2017-03-22 09:11
-> Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang
->
-> On 2017/3/21 19:56, Dr. David Alan Gilbert wrote:
-> > * Hailiang Zhang (address@hidden) wrote:
-> >> Hi,
-> >>
-> >> Thanks for reporting this, and I confirmed it in my test, and it is a bug.
-> >>
-> >> Though we tried to call qemu_file_shutdown() to shut down the related fd, in
-> >> case the COLO thread/incoming thread is stuck in read()/write() while doing failover,
-> >> it didn't take effect, because all the fds used by COLO (also migration)
-> >> have been wrapped by a qio channel, and it will not call the shutdown API if
-> >> we didn't qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN).
-> >>
-> >> Cc: Dr. David Alan Gilbert address@hidden
-> >>
-> >> I doubted migration cancel has the same problem; it may be stuck in write()
-> >> if we tried to cancel migration.
-> >>
-> >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp)
-> >> {
-> >>     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing")
-> >>     migration_channel_connect(s, ioc, NULL)
-> >>     ... ...
-> >> We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN) above,
-> >> and the
-> >> migrate_fd_cancel()
-> >> {
-> >>     ... ...
-> >>     if (s->state == MIGRATION_STATUS_CANCELLING && f) {
-> >>         qemu_file_shutdown(f)  --> This will not take effect. No?
-> >>     }
-> >> }
-> >
-> > (cc'd in Daniel Berrange).
-> > I see that we call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN) at the
-> > top of qio_channel_socket_new so I think that's safe, isn't it?
-> >
->
-> Hmm, you are right, this problem only exists for the migration incoming fd, thanks.
->
-> > Dave
-> >
-> >> Thanks,
-> >> Hailiang
-> >>
-> >> On 2017/3/21 16:10, address@hidden wrote:
-> >>> Thank you.
-> >>>
-> >>> I have tested already.
-> >>>
-> >>> When the Primary Node panics, the Secondary Node qemu hangs at the same place.
-> >>>
-> >>> According to http://wiki.qemu-project.org/Features/COLO, killing the Primary Node
-> >>> qemu will not produce the problem, but a Primary Node panic can.
-> >>>
-> >>> I think that is due to the channel not supporting QIO_CHANNEL_FEATURE_SHUTDOWN.
-> >>>
-> >>> When failover happens, channel_shutdown could not shut down the channel,
-> >>>
-> >>> so the colo_process_incoming_thread will hang at recvmsg.
-> >>>
-> >>> I tested a patch:
-> >>>
-> >>> diff --git a/migration/socket.c b/migration/socket.c
-> >>> index 13966f1..d65a0ea 100644
-> >>> --- a/migration/socket.c
-> >>> +++ b/migration/socket.c
-> >>> @@ -147,8 +147,9 @@ static gboolean socket_accept_incoming_migration(QIOChannel *ioc,
-> >>>  }
-> >>>
-> >>>  trace_migration_socket_incoming_accepted()
-> >>>
-> >>>  qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming")
-> >>> + qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN)
-> >>>  migration_channel_process_incoming(migrate_get_current(),
-> >>>  QIO_CHANNEL(sioc))
-> >>>  object_unref(OBJECT(sioc))
-> >>>
-> >>> My test will not hang any more.
-> >>>
-> >>> Original mail
-> >>>
-> >>> From: address@hidden
-> >>> To: Wang Guang 10165992 address@hidden
-> >>> Cc: address@hidden address@hidden
-> >>> Date: 2017-03-21 15:58
-> >>> Subject: Re: [Qemu-devel] Reply: Re: [BUG]COLO failover hang
-> >>>
-> >>> Hi, Wang.
-> >>>
-> >>> You can test this branch:
-> >>> https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
-> >>>
-> >>> and please follow the wiki to ensure your own configuration is correct.
-> >>> http://wiki.qemu-project.org/Features/COLO
-> >>>
-> >>> Thanks
-> >>> Zhang Chen
-> >>>
-> >>> On 03/21/2017 03:27 PM, address@hidden wrote:
-> >>> >
-> >>> > hi.
-> >>> >
-> >>> > I tested the git qemu master; it has the same problem.
-> >>> >
-> >>> > (gdb) bt
-> >>> > #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
-> >>> > #1 0x00007f658e4aa0c2 in qio_channel_read (address@hidden, address@hidden "", address@hidden, address@hidden) at io/channel.c:114
-> >>> > #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>, buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at migration/qemu-file-channel.c:78
-> >>> > #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at migration/qemu-file.c:295
-> >>> > #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, address@hidden) at migration/qemu-file.c:555
-> >>> > #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at migration/qemu-file.c:568
-> >>> > #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at migration/qemu-file.c:648
-> >>> > #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, address@hidden) at migration/colo.c:244
-> >>> > #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized out>, address@hidden, address@hidden) at migration/colo.c:264
-> >>> > #9 0x00007f658e3e740e in colo_process_incoming_thread (opaque=0x7f658eb30360 <mis_current.31286>) at
migration/colo.c:577
-> >>> > #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
-> >>> > #11 0x00007f65881983ed in clone () from /lib64/libc.so.6
-> >>> > (gdb) p ioc->name
-> >>> > $2 = 0x7f658ff7d5c0 "migration-socket-incoming"
-> >>> > (gdb) p ioc->features    <-- does not support QIO_CHANNEL_FEATURE_SHUTDOWN
-> >>> > $3 = 0
-> >>> >
-> >>> > (gdb) bt
-> >>> > #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90, condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
-> >>> > #1 0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at gmain.c:3054
-> >>> > #2 g_main_context_dispatch (context=<optimized out>, address@hidden) at gmain.c:3630
-> >>> > #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
-> >>> > #4 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:258
-> >>> > #5 main_loop_wait (address@hidden) at util/main-loop.c:506
-> >>> > #6 0x00007fdccb526187 in main_loop () at vl.c:1898
-> >>> > #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4709
-> >>> > (gdb) p ioc->features
-> >>> > $1 = 6
-> >>> > (gdb) p ioc->name
-> >>> > $2 = 0x7fdcce1b1ab0 "migration-socket-listener"
-> >>> >
-> >>> > Maybe socket_accept_incoming_migration should call
-> >>> > qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)??
-> >>> >
-> >>> > thank you.
-> >>> >
-> >>> > Original mail
-> >>> > From: address@hidden
-> >>> > To: address@hidden
-> >>> > Cc: address@hidden (@huawei.com)
-> >>> > Date: 2017-03-16 14:46
-> >>> > Subject: Re: [Qemu-devel] COLO failover hang
-> >>> >
-> >>> > On 03/15/2017 05:06 PM, wangguang wrote:
-> >>> > > I am testing the QEMU COLO feature described here [QEMU
-> >>> > > Wiki](http://wiki.qemu-project.org/Features/COLO).
-> >>> > >
-> >>> > > When the Primary Node panics, the Secondary Node qemu hangs,
-> >>> > > at recvmsg in qio_channel_socket_readv.
-> >>> > > And I run { 'execute': 'nbd-server-stop' } and { "execute":
-> >>> > > "x-colo-lost-heartbeat" } in the Secondary VM's
-> >>> > > monitor, and the Secondary Node qemu still hangs at recvmsg.
-> >>> > >
-> >>> > > I found that the colo in qemu is not complete yet.
-> >>> > > Does colo have any plan for development?
-> >>> >
-> >>> > Yes, we are developing. You can see some of the patches we are pushing.
-> >>> >
-> >>> > > Has anyone ever run it successfully? Any help is appreciated!
-> >>> >
-> >>> > Our internal version can run it successfully;
-> >>> > for the failover details you can ask Zhanghailiang for help.
-> >>> > Next time if you have some questions about COLO,
-> >>> > please cc me and zhanghailiang address@hidden
-> >>> >
-> >>> > Thanks
-> >>> > Zhang Chen
-> >>> >
-> >>> > >
-> >>> > > centos7.2+qemu2.7.50
-> >>> > > (gdb) bt
-> >>> > > #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0
-> >>> > > #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:497
-> >>> > > #2 0x00007f3e03329472 in qio_channel_read (address@hidden, address@hidden "", address@hidden, address@hidden) at io/channel.c:97
-> >>> > > #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>, buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at migration/qemu-file-channel.c:78
-> >>> > > #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at migration/qemu-file.c:257
-> >>> > > #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, address@hidden) at migration/qemu-file.c:510
-> >>> > > #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at migration/qemu-file.c:523
-> >>> > > #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at migration/qemu-file.c:603
-> >>> > > #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, address@hidden) at migration/colo.c:215
-> >>> > > #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48, checkpoint_request=<synthetic pointer>, f=<optimized out>) at migration/colo.c:546
-> >>> > > #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at
migration/colo.c:649
-> >>> > > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0
-> >>> > > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6
-> >>> > >
-> >>> > > --
-> >>> > > View this message in context: http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> >>> > > Sent from the Developer mailing list archive at Nabble.com.
-> >>> >
-> >>> > --
-> >>> > Thanks
-> >>> > Zhang Chen
-> >>>
-> >>
-> >
-> > --
-> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
-> >
-> > .
->
-On 2017/3/22 16:09, address@hidden wrote:
-> hi:
-> Yes, it is better.
-> And should we delete
-Yes, you are right.
-> #ifdef WIN32
->     QIO_CHANNEL(cioc)->event = CreateEvent(NULL, FALSE, FALSE, NULL)
-> #endif
-> in qio_channel_socket_accept?
-> qio_channel_socket_new already has it.
-
-
-
-
-
-
-
-
-
-
-
-
-Original email
-
-
-
-From: address@hidden
-To: wangguang10165992
-Cc: address@hidden address@hidden address@hidden address@hidden
-Date: 2017-03-22 15:03
-Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-
-
-
-
-Hi,
-
-On 2017/3/22 9:42, address@hidden wrote:
-> diff --git a/migration/socket.c b/migration/socket.c
->
->
-> index 13966f1..d65a0ea 100644
->
->
-> --- a/migration/socket.c
->
->
-> +++ b/migration/socket.c
->
->
-> @@ -147,8 +147,9 @@ static gboolean
-socket_accept_incoming_migration(QIOChannel *ioc,
->
->
-> }
->
->
->
->
->
-> trace_migration_socket_incoming_accepted()
->
->
->
->
->
-> qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming")
->
->
-> + qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN)
->
->
-> migration_channel_process_incoming(migrate_get_current(),
->
->
-> QIO_CHANNEL(sioc))
->
->
-> object_unref(OBJECT(sioc))
->
->
->
->
-> Is this patch ok?
->
-
-Yes, i think this works, but a better way maybe to call
-qio_channel_set_feature()
-in qio_channel_socket_accept(), we didn't set the SHUTDOWN feature for the
-socket accept fd,
-Or fix it by this:
-
-diff --git a/io/channel-socket.c b/io/channel-socket.c
-index f546c68..ce6894c 100644
---- a/io/channel-socket.c
-+++ b/io/channel-socket.c
-@@ -330,9 +330,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
- Error **errp)
- {
- QIOChannelSocket *cioc
--
-- cioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET))
-- cioc->fd = -1
-+
-+ cioc = qio_channel_socket_new()
- cioc->remoteAddrLen = sizeof(ioc->remoteAddr)
- cioc->localAddrLen = sizeof(ioc->localAddr)
-
-
-Thanks,
-Hailiang
-
-> I have test it . The test could not hang any more.
->
->
->
->
->
->
->
->
->
->
->
->
-> Original email
->
->
-> From: address@hidden
-> To: address@hidden address@hidden
-> Cc: address@hidden address@hidden address@hidden
-> Date: 2017-03-22 09:11
-> Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang
->
->
->
->
->
-> On 2017/3/21 19:56, Dr. David Alan Gilbert wrote:
-> > * Hailiang Zhang (address@hidden) wrote:
-> >> Hi,
-> >>
-> >> Thanks for reporting this, and i confirmed it in my test, and it is a bug.
-> >>
-> >> Though we tried to call qemu_file_shutdown() to shutdown the related fd, in
-> >> case COLO thread/incoming thread is stuck in read/write() while do
-failover,
-> >> but it didn't take effect, because all the fd used by COLO (also migration)
-> >> has been wrapped by qio channel, and it will not call the shutdown API if
-> >> we didn't qio_channel_set_feature(QIO_CHANNEL(sioc),
-QIO_CHANNEL_FEATURE_SHUTDOWN).
-> >>
-> >> Cc: Dr. David Alan Gilbert address@hidden
-> >>
-> >> I doubted migration cancel has the same problem, it may be stuck in write()
-> >> if we tried to cancel migration.
-> >>
-> >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
-Error **errp)
-> >> {
-> >> qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing")
-> >> migration_channel_connect(s, ioc, NULL)
-> >> ... ...
-> >> We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc),
-QIO_CHANNEL_FEATURE_SHUTDOWN) above,
-> >> and the
-> >> migrate_fd_cancel()
-> >> {
-> >> ... ...
-> >> if (s->state == MIGRATION_STATUS_CANCELLING && f) {
-> >> qemu_file_shutdown(f) --> This will not take effect. No ?
-> >> }
-> >> }
-> >
-> > (cc'd in Daniel Berrange).
-> > I see that we call qio_channel_set_feature(ioc,
-QIO_CHANNEL_FEATURE_SHUTDOWN) at the
-> > top of qio_channel_socket_new so I think that's safe isn't it?
-> >
->
-> Hmm, you are right, this problem is only exist for the migration incoming fd,
-thanks.
->
-> > Dave
-> >
-> >> Thanks,
-> >> Hailiang
-> >>
-> >> On 2017/3/21 16:10, address@hidden wrote:
-> >>> Thank you.
-> >>>
-> >>> I have tested already.
-> >>>
-> >>> When the Primary Node panic,the Secondary Node qemu hang at the same
-place.
-> >>>
-> >>> According to
-http://wiki.qemu-project.org/Features/COLO
-, kill Primary Node
-qemu will not produce the problem, but Primary Node panic can.
-> >>>
-> >>> I think due to the feature of channel does not support
-QIO_CHANNEL_FEATURE_SHUTDOWN.
-> >>>
-> >>>
-> >>> when failover,channel_shutdown could not shut down the channel.
-> >>>
-> >>>
-> >>> so the colo_process_incoming_thread will hang at recvmsg.
-> >>>
-> >>>
-> >>> I test a patch:
-> >>>
-> >>>
-> >>> diff --git a/migration/socket.c b/migration/socket.c
-> >>>
-> >>>
-> >>> index 13966f1..d65a0ea 100644
-> >>>
-> >>>
-> >>> --- a/migration/socket.c
-> >>>
-> >>>
-> >>> +++ b/migration/socket.c
-> >>>
-> >>>
-> >>> @@ -147,8 +147,9 @@ static gboolean
-socket_accept_incoming_migration(QIOChannel *ioc,
-> >>>
-> >>>
-> >>> }
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>> trace_migration_socket_incoming_accepted()
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>> qio_channel_set_name(QIO_CHANNEL(sioc),
-"migration-socket-incoming")
-> >>>
-> >>>
-> >>> + qio_channel_set_feature(QIO_CHANNEL(sioc),
-QIO_CHANNEL_FEATURE_SHUTDOWN)
-> >>>
-> >>>
-> >>> migration_channel_process_incoming(migrate_get_current(),
-> >>>
-> >>>
-> >>> QIO_CHANNEL(sioc))
-> >>>
-> >>>
-> >>> object_unref(OBJECT(sioc))
-> >>>
-> >>>
-> >>>
-> >>>
-> >>> My test will not hang any more.
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>> Original email
-> >>>
-> >>>
-> >>> From: address@hidden
-> >>> To: wangguang10165992 address@hidden
-> >>> Cc: address@hidden address@hidden
-> >>> Date: 2017-03-21 15:58
-> >>> Subject: Re: [Qemu-devel] Reply: Re: [BUG]COLO failover hang
-> >>>
-> >>>
-> >>>
-> >>>
-> >>>
-> >>> Hi,Wang.
-> >>>
-> >>> You can test this branch:
-> >>>
-https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
-> >>>
-> >>> and please follow wiki ensure your own configuration correctly.
-> >>>
-http://wiki.qemu-project.org/Features/COLO
-> >>>
-> >>>
-> >>> Thanks
-> >>>
-> >>> Zhang Chen
-> >>>
-> >>>
-> >>> On 03/21/2017 03:27 PM, address@hidden wrote:
-> >>> >
-> >>> > hi.
-> >>> >
-> >>> > I test the git qemu master have the same problem.
-> >>> >
-> >>> > (gdb) bt
-> >>> >
-> >>> > #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880,
-> >>> > niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
-> >>> >
-> >>> > #1 0x00007f658e4aa0c2 in qio_channel_read
-> >>> > (address@hidden, address@hidden "",
-> >>> > address@hidden, address@hidden) at io/channel.c:114
-> >>> >
-> >>> > #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>,
-> >>> > buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at
-> >>> > migration/qemu-file-channel.c:78
-> >>> >
-> >>> > #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at
-> >>> > migration/qemu-file.c:295
-> >>> >
-> >>> > #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden,
-> >>> > address@hidden) at migration/qemu-file.c:555
-> >>> >
-> >>> > #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at
-> >>> > migration/qemu-file.c:568
-> >>> >
-> >>> > #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at
-> >>> > migration/qemu-file.c:648
-> >>> >
-> >>> > #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800,
-> >>> > address@hidden) at migration/colo.c:244
-> >>> >
-> >>> > #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized
-> >>> > out>, address@hidden,
-> >>> > address@hidden)
-> >>> >
-> >>> > at migration/colo.c:264
-> >>> >
-> >>> > #9 0x00007f658e3e740e in colo_process_incoming_thread
-> >>> > (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577
-> >>> >
-> >>> > #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
-> >>> >
-> >>> > #11 0x00007f65881983ed in clone () from /lib64/libc.so.6
-> >>> >
-> >>> > (gdb) p ioc->name
-> >>> >
-> >>> > $2 = 0x7f658ff7d5c0 "migration-socket-incoming"
-> >>> >
->
>>> > (gdb) p ioc->features    Do not support QIO_CHANNEL_FEATURE_SHUTDOWN
-> >>> >
-> >>> > $3 = 0
-> >>> >
-> >>> >
-> >>> > (gdb) bt
-> >>> >
-> >>> > #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90,
-> >>> > condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
-> >>> >
-> >>> > #1 0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at
-> >>> > gmain.c:3054
-> >>> >
-> >>> > #2 g_main_context_dispatch (context=<optimized out>,
-> >>> > address@hidden) at gmain.c:3630
-> >>> >
-> >>> > #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
-> >>> >
-> >>> > #4 os_host_main_loop_wait (timeout=<optimized out>) at
-> >>> > util/main-loop.c:258
-> >>> >
-> >>> > #5 main_loop_wait (address@hidden) at
-> >>> > util/main-loop.c:506
-> >>> >
-> >>> > #6 0x00007fdccb526187 in main_loop () at vl.c:1898
-> >>> >
-> >>> > #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
-> >>> > out>) at vl.c:4709
-> >>> >
-> >>> > (gdb) p ioc->features
-> >>> >
-> >>> > $1 = 6
-> >>> >
-> >>> > (gdb) p ioc->name
-> >>> >
-> >>> > $2 = 0x7fdcce1b1ab0 "migration-socket-listener"
-> >>> >
-> >>> >
-> >>> > May be socket_accept_incoming_migration should
-> >>> > call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)??
-> >>> >
-> >>> >
-> >>> > thank you.
-> >>> >
-> >>> >
-> >>> >
-> >>> >
-> >>> >
-> >>> > Original email
-> >>> > address@hidden
-> >>> > address@hidden
-> >>> > address@hidden@huawei.com>
-> >>> > *Date:* 2017-03-16 14:46
-> >>> > *Subject:* *Re: [Qemu-devel] COLO failover hang*
-> >>> >
-> >>> >
-> >>> >
-> >>> > On 03/15/2017 05:06 PM, wangguang wrote:
-> >>> > > am testing QEMU COLO feature described here [QEMU
-> >>> > > Wiki](
-http://wiki.qemu-project.org/Features/COLO
-).
-> >>> > >
-> >>> > > When the Primary Node panic,the Secondary Node qemu hang.
-> >>> > > hang at recvmsg in qio_channel_socket_readv.
-> >>> > > And I run { 'execute': 'nbd-server-stop' } and { "execute":
-> >>> > > "x-colo-lost-heartbeat" } in Secondary VM's
-> >>> > > monitor,the Secondary Node qemu still hang at recvmsg .
-> >>> > >
-> >>> > > I found that the colo in qemu is not complete yet.
-> >>> > > Do the colo have any plan for development?
-> >>> >
-> >>> > Yes, We are developing. You can see some of patch we pushing.
-> >>> >
-> >>> > > Has anyone ever run it successfully? Any help is appreciated!
-> >>> >
-> >>> > In our internal version can run it successfully,
-> >>> > The failover detail you can ask Zhanghailiang for help.
-> >>> > Next time if you have some question about COLO,
-> >>> > please cc me and zhanghailiang address@hidden
-> >>> >
-> >>> >
-> >>> > Thanks
-> >>> > Zhang Chen
-> >>> >
-> >>> >
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> > > centos7.2+qemu2.7.50
-> >>> > > (gdb) bt
-> >>> > > #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0
-> >>> > > #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized
-out>,
-> >>> > > iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0,
-errp=0x0) at
-> >>> > > io/channel-socket.c:497
-> >>> > > #2 0x00007f3e03329472 in qio_channel_read (address@hidden,
-> >>> > > address@hidden "", address@hidden,
-> >>> > > address@hidden) at io/channel.c:97
-> >>> > > #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>,
-> >>> > > buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at
-> >>> > > migration/qemu-file-channel.c:78
-> >>> > > #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at
-> >>> > > migration/qemu-file.c:257
-> >>> > > #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden,
-> >>> > > address@hidden) at migration/qemu-file.c:510
-> >>> > > #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at
-> >>> > > migration/qemu-file.c:523
-> >>> > > #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at
-> >>> > > migration/qemu-file.c:603
-> >>> > > #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00,
-> >>> > > address@hidden) at migration/colo.c:215
-> >>> > > #9 0x00007f3e0327250d in colo_wait_handle_message
-(errp=0x7f3d62bfaa48,
-> >>> > > checkpoint_request=<synthetic pointer>, f=<optimized out>) at
-> >>> > > migration/colo.c:546
-> >>> > > #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at
-> >>> > >
migration/colo.c:649
-> >>> > > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0
-> >>> > > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> > > --
-> >>> > > View this message in context:
-http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> >>> > > Sent from the Developer mailing list archive at Nabble.com.
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> > >
-> >>> >
-> >>> > --
-> >>> > Thanks
-> >>> > Zhang Chen
-> >>> >
-> >>> >
-> >>> >
-> >>> >
-> >>> >
-> >>>
-> >>
-> > --
-> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
-> >
-> > .
-> >
->
-
diff --git a/results/classifier/02/mistranslation/71456293 b/results/classifier/02/mistranslation/71456293
deleted file mode 100644
index 0e5494781..000000000
--- a/results/classifier/02/mistranslation/71456293
+++ /dev/null
@@ -1,1487 +0,0 @@
-mistranslation: 0.659
-instruction: 0.624
-semantic: 0.600
-other: 0.598
-boot: 0.598
-
-[Qemu-devel][bug] qemu crash when migrate vm and vm's disks
-
-When migrate vm and vm's disks target host qemu crash due to an invalid free.
-#0 object_unref (obj=0x1000) at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/qom/object.c:920
-#1 0x0000560434d79e79 in memory_region_unref (mr=<optimized out>)
-at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:1730
-#2 flatview_destroy (view=0x560439653880) at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:292
-#3 0x000056043514dfbe in call_rcu_thread (opaque=<optimized out>)
-at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/util/rcu.c:284
-#4 0x00007fbc2b36fe25 in start_thread () from /lib64/libpthread.so.0
-#5 0x00007fbc2b099bad in clone () from /lib64/libc.so.6
-test base qemu-2.12.0
-,
-but use latest qemu (v6.0.0-rc2) also reproduces.
-The following patch can resolve this problem:
-https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg02272.html
-Steps to reproduce:
-(1) Create VM (virsh define)
-(2) Add 64 virtio scsi disks
-(3) migrate vm and vm's disks
-------------------------------------------------------------------------------------------------------------------------------------
-This e-mail and its attachments contain confidential information from New H3C, which is
-intended only for the person or entity whose address is listed above. Any use of the
-information contained herein in any way (including, but not limited to, total or partial
-disclosure, reproduction, or dissemination) by persons other than the intended
-recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
-by phone or email immediately and delete it!
-
-* Yuchen (yu.chen@h3c.com) wrote:
->
-When migrate vm and vm's disks target host qemu crash due to an invalid free.
->
->
-#0 object_unref (obj=0x1000) at
->
-/qemu-2.12/rpmbuild/BUILD/qemu-2.12/qom/object.c:920
->
-#1 0x0000560434d79e79 in memory_region_unref (mr=<optimized out>)
->
-at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:1730
->
-#2 flatview_destroy (view=0x560439653880) at
->
-/qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:292
->
-#3 0x000056043514dfbe in call_rcu_thread (opaque=<optimized out>)
->
-at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/util/rcu.c:284
->
-#4 0x00007fbc2b36fe25 in start_thread () from /lib64/libpthread.so.0
->
-#5 0x00007fbc2b099bad in clone () from /lib64/libc.so.6
->
->
-test base qemu-2.12.0, but use latest qemu(v6.0.0-rc2) also reproduce.
-Interesting.
-
-> As follow patch can resolve this problem:
-> https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg02272.html
-That's a pci/rcu change; ccing Paolo and Michael.
-
-> Steps to reproduce:
-> (1) Create VM (virsh define)
-> (2) Add 64 virtio scsi disks
-
-Is that hot adding the disks later, or are they included in the VM at
-creation?
-Can you provide a libvirt XML example?
-
-> (3) migrate vm and vm's disks
-What do you mean by 'and vm disks' - are you doing a block migration?
-
-Dave
-
-> -------------------------------------------------------------------------------------------------------------------------------------
-> This e-mail and its attachments contain confidential information from New
-> H3C, which is
-> intended only for the person or entity whose address is listed above. Any use
-> of the
-> information contained herein in any way (including, but not limited to, total
-> or partial
-> disclosure, reproduction, or dissemination) by persons other than the intended
-> recipient(s) is prohibited. If you receive this e-mail in error, please
-> notify the sender
-> by phone or email immediately and delete it!
---
-Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
-> -> -> -> #0 object_unref (obj=0x1000) at -> -> /qemu-2.12/rpmbuild/BUILD/qemu-2.12/qom/object.c:920 -> -> #1 0x0000560434d79e79 in memory_region_unref (mr=<optimized out>) -> -> at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:1730 -> -> #2 flatview_destroy (view=0x560439653880) at -> -> /qemu-2.12/rpmbuild/BUILD/qemu-2.12/memory.c:292 -> -> #3 0x000056043514dfbe in call_rcu_thread (opaque=<optimized out>) -> -> at /qemu-2.12/rpmbuild/BUILD/qemu-2.12/util/rcu.c:284 -> -> #4 0x00007fbc2b36fe25 in start_thread () from /lib64/libpthread.so.0 -> -> #5 0x00007fbc2b099bad in clone () from /lib64/libc.so.6 -> -> -> -> test base qemu-2.12.0ï¼but use lastest qemu(v6.0.0-rc2) also reproduce. -> -> -Interesting. -> -> -> As follow patch can resolve this problem: -> -> -https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg02272.html -> -> -That's a pci/rcu change; ccing Paolo and Micahel. -> -> -> Steps to reproduce: -> -> (1) Create VM (virsh define) -> -> (2) Add 64 virtio scsi disks -> -> -Is that hot adding the disks later, or are they included in the VM at -> -creation? -> -Can you provide a libvirt XML example? 
-> -Include disks in the VM at creation - -vm disks xml (only virtio scsi disks): - <devices> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native'/> - <source file='/vms/tempp/vm-os'/> - <target dev='vda' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x08' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data1'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='2' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data2'/> - <target dev='sdb' bus='scsi'/> - <address type='drive' controller='3' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data3'/> - <target dev='sdc' bus='scsi'/> - <address type='drive' controller='4' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data4'/> - <target dev='sdd' bus='scsi'/> - <address type='drive' controller='5' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data5'/> - <target dev='sde' bus='scsi'/> - <address type='drive' controller='6' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data6'/> - <target dev='sdf' bus='scsi'/> - <address type='drive' controller='7' bus='0' target='0' unit='0'/> - </disk> - <disk 
type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data7'/> - <target dev='sdg' bus='scsi'/> - <address type='drive' controller='8' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data8'/> - <target dev='sdh' bus='scsi'/> - <address type='drive' controller='9' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data9'/> - <target dev='sdi' bus='scsi'/> - <address type='drive' controller='10' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data10'/> - <target dev='sdj' bus='scsi'/> - <address type='drive' controller='11' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data11'/> - <target dev='sdk' bus='scsi'/> - <address type='drive' controller='12' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data12'/> - <target dev='sdl' bus='scsi'/> - <address type='drive' controller='13' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data13'/> - <target dev='sdm' bus='scsi'/> - <address type='drive' controller='14' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' 
-discard='unmap'/> - <source file='/vms/tempp/vm-data14'/> - <target dev='sdn' bus='scsi'/> - <address type='drive' controller='15' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data15'/> - <target dev='sdo' bus='scsi'/> - <address type='drive' controller='16' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data16'/> - <target dev='sdp' bus='scsi'/> - <address type='drive' controller='17' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data17'/> - <target dev='sdq' bus='scsi'/> - <address type='drive' controller='18' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data18'/> - <target dev='sdr' bus='scsi'/> - <address type='drive' controller='19' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data19'/> - <target dev='sds' bus='scsi'/> - <address type='drive' controller='20' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data20'/> - <target dev='sdt' bus='scsi'/> - <address type='drive' controller='21' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data21'/> - <target dev='sdu' 
bus='scsi'/> - <address type='drive' controller='22' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data22'/> - <target dev='sdv' bus='scsi'/> - <address type='drive' controller='23' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data23'/> - <target dev='sdw' bus='scsi'/> - <address type='drive' controller='24' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data24'/> - <target dev='sdx' bus='scsi'/> - <address type='drive' controller='25' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data25'/> - <target dev='sdy' bus='scsi'/> - <address type='drive' controller='26' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data26'/> - <target dev='sdz' bus='scsi'/> - <address type='drive' controller='27' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data27'/> - <target dev='sdaa' bus='scsi'/> - <address type='drive' controller='28' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data28'/> - <target dev='sdab' bus='scsi'/> - <address type='drive' controller='29' bus='0' target='0' unit='0'/> - 
</disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data29'/> - <target dev='sdac' bus='scsi'/> - <address type='drive' controller='30' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data30'/> - <target dev='sdad' bus='scsi'/> - <address type='drive' controller='31' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data31'/> - <target dev='sdae' bus='scsi'/> - <address type='drive' controller='32' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data32'/> - <target dev='sdaf' bus='scsi'/> - <address type='drive' controller='33' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data33'/> - <target dev='sdag' bus='scsi'/> - <address type='drive' controller='34' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data34'/> - <target dev='sdah' bus='scsi'/> - <address type='drive' controller='35' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data35'/> - <target dev='sdai' bus='scsi'/> - <address type='drive' controller='36' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' 
cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data36'/> - <target dev='sdaj' bus='scsi'/> - <address type='drive' controller='37' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data37'/> - <target dev='sdak' bus='scsi'/> - <address type='drive' controller='38' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data38'/> - <target dev='sdal' bus='scsi'/> - <address type='drive' controller='39' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data39'/> - <target dev='sdam' bus='scsi'/> - <address type='drive' controller='40' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data40'/> - <target dev='sdan' bus='scsi'/> - <address type='drive' controller='41' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data41'/> - <target dev='sdao' bus='scsi'/> - <address type='drive' controller='42' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data42'/> - <target dev='sdap' bus='scsi'/> - <address type='drive' controller='43' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source 
file='/vms/tempp/vm-data43'/> - <target dev='sdaq' bus='scsi'/> - <address type='drive' controller='44' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data44'/> - <target dev='sdar' bus='scsi'/> - <address type='drive' controller='45' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data45'/> - <target dev='sdas' bus='scsi'/> - <address type='drive' controller='46' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data46'/> - <target dev='sdat' bus='scsi'/> - <address type='drive' controller='47' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data47'/> - <target dev='sdau' bus='scsi'/> - <address type='drive' controller='48' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data48'/> - <target dev='sdav' bus='scsi'/> - <address type='drive' controller='49' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data49'/> - <target dev='sdaw' bus='scsi'/> - <address type='drive' controller='50' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data50'/> - <target dev='sdax' bus='scsi'/> - <address 
type='drive' controller='51' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data51'/> - <target dev='sday' bus='scsi'/> - <address type='drive' controller='52' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data52'/> - <target dev='sdaz' bus='scsi'/> - <address type='drive' controller='53' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data53'/> - <target dev='sdba' bus='scsi'/> - <address type='drive' controller='54' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data54'/> - <target dev='sdbb' bus='scsi'/> - <address type='drive' controller='55' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data55'/> - <target dev='sdbc' bus='scsi'/> - <address type='drive' controller='56' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data56'/> - <target dev='sdbd' bus='scsi'/> - <address type='drive' controller='57' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data57'/> - <target dev='sdbe' bus='scsi'/> - <address type='drive' controller='58' bus='0' target='0' unit='0'/> - </disk> - <disk 
type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data58'/> - <target dev='sdbf' bus='scsi'/> - <address type='drive' controller='59' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data59'/> - <target dev='sdbg' bus='scsi'/> - <address type='drive' controller='60' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data60'/> - <target dev='sdbh' bus='scsi'/> - <address type='drive' controller='61' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data61'/> - <target dev='sdbi' bus='scsi'/> - <address type='drive' controller='62' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data62'/> - <target dev='sdbj' bus='scsi'/> - <address type='drive' controller='63' bus='0' target='0' unit='0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data63'/> - <target dev='sdbk' bus='scsi'/> - <address type='drive' controller='64' bus='0' target='0' unit='0'/> - </disk> - <controller type='scsi' index='0'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x02' -function='0x0'/> - </controller> - <controller type='scsi' index='1' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x06' -function='0x0'/> - </controller> - <controller type='scsi' index='2' model='virtio-scsi'> - <address type='pci' 
domain='0x0000' bus='0x01' slot='0x01' -function='0x0'/> - </controller> - <controller type='scsi' index='3' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x03' -function='0x0'/> - </controller> - <controller type='scsi' index='4' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x04' -function='0x0'/> - </controller> - <controller type='scsi' index='5' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x05' -function='0x0'/> - </controller> - <controller type='scsi' index='6' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x06' -function='0x0'/> - </controller> - <controller type='scsi' index='7' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x07' -function='0x0'/> - </controller> - <controller type='scsi' index='8' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x08' -function='0x0'/> - </controller> - <controller type='scsi' index='9' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x09' -function='0x0'/> - </controller> - <controller type='scsi' index='10' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0a' -function='0x0'/> - </controller> - <controller type='scsi' index='11' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0b' -function='0x0'/> - </controller> - <controller type='scsi' index='12' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0c' -function='0x0'/> - </controller> - <controller type='scsi' index='13' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0d' -function='0x0'/> - </controller> - <controller type='scsi' index='14' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0e' -function='0x0'/> - </controller> - <controller type='scsi' index='15' model='virtio-scsi'> - <address type='pci' domain='0x0000' 
bus='0x01' slot='0x0f' -function='0x0'/> - </controller> - <controller type='scsi' index='16' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x10' -function='0x0'/> - </controller> - <controller type='scsi' index='17' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x11' -function='0x0'/> - </controller> - <controller type='scsi' index='18' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x12' -function='0x0'/> - </controller> - <controller type='scsi' index='19' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x13' -function='0x0'/> - </controller> - <controller type='scsi' index='20' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x14' -function='0x0'/> - </controller> - <controller type='scsi' index='21' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x15' -function='0x0'/> - </controller> - <controller type='scsi' index='22' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x16' -function='0x0'/> - </controller> - <controller type='scsi' index='23' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x17' -function='0x0'/> - </controller> - <controller type='scsi' index='24' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x18' -function='0x0'/> - </controller> - <controller type='scsi' index='25' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x19' -function='0x0'/> - </controller> - <controller type='scsi' index='26' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1a' -function='0x0'/> - </controller> - <controller type='scsi' index='27' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1b' -function='0x0'/> - </controller> - <controller type='scsi' index='28' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' 
slot='0x1c' -function='0x0'/> - </controller> - <controller type='scsi' index='29' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1d' -function='0x0'/> - </controller> - <controller type='scsi' index='30' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1e' -function='0x0'/> - </controller> - <controller type='scsi' index='31' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x01' -function='0x0'/> - </controller> - <controller type='scsi' index='32' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x02' -function='0x0'/> - </controller> - <controller type='scsi' index='33' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x03' -function='0x0'/> - </controller> - <controller type='scsi' index='34' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x04' -function='0x0'/> - </controller> - <controller type='scsi' index='35' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x05' -function='0x0'/> - </controller> - <controller type='scsi' index='36' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x06' -function='0x0'/> - </controller> - <controller type='scsi' index='37' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x07' -function='0x0'/> - </controller> - <controller type='scsi' index='38' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x08' -function='0x0'/> - </controller> - <controller type='scsi' index='39' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x09' -function='0x0'/> - </controller> - <controller type='scsi' index='40' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x0a' -function='0x0'/> - </controller> - <controller type='scsi' index='41' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' 
slot='0x0b' -function='0x0'/> - </controller> - <controller type='scsi' index='42' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x0c' -function='0x0'/> - </controller> - <controller type='scsi' index='43' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x02' slot='0x0d' -function='0x0'/> - </controller> - <controller type='scsi' index='44' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x03' -function='0x0'/> - </controller> - <controller type='scsi' index='45' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x09' -function='0x0'/> - </controller> - <controller type='scsi' index='46' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' -function='0x0'/> - </controller> - <controller type='scsi' index='47' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' -function='0x0'/> - </controller> - <controller type='scsi' index='48' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' -function='0x0'/> - </controller> - <controller type='scsi' index='49' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0e' -function='0x0'/> - </controller> - <controller type='scsi' index='50' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' -function='0x0'/> - </controller> - <controller type='scsi' index='51' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x10' -function='0x0'/> - </controller> - <controller type='scsi' index='52' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x11' -function='0x0'/> - </controller> - <controller type='scsi' index='53' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x12' -function='0x0'/> - </controller> - <controller type='scsi' index='54' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' 
slot='0x13' -function='0x0'/> - </controller> - <controller type='scsi' index='55' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x14' -function='0x0'/> - </controller> - <controller type='scsi' index='56' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x15' -function='0x0'/> - </controller> - <controller type='scsi' index='57' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x16' -function='0x0'/> - </controller> - <controller type='scsi' index='58' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x17' -function='0x0'/> - </controller> - <controller type='scsi' index='59' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x18' -function='0x0'/> - </controller> - <controller type='scsi' index='60' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x19' -function='0x0'/> - </controller> - <controller type='scsi' index='61' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1a' -function='0x0'/> - </controller> - <controller type='scsi' index='62' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1b' -function='0x0'/> - </controller> - <controller type='scsi' index='63' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1c' -function='0x0'/> - </controller> - <controller type='scsi' index='64' model='virtio-scsi'> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' -function='0x0'/> - </controller> - <controller type='pci' index='0' model='pci-root'/> - <controller type='pci' index='1' model='pci-bridge'> - <model name='pci-bridge'/> - <target chassisNr='1'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' -function='0x0'/> - </controller> - <controller type='pci' index='2' model='pci-bridge'> - <model name='pci-bridge'/> - <target chassisNr='2'/> - <address type='pci' domain='0x0000' bus='0x01' 
slot='0x1f' -function='0x0'/> - </controller> - </devices> - -vm disks xml (only virtio disks): - <devices> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native'/> - <source file='/vms/tempp/vm-os'/> - <target dev='vda' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x08' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data2'/> - <target dev='vdb' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x06' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data3'/> - <target dev='vdc' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x09' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data4'/> - <target dev='vdd' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data5'/> - <target dev='vde' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data6'/> - <target dev='vdf' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data7'/> - <target dev='vdg' bus='virtio'/> - <address 
type='pci' domain='0x0000' bus='0x00' slot='0x0e' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data8'/> - <target dev='vdh' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data9'/> - <target dev='vdi' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x10' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data10'/> - <target dev='vdj' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x11' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data11'/> - <target dev='vdk' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x12' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data12'/> - <target dev='vdl' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x13' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data13'/> - <target dev='vdm' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x14' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data14'/> - <target dev='vdn' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x00' slot='0x15' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data15'/> - <target dev='vdo' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x16' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data16'/> - <target dev='vdp' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x17' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data17'/> - <target dev='vdq' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x18' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data18'/> - <target dev='vdr' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x19' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data19'/> - <target dev='vds' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1a' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data20'/> - <target dev='vdt' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1b' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data21'/> - <target dev='vdu' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x00' slot='0x1c' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data22'/> - <target dev='vdv' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data23'/> - <target dev='vdw' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data24'/> - <target dev='vdx' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x01' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data25'/> - <target dev='vdy' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x03' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data26'/> - <target dev='vdz' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x04' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data27'/> - <target dev='vdaa' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x05' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data28'/> - <target dev='vdab' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x01' slot='0x06' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data29'/> - <target dev='vdac' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x07' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data30'/> - <target dev='vdad' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x08' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data31'/> - <target dev='vdae' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x09' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data32'/> - <target dev='vdaf' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0a' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data33'/> - <target dev='vdag' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0b' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data34'/> - <target dev='vdah' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0c' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data35'/> - <target dev='vdai' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x01' slot='0x0d' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data36'/> - <target dev='vdaj' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0e' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data37'/> - <target dev='vdak' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x0f' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data38'/> - <target dev='vdal' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x10' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data39'/> - <target dev='vdam' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x11' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data40'/> - <target dev='vdan' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x12' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data41'/> - <target dev='vdao' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x13' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data42'/> - <target dev='vdap' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x01' slot='0x14' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data43'/> - <target dev='vdaq' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x15' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data44'/> - <target dev='vdar' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x16' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data45'/> - <target dev='vdas' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x17' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data46'/> - <target dev='vdat' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x18' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data47'/> - <target dev='vdau' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x19' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data48'/> - <target dev='vdav' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1a' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data49'/> - <target dev='vdaw' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x01' slot='0x1b' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data50'/> - <target dev='vdax' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1c' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data51'/> - <target dev='vday' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1d' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data52'/> - <target dev='vdaz' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1e' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data53'/> - <target dev='vdba' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x01' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data54'/> - <target dev='vdbb' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x02' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data55'/> - <target dev='vdbc' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x03' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data56'/> - <target dev='vdbd' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x02' slot='0x04' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data57'/> - <target dev='vdbe' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x05' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data58'/> - <target dev='vdbf' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x06' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data59'/> - <target dev='vdbg' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x07' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data60'/> - <target dev='vdbh' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x08' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data61'/> - <target dev='vdbi' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x09' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data62'/> - <target dev='vdbj' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x02' slot='0x0a' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data63'/> - <target dev='vdbk' bus='virtio'/> - <address type='pci' 
domain='0x0000' bus='0x02' slot='0x0b' -function='0x0'/> - </disk> - <disk type='file' device='disk'> - <driver name='qemu' type='qcow2' cache='directsync' io='native' -discard='unmap'/> - <source file='/vms/tempp/vm-data1'/> - <target dev='vdbl' bus='virtio'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x03' -function='0x0'/> - </disk> - <controller type='pci' index='0' model='pci-root'/> - <controller type='pci' index='1' model='pci-bridge'> - <model name='pci-bridge'/> - <target chassisNr='1'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' -function='0x0'/> - </controller> - <controller type='pci' index='2' model='pci-bridge'> - <model name='pci-bridge'/> - <target chassisNr='2'/> - <address type='pci' domain='0x0000' bus='0x01' slot='0x1f' -function='0x0'/> - </controller> - </devices> - -> -> (3) migrate vm and vm's disks -> -> -What do you mean by 'and vm disks' - are you doing a block migration? -> -Yes, block migration. -In fact, migrating only the domain also reproduces it. - -> -Dave -> -> -> ---------------------------------------------------------------------- -> -> --------------------------------------------------------------- -> -Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK -------------------------------------------------------------------------------------------------------------------------------------- -本邮件及其附件含有新华三集团的保密信息，仅限于发送给上面地址中列出 -的个人或群组。禁止任何其他人以任何形式使用（包括但不限于全部或部分地泄露、复制、 -或散发）本邮件中的信息。如果您错收了本邮件，请您立即电话或邮件通知发件人并删除本 -邮件！ -This e-mail and its attachments contain confidential information from New H3C, -which is -intended only for the person or entity whose address is listed above. Any use -of the -information contained herein in any way (including, but not limited to, total -or partial -disclosure, reproduction, or dissemination) by persons other than the intended -recipient(s) is prohibited. 
If you receive this e-mail in error, please notify -the sender -by phone or email immediately and delete it! - diff --git a/results/classifier/02/mistranslation/74466963 b/results/classifier/02/mistranslation/74466963 deleted file mode 100644 index 165b54f90..000000000 --- a/results/classifier/02/mistranslation/74466963 +++ /dev/null @@ -1,1879 +0,0 @@ -mistranslation: 0.927 -instruction: 0.903 -boot: 0.894 -semantic: 0.891 -other: 0.877 - -[Qemu-devel] [TCG only][Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration - -Hi all, - -Does anybody remember the similar issue posted by hailiang months ago -http://patchwork.ozlabs.org/patch/454322/ -At least two bugs about migration have been fixed since then. -And now we have found the same issue on a tcg vm (kvm is fine): after -migration, the content of the VM's memory is inconsistent. -We added a patch to check the memory content; you can find it appended below. 
-source side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off -destination side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 -3) start migration -with 1000M NIC, migration will finish within 3 min. - -at source: -(qemu) migrate tcp:192.168.2.66:8881 -after saving ram complete -e9e725df678d392b1a83b3a917f332bb -qemu-system-x86_64: end ram md5 -(qemu) - -at destination: -...skip... -Completed load of VM with exit code 0 seq iteration 1264 -Completed load of VM with exit code 0 seq iteration 1265 -Completed load of VM with exit code 0 seq iteration 1266 -qemu-system-x86_64: after loading state section id 2(ram) -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 -qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init - -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 - -This occurs occasionally and only on a tcg machine. It seems that -some pages dirtied on the source side aren't transferred to the destination. -This problem can be reproduced even if we disable virtio. -Is it OK for some pages not to be transferred to the destination during -migration? Or is it a bug? -Any idea... 
- -=================md5 check patch============================= - -diff --git a/Makefile.target b/Makefile.target -index 962d004..e2cb8e9 100644 ---- a/Makefile.target -+++ b/Makefile.target -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o - obj-y += memory_mapping.o - obj-y += dump.o - obj-y += migration/ram.o migration/savevm.o --LIBS := $(libs_softmmu) $(LIBS) -+LIBS := $(libs_softmmu) $(LIBS) -lplumb - - # xen support - obj-$(CONFIG_XEN) += xen-common.o -diff --git a/migration/ram.c b/migration/ram.c -index 1eb155a..3b7a09d 100644 ---- a/migration/ram.c -+++ b/migration/ram.c -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -version_id) -} - - rcu_read_unlock(); -- DPRINTF("Completed load of VM with exit code %d seq iteration " -+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " - "%" PRIu64 "\n", ret, seq_iter); - return ret; - } -diff --git a/migration/savevm.c b/migration/savevm.c -index 0ad1b93..3feaa61 100644 ---- a/migration/savevm.c -+++ b/migration/savevm.c -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) - - } - -+#include "exec/ram_addr.h" -+#include "qemu/rcu_queue.h" -+#include <clplumbing/md5.h> -+#ifndef MD5_DIGEST_LENGTH -+#define MD5_DIGEST_LENGTH 16 -+#endif -+ -+static void check_host_md5(void) -+{ -+ int i; -+ unsigned char md[MD5_DIGEST_LENGTH]; -+ rcu_read_lock(); -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -'pc.ram' block */ -+ rcu_read_unlock(); -+ -+ MD5(block->host, block->used_length, md); -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -+ fprintf(stderr, "%02x", md[i]); -+ } -+ fprintf(stderr, "\n"); -+ error_report("end ram md5"); -+} -+ - void qemu_savevm_state_begin(QEMUFile *f, - const MigrationParams *params) - { -@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile -*f, bool iterable_only) -save_section_header(f, se, QEMU_VM_SECTION_END); - - ret = se->ops->save_live_complete_precopy(f, se->opaque); -+ -+ fprintf(stderr, 
"after saving %s complete\n", se->idstr); -+ check_host_md5(); -+ - trace_savevm_section_end(se->idstr, se->section_id, ret); - save_section_footer(f, se); - if (ret < 0) { -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -MigrationIncomingState *mis) -section_id, le->se->idstr); - return ret; - } -+ if (section_type == QEMU_VM_SECTION_END) { -+ error_report("after loading state section id %d(%s)", -+ section_id, le->se->idstr); -+ check_host_md5(); -+ } - if (!check_section_footer(f, le)) { - return -EINVAL; - } -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) - } - - cpu_synchronize_all_post_init(); -+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -+ check_host_md5(); - - return ret; - } - -* Li Zhijian (address@hidden) wrote: -> -Hi all, -> -> -Does anyboday remember the similar issue post by hailiang months ago -> -http://patchwork.ozlabs.org/patch/454322/ -> -At least tow bugs about migration had been fixed since that. -Yes, I wondered what happened to that. - -> -And now we found the same issue at the tcg vm(kvm is fine), after migration, -> -the content VM's memory is inconsistent. -Hmm, TCG only - I don't know much about that; but I guess something must -be accessing memory without using the proper macros/functions so -it doesn't mark it as dirty. - -> -we add a patch to check memory content, you can find it from affix -> -> -steps to reporduce: -> -1) apply the patch and re-build qemu -> -2) prepare the ubuntu guest and run memtest in grub. 
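As an aside, the check_host_md5() helper in the patch above amounts to hashing the raw contents of the 'pc.ram' RAMBlock. The same comparison can be done offline on raw guest-RAM dumps with a short script; this is a minimal sketch, not part of the original thread, and the file paths in the usage comment are hypothetical:

```python
import hashlib

def ram_md5(dump_path):
    """Return the hex MD5 of a raw guest-RAM dump, mirroring what
    check_host_md5() computes over the 'pc.ram' block inside QEMU."""
    h = hashlib.md5()
    with open(dump_path, "rb") as f:
        # Hash in 1 MiB chunks so even large dumps fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage (illustrative paths): dump RAM on both sides, then compare:
#   src = ram_md5("/tmp/src-pc.ram")
#   dst = ram_md5("/tmp/dst-pc.ram")
#   print("match" if src == dst else "MISMATCH: %s vs %s" % (src, dst))
```

If the two digests differ, as in the 49c2... vs e9e7... output above, the dumps themselves can then be diffed to locate the pages.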
-> -soruce side: -> -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> -pc-i440fx-2.3,accel=tcg,usb=off -> -> -destination side: -> -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 -> -> -3) start migration -> -with 1000M NIC, migration will finish within 3 min. -> -> -at source: -> -(qemu) migrate tcp:192.168.2.66:8881 -> -after saving ram complete -> -e9e725df678d392b1a83b3a917f332bb -> -qemu-system-x86_64: end ram md5 -> -(qemu) -> -> -at destination: -> -...skip... -> -Completed load of VM with exit code 0 seq iteration 1264 -> -Completed load of VM with exit code 0 seq iteration 1265 -> -Completed load of VM with exit code 0 seq iteration 1266 -> -qemu-system-x86_64: after loading state section id 2(ram) -> -49c2dac7bde0e5e22db7280dcb3824f9 -> -qemu-system-x86_64: end ram md5 -> -qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init -> -> -49c2dac7bde0e5e22db7280dcb3824f9 -> -qemu-system-x86_64: end ram md5 -> -> -This occurs occasionally and only at tcg machine. It seems that -> -some pages dirtied in source side don't transferred to destination. -> -This problem can be reproduced even if we disable virtio. 
-> -> -Is it OK for some pages that not transferred to destination when do -> -migration ? Or is it a bug? -I'm pretty sure that means it's a bug. Hard to find though, I guess -at least memtest is smaller than a big OS. I think I'd dump the whole -of memory on both sides, hexdump and diff them - I'd guess it would -just be one byte/word different, maybe that would offer some idea what -wrote it. - -Dave - -> -Any idea... -> -> -=================md5 check patch============================= -> -> -diff --git a/Makefile.target b/Makefile.target -> -index 962d004..e2cb8e9 100644 -> ---- a/Makefile.target -> -+++ b/Makefile.target -> -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o -> -obj-y += memory_mapping.o -> -obj-y += dump.o -> -obj-y += migration/ram.o migration/savevm.o -> --LIBS := $(libs_softmmu) $(LIBS) -> -+LIBS := $(libs_softmmu) $(LIBS) -lplumb -> -> -# xen support -> -obj-$(CONFIG_XEN) += xen-common.o -> -diff --git a/migration/ram.c b/migration/ram.c -> -index 1eb155a..3b7a09d 100644 -> ---- a/migration/ram.c -> -+++ b/migration/ram.c -> -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -> -version_id) -> -} -> -> -rcu_read_unlock(); -> -- DPRINTF("Completed load of VM with exit code %d seq iteration " -> -+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " -> -"%" PRIu64 "\n", ret, seq_iter); -> -return ret; -> -} -> -diff --git a/migration/savevm.c b/migration/savevm.c -> -index 0ad1b93..3feaa61 100644 -> ---- a/migration/savevm.c -> -+++ b/migration/savevm.c -> -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) -> -> -} -> -> -+#include "exec/ram_addr.h" -> -+#include "qemu/rcu_queue.h" -> -+#include <clplumbing/md5.h> -> -+#ifndef MD5_DIGEST_LENGTH -> -+#define MD5_DIGEST_LENGTH 16 -> -+#endif -> -+ -> -+static void check_host_md5(void) -> -+{ -> -+ int i; -> -+ unsigned char md[MD5_DIGEST_LENGTH]; -> -+ rcu_read_lock(); -> -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check 
-> -'pc.ram' block */ -> -+ rcu_read_unlock(); -> -+ -> -+ MD5(block->host, block->used_length, md); -> -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -> -+ fprintf(stderr, "%02x", md[i]); -> -+ } -> -+ fprintf(stderr, "\n"); -> -+ error_report("end ram md5"); -> -+} -> -+ -> -void qemu_savevm_state_begin(QEMUFile *f, -> -const MigrationParams *params) -> -{ -> -@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, -> -bool iterable_only) -> -save_section_header(f, se, QEMU_VM_SECTION_END); -> -> -ret = se->ops->save_live_complete_precopy(f, se->opaque); -> -+ -> -+ fprintf(stderr, "after saving %s complete\n", se->idstr); -> -+ check_host_md5(); -> -+ -> -trace_savevm_section_end(se->idstr, se->section_id, ret); -> -save_section_footer(f, se); -> -if (ret < 0) { -> -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -> -MigrationIncomingState *mis) -> -section_id, le->se->idstr); -> -return ret; -> -} -> -+ if (section_type == QEMU_VM_SECTION_END) { -> -+ error_report("after loading state section id %d(%s)", -> -+ section_id, le->se->idstr); -> -+ check_host_md5(); -> -+ } -> -if (!check_section_footer(f, le)) { -> -return -EINVAL; -> -} -> -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) -> -} -> -> -cpu_synchronize_all_post_init(); -> -+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -> -+ check_host_md5(); -> -> -return ret; -> -} -> -> -> --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On 2015/12/3 17:24, Dr. David Alan Gilbert wrote: -* Li Zhijian (address@hidden) wrote: -Hi all, - -Does anyboday remember the similar issue post by hailiang months ago -http://patchwork.ozlabs.org/patch/454322/ -At least tow bugs about migration had been fixed since that. -Yes, I wondered what happened to that. -And now we found the same issue at the tcg vm(kvm is fine), after migration, -the content VM's memory is inconsistent. 
-Hmm, TCG only - I don't know much about that; but I guess something must -be accessing memory without using the proper macros/functions so -it doesn't mark it as dirty. -we add a patch to check memory content, you can find it from affix - -steps to reporduce: -1) apply the patch and re-build qemu -2) prepare the ubuntu guest and run memtest in grub. -soruce side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off - -destination side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 - -3) start migration -with 1000M NIC, migration will finish within 3 min. - -at source: -(qemu) migrate tcp:192.168.2.66:8881 -after saving ram complete -e9e725df678d392b1a83b3a917f332bb -qemu-system-x86_64: end ram md5 -(qemu) - -at destination: -...skip... 
Completed load of VM with exit code 0 seq iteration 1264 -Completed load of VM with exit code 0 seq iteration 1265 -Completed load of VM with exit code 0 seq iteration 1266 -qemu-system-x86_64: after loading state section id 2(ram) -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 -qemu-system-x86_64: qemu_loadvm_state: after -cpu_synchronize_all_post_init - -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 - -This occurs occasionally and only at tcg machine. It seems that -some pages dirtied in source side don't transferred to destination. -This problem can be reproduced even if we disable virtio. - -Is it OK for some pages that not transferred to destination when do -migration ? Or is it a bug? -I'm pretty sure that means it's a bug. Hard to find though, I guess -at least memtest is smaller than a big OS. I think I'd dump the whole -of memory on both sides, hexdump and diff them - I'd guess it would -just be one byte/word different, maybe that would offer some idea what -wrote it. -Maybe one better way to do that is with the help of userfaultfd's write-protect -capability. It is still in development by Andrea Arcangeli, but there -is an RFC version available, please refer to -http://www.spinics.net/lists/linux-mm/msg97422.html -(I'm developing live memory snapshot which is based on it, maybe this is another -scene where we -can use userfaultfd's WP ;) ). -Dave -Any idea... 
- -=================md5 check patch============================= - -diff --git a/Makefile.target b/Makefile.target -index 962d004..e2cb8e9 100644 ---- a/Makefile.target -+++ b/Makefile.target -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o - obj-y += memory_mapping.o - obj-y += dump.o - obj-y += migration/ram.o migration/savevm.o --LIBS := $(libs_softmmu) $(LIBS) -+LIBS := $(libs_softmmu) $(LIBS) -lplumb - - # xen support - obj-$(CONFIG_XEN) += xen-common.o -diff --git a/migration/ram.c b/migration/ram.c -index 1eb155a..3b7a09d 100644 ---- a/migration/ram.c -+++ b/migration/ram.c -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -version_id) - } - - rcu_read_unlock(); -- DPRINTF("Completed load of VM with exit code %d seq iteration " -+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " - "%" PRIu64 "\n", ret, seq_iter); - return ret; - } -diff --git a/migration/savevm.c b/migration/savevm.c -index 0ad1b93..3feaa61 100644 ---- a/migration/savevm.c -+++ b/migration/savevm.c -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) - - } - -+#include "exec/ram_addr.h" -+#include "qemu/rcu_queue.h" -+#include <clplumbing/md5.h> -+#ifndef MD5_DIGEST_LENGTH -+#define MD5_DIGEST_LENGTH 16 -+#endif -+ -+static void check_host_md5(void) -+{ -+ int i; -+ unsigned char md[MD5_DIGEST_LENGTH]; -+ rcu_read_lock(); -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -'pc.ram' block */ -+ rcu_read_unlock(); -+ -+ MD5(block->host, block->used_length, md); -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -+ fprintf(stderr, "%02x", md[i]); -+ } -+ fprintf(stderr, "\n"); -+ error_report("end ram md5"); -+} -+ - void qemu_savevm_state_begin(QEMUFile *f, - const MigrationParams *params) - { -@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, -bool iterable_only) - save_section_header(f, se, QEMU_VM_SECTION_END); - - ret = se->ops->save_live_complete_precopy(f, se->opaque); -+ -+ fprintf(stderr, 
"after saving %s complete\n", se->idstr); -+ check_host_md5(); -+ - trace_savevm_section_end(se->idstr, se->section_id, ret); - save_section_footer(f, se); - if (ret < 0) { -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -MigrationIncomingState *mis) - section_id, le->se->idstr); - return ret; - } -+ if (section_type == QEMU_VM_SECTION_END) { -+ error_report("after loading state section id %d(%s)", -+ section_id, le->se->idstr); -+ check_host_md5(); -+ } - if (!check_section_footer(f, le)) { - return -EINVAL; - } -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) - } - - cpu_synchronize_all_post_init(); -+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -+ check_host_md5(); - - return ret; - } --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -. - -On 12/03/2015 05:37 PM, Hailiang Zhang wrote: -On 2015/12/3 17:24, Dr. David Alan Gilbert wrote: -* Li Zhijian (address@hidden) wrote: -Hi all, - -Does anyboday remember the similar issue post by hailiang months ago -http://patchwork.ozlabs.org/patch/454322/ -At least tow bugs about migration had been fixed since that. -Yes, I wondered what happened to that. -And now we found the same issue at the tcg vm(kvm is fine), after -migration, -the content VM's memory is inconsistent. -Hmm, TCG only - I don't know much about that; but I guess something must -be accessing memory without using the proper macros/functions so -it doesn't mark it as dirty. -we add a patch to check memory content, you can find it from affix - -steps to reporduce: -1) apply the patch and re-build qemu -2) prepare the ubuntu guest and run memtest in grub. 
-soruce side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 - --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off - -destination side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 - --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 - -3) start migration -with 1000M NIC, migration will finish within 3 min. - -at source: -(qemu) migrate tcp:192.168.2.66:8881 -after saving ram complete -e9e725df678d392b1a83b3a917f332bb -qemu-system-x86_64: end ram md5 -(qemu) - -at destination: -...skip... -Completed load of VM with exit code 0 seq iteration 1264 -Completed load of VM with exit code 0 seq iteration 1265 -Completed load of VM with exit code 0 seq iteration 1266 -qemu-system-x86_64: after loading state section id 2(ram) -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 -qemu-system-x86_64: qemu_loadvm_state: after -cpu_synchronize_all_post_init - -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 - -This occurs occasionally and only at tcg machine. It seems that -some pages dirtied in source side don't transferred to destination. -This problem can be reproduced even if we disable virtio. - -Is it OK for some pages that not transferred to destination when do -migration ? Or is it a bug? -I'm pretty sure that means it's a bug. 
Hard to find though, I guess -at least memtest is smaller than a big OS. I think I'd dump the whole -of memory on both sides, hexdump and diff them - I'd guess it would -just be one byte/word different, maybe that would offer some idea what -wrote it. -Maybe one better way to do that is with the help of userfaultfd's -write-protect -capability. It is still in the development by Andrea Arcangeli, but there -is a RFC version available, please refer to -http://www.spinics.net/lists/linux-mm/msg97422.html -ï¼I'm developing live memory snapshot which based on it, maybe this is -another scene where we -can use userfaultfd's WP ;) ). -sounds good. - -thanks -Li -Dave -Any idea... - -=================md5 check patch============================= - -diff --git a/Makefile.target b/Makefile.target -index 962d004..e2cb8e9 100644 ---- a/Makefile.target -+++ b/Makefile.target -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o - obj-y += memory_mapping.o - obj-y += dump.o - obj-y += migration/ram.o migration/savevm.o --LIBS := $(libs_softmmu) $(LIBS) -+LIBS := $(libs_softmmu) $(LIBS) -lplumb - - # xen support - obj-$(CONFIG_XEN) += xen-common.o -diff --git a/migration/ram.c b/migration/ram.c -index 1eb155a..3b7a09d 100644 ---- a/migration/ram.c -+++ b/migration/ram.c -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -version_id) - } - - rcu_read_unlock(); -- DPRINTF("Completed load of VM with exit code %d seq iteration " -+ fprintf(stderr, "Completed load of VM with exit code %d seq -iteration " - "%" PRIu64 "\n", ret, seq_iter); - return ret; - } -diff --git a/migration/savevm.c b/migration/savevm.c -index 0ad1b93..3feaa61 100644 ---- a/migration/savevm.c -+++ b/migration/savevm.c -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) - - } - -+#include "exec/ram_addr.h" -+#include "qemu/rcu_queue.h" -+#include <clplumbing/md5.h> -+#ifndef MD5_DIGEST_LENGTH -+#define MD5_DIGEST_LENGTH 16 -+#endif -+ -+static void check_host_md5(void) -+{ -+ int i; 
-+ unsigned char md[MD5_DIGEST_LENGTH]; -+ rcu_read_lock(); -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -'pc.ram' block */ -+ rcu_read_unlock(); -+ -+ MD5(block->host, block->used_length, md); -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -+ fprintf(stderr, "%02x", md[i]); -+ } -+ fprintf(stderr, "\n"); -+ error_report("end ram md5"); -+} -+ - void qemu_savevm_state_begin(QEMUFile *f, - const MigrationParams *params) - { -@@ -1056,6 +1079,10 @@ void -qemu_savevm_state_complete_precopy(QEMUFile *f, -bool iterable_only) - save_section_header(f, se, QEMU_VM_SECTION_END); - - ret = se->ops->save_live_complete_precopy(f, se->opaque); -+ -+ fprintf(stderr, "after saving %s complete\n", se->idstr); -+ check_host_md5(); -+ - trace_savevm_section_end(se->idstr, se->section_id, ret); - save_section_footer(f, se); - if (ret < 0) { -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -MigrationIncomingState *mis) - section_id, le->se->idstr); - return ret; - } -+ if (section_type == QEMU_VM_SECTION_END) { -+ error_report("after loading state section id %d(%s)", -+ section_id, le->se->idstr); -+ check_host_md5(); -+ } - if (!check_section_footer(f, le)) { - return -EINVAL; - } -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) - } - - cpu_synchronize_all_post_init(); -+ error_report("%s: after cpu_synchronize_all_post_init\n", -__func__); -+ check_host_md5(); - - return ret; - } --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -. -. --- -Best regards. -Li Zhijian (8555) - -On 12/03/2015 05:24 PM, Dr. David Alan Gilbert wrote: -* Li Zhijian (address@hidden) wrote: -Hi all, - -Does anyboday remember the similar issue post by hailiang months ago -http://patchwork.ozlabs.org/patch/454322/ -At least tow bugs about migration had been fixed since that. -Yes, I wondered what happened to that. -And now we found the same issue at the tcg vm(kvm is fine), after migration, -the content VM's memory is inconsistent. 
-Hmm, TCG only - I don't know much about that; but I guess something must -be accessing memory without using the proper macros/functions so -it doesn't mark it as dirty. -we add a patch to check memory content, you can find it from affix - -steps to reporduce: -1) apply the patch and re-build qemu -2) prepare the ubuntu guest and run memtest in grub. -soruce side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off - -destination side: -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 - -3) start migration -with 1000M NIC, migration will finish within 3 min. - -at source: -(qemu) migrate tcp:192.168.2.66:8881 -after saving ram complete -e9e725df678d392b1a83b3a917f332bb -qemu-system-x86_64: end ram md5 -(qemu) - -at destination: -...skip... 
Completed load of VM with exit code 0 seq iteration 1264 -Completed load of VM with exit code 0 seq iteration 1265 -Completed load of VM with exit code 0 seq iteration 1266 -qemu-system-x86_64: after loading state section id 2(ram) -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 -qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init - -49c2dac7bde0e5e22db7280dcb3824f9 -qemu-system-x86_64: end ram md5 - -This occurs occasionally and only at tcg machine. It seems that -some pages dirtied in source side don't transferred to destination. -This problem can be reproduced even if we disable virtio. - -Is it OK for some pages that not transferred to destination when do -migration ? Or is it a bug? -I'm pretty sure that means it's a bug. Hard to find though, I guess -at least memtest is smaller than a big OS. I think I'd dump the whole -of memory on both sides, hexdump and diff them - I'd guess it would -just be one byte/word different, maybe that would offer some idea what -wrote it. -I tried to dump and compare them; more than 10 pages are different. -On the source side they are random values, rather than always 'FF' 'FB' 'EF' -'BF'... as on the destination. -And not all of the differing pages are contiguous. - -thanks -Li -Dave -Any idea... 
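The dump-and-diff step described above can be sketched as a small script (a hypothetical helper, not part of the thread, assuming both sides' guest RAM was dumped to equal-sized raw files):

```python
PAGE = 4096  # x86 target page size

def diff_pages(src_path, dst_path):
    """Compare two raw RAM dumps page by page and return the indices
    of the pages whose contents differ."""
    bad = []
    with open(src_path, "rb") as src, open(dst_path, "rb") as dst:
        idx = 0
        while True:
            a = src.read(PAGE)
            b = dst.read(PAGE)
            if not a and not b:
                break
            if a != b:
                bad.append(idx)
            idx += 1
    return bad

# e.g. diff_pages("/tmp/src-pc.ram", "/tmp/dst-pc.ram") yields the page
# indices that never made it across, ready for a targeted hexdump.
```

Reporting page indices rather than byte offsets matches how migration tracks dirtiness (per target page), so each hit is a candidate page whose dirty bit was lost.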
- -=================md5 check patch============================= - -diff --git a/Makefile.target b/Makefile.target -index 962d004..e2cb8e9 100644 ---- a/Makefile.target -+++ b/Makefile.target -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o - obj-y += memory_mapping.o - obj-y += dump.o - obj-y += migration/ram.o migration/savevm.o --LIBS := $(libs_softmmu) $(LIBS) -+LIBS := $(libs_softmmu) $(LIBS) -lplumb - - # xen support - obj-$(CONFIG_XEN) += xen-common.o -diff --git a/migration/ram.c b/migration/ram.c -index 1eb155a..3b7a09d 100644 ---- a/migration/ram.c -+++ b/migration/ram.c -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -version_id) - } - - rcu_read_unlock(); -- DPRINTF("Completed load of VM with exit code %d seq iteration " -+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " - "%" PRIu64 "\n", ret, seq_iter); - return ret; - } -diff --git a/migration/savevm.c b/migration/savevm.c -index 0ad1b93..3feaa61 100644 ---- a/migration/savevm.c -+++ b/migration/savevm.c -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) - - } - -+#include "exec/ram_addr.h" -+#include "qemu/rcu_queue.h" -+#include <clplumbing/md5.h> -+#ifndef MD5_DIGEST_LENGTH -+#define MD5_DIGEST_LENGTH 16 -+#endif -+ -+static void check_host_md5(void) -+{ -+ int i; -+ unsigned char md[MD5_DIGEST_LENGTH]; -+ rcu_read_lock(); -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -'pc.ram' block */ -+ rcu_read_unlock(); -+ -+ MD5(block->host, block->used_length, md); -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -+ fprintf(stderr, "%02x", md[i]); -+ } -+ fprintf(stderr, "\n"); -+ error_report("end ram md5"); -+} -+ - void qemu_savevm_state_begin(QEMUFile *f, - const MigrationParams *params) - { -@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, -bool iterable_only) - save_section_header(f, se, QEMU_VM_SECTION_END); - - ret = se->ops->save_live_complete_precopy(f, se->opaque); -+ -+ fprintf(stderr, 
"after saving %s complete\n", se->idstr); -+ check_host_md5(); -+ - trace_savevm_section_end(se->idstr, se->section_id, ret); - save_section_footer(f, se); - if (ret < 0) { -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -MigrationIncomingState *mis) - section_id, le->se->idstr); - return ret; - } -+ if (section_type == QEMU_VM_SECTION_END) { -+ error_report("after loading state section id %d(%s)", -+ section_id, le->se->idstr); -+ check_host_md5(); -+ } - if (!check_section_footer(f, le)) { - return -EINVAL; - } -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) - } - - cpu_synchronize_all_post_init(); -+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -+ check_host_md5(); - - return ret; - } --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - - -. --- -Best regards. -Li Zhijian (8555) - -* Li Zhijian (address@hidden) wrote: -> -> -> -On 12/03/2015 05:24 PM, Dr. David Alan Gilbert wrote: -> ->* Li Zhijian (address@hidden) wrote: -> ->>Hi all, -> ->> -> ->>Does anyboday remember the similar issue post by hailiang months ago -> ->> -http://patchwork.ozlabs.org/patch/454322/ -> ->>At least tow bugs about migration had been fixed since that. -> -> -> ->Yes, I wondered what happened to that. -> -> -> ->>And now we found the same issue at the tcg vm(kvm is fine), after migration, -> ->>the content VM's memory is inconsistent. -> -> -> ->Hmm, TCG only - I don't know much about that; but I guess something must -> ->be accessing memory without using the proper macros/functions so -> ->it doesn't mark it as dirty. -> -> -> ->>we add a patch to check memory content, you can find it from affix -> ->> -> ->>steps to reporduce: -> ->>1) apply the patch and re-build qemu -> ->>2) prepare the ubuntu guest and run memtest in grub. 
-> ->>soruce side: -> ->>x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> ->>e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> ->>if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> ->>virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> ->>-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> ->>tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> ->>pc-i440fx-2.3,accel=tcg,usb=off -> ->> -> ->>destination side: -> ->>x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> ->>e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> ->>if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> ->>virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> ->>-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> ->>tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> ->>pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 -> ->> -> ->>3) start migration -> ->>with 1000M NIC, migration will finish within 3 min. -> ->> -> ->>at source: -> ->>(qemu) migrate tcp:192.168.2.66:8881 -> ->>after saving ram complete -> ->>e9e725df678d392b1a83b3a917f332bb -> ->>qemu-system-x86_64: end ram md5 -> ->>(qemu) -> ->> -> ->>at destination: -> ->>...skip... -> ->>Completed load of VM with exit code 0 seq iteration 1264 -> ->>Completed load of VM with exit code 0 seq iteration 1265 -> ->>Completed load of VM with exit code 0 seq iteration 1266 -> ->>qemu-system-x86_64: after loading state section id 2(ram) -> ->>49c2dac7bde0e5e22db7280dcb3824f9 -> ->>qemu-system-x86_64: end ram md5 -> ->>qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init -> ->> -> ->>49c2dac7bde0e5e22db7280dcb3824f9 -> ->>qemu-system-x86_64: end ram md5 -> ->> -> ->>This occurs occasionally and only at tcg machine. It seems that -> ->>some pages dirtied in source side don't transferred to destination. 
-> ->>This problem can be reproduced even if we disable virtio. -> ->> -> ->>Is it OK for some pages that not transferred to destination when do -> ->>migration ? Or is it a bug? -> -> -> ->I'm pretty sure that means it's a bug. Hard to find though, I guess -> ->at least memtest is smaller than a big OS. I think I'd dump the whole -> ->of memory on both sides, hexdump and diff them - I'd guess it would -> ->just be one byte/word different, maybe that would offer some idea what -> ->wrote it. -> -> -I try to dump and compare them, more than 10 pages are different. -> -in source side, they are random value rather than always 'FF' 'FB' 'EF' -> -'BF'... in destination. -> -> -and not all of the different pages are continuous. -I wonder if it happens on all of memtest's different test patterns, -perhaps it might be possible to narrow it down if you tell memtest -to only run one test at a time. - -Dave - -> -> -thanks -> -Li -> -> -> -> -> ->Dave -> -> -> ->>Any idea... -> ->> -> ->>=================md5 check patch============================= -> ->> -> ->>diff --git a/Makefile.target b/Makefile.target -> ->>index 962d004..e2cb8e9 100644 -> ->>--- a/Makefile.target -> ->>+++ b/Makefile.target -> ->>@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o -> ->> obj-y += memory_mapping.o -> ->> obj-y += dump.o -> ->> obj-y += migration/ram.o migration/savevm.o -> ->>-LIBS := $(libs_softmmu) $(LIBS) -> ->>+LIBS := $(libs_softmmu) $(LIBS) -lplumb -> ->> -> ->> # xen support -> ->> obj-$(CONFIG_XEN) += xen-common.o -> ->>diff --git a/migration/ram.c b/migration/ram.c -> ->>index 1eb155a..3b7a09d 100644 -> ->>--- a/migration/ram.c -> ->>+++ b/migration/ram.c -> ->>@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int -> ->>version_id) -> ->> } -> ->> -> ->> rcu_read_unlock(); -> ->>- DPRINTF("Completed load of VM with exit code %d seq iteration " -> ->>+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " -> ->> "%" PRIu64 "\n", ret, seq_iter); -> 
->> return ret; -> ->> } -> ->>diff --git a/migration/savevm.c b/migration/savevm.c -> ->>index 0ad1b93..3feaa61 100644 -> ->>--- a/migration/savevm.c -> ->>+++ b/migration/savevm.c -> ->>@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) -> ->> -> ->> } -> ->> -> ->>+#include "exec/ram_addr.h" -> ->>+#include "qemu/rcu_queue.h" -> ->>+#include <clplumbing/md5.h> -> ->>+#ifndef MD5_DIGEST_LENGTH -> ->>+#define MD5_DIGEST_LENGTH 16 -> ->>+#endif -> ->>+ -> ->>+static void check_host_md5(void) -> ->>+{ -> ->>+ int i; -> ->>+ unsigned char md[MD5_DIGEST_LENGTH]; -> ->>+ rcu_read_lock(); -> ->>+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -> ->>'pc.ram' block */ -> ->>+ rcu_read_unlock(); -> ->>+ -> ->>+ MD5(block->host, block->used_length, md); -> ->>+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -> ->>+ fprintf(stderr, "%02x", md[i]); -> ->>+ } -> ->>+ fprintf(stderr, "\n"); -> ->>+ error_report("end ram md5"); -> ->>+} -> ->>+ -> ->> void qemu_savevm_state_begin(QEMUFile *f, -> ->> const MigrationParams *params) -> ->> { -> ->>@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, -> ->>bool iterable_only) -> ->> save_section_header(f, se, QEMU_VM_SECTION_END); -> ->> -> ->> ret = se->ops->save_live_complete_precopy(f, se->opaque); -> ->>+ -> ->>+ fprintf(stderr, "after saving %s complete\n", se->idstr); -> ->>+ check_host_md5(); -> ->>+ -> ->> trace_savevm_section_end(se->idstr, se->section_id, ret); -> ->> save_section_footer(f, se); -> ->> if (ret < 0) { -> ->>@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -> ->>MigrationIncomingState *mis) -> ->> section_id, le->se->idstr); -> ->> return ret; -> ->> } -> ->>+ if (section_type == QEMU_VM_SECTION_END) { -> ->>+ error_report("after loading state section id %d(%s)", -> ->>+ section_id, le->se->idstr); -> ->>+ check_host_md5(); -> ->>+ } -> ->> if (!check_section_footer(f, le)) { -> ->> return -EINVAL; -> ->> } -> ->>@@ -1901,6 +1933,8 @@ int 
qemu_loadvm_state(QEMUFile *f) -> ->> } -> ->> -> ->> cpu_synchronize_all_post_init(); -> ->>+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -> ->>+ check_host_md5(); -> ->> -> ->> return ret; -> ->> } -> ->> -> ->> -> ->> -> ->-- -> ->Dr. David Alan Gilbert / address@hidden / Manchester, UK -> -> -> -> -> ->. -> -> -> -> --- -> -Best regards. -> -Li Zhijian (8555) -> -> --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -Li Zhijian <address@hidden> wrote: -> -Hi all, -> -> -Does anyboday remember the similar issue post by hailiang months ago -> -http://patchwork.ozlabs.org/patch/454322/ -> -At least tow bugs about migration had been fixed since that. -> -> -And now we found the same issue at the tcg vm(kvm is fine), after -> -migration, the content VM's memory is inconsistent. -> -> -we add a patch to check memory content, you can find it from affix -> -> -steps to reporduce: -> -1) apply the patch and re-build qemu -> -2) prepare the ubuntu guest and run memtest in grub. 
-> -soruce side: -> -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> -pc-i440fx-2.3,accel=tcg,usb=off -> -> -destination side: -> -x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device -> -e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive -> -if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device -> -virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -> --vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp -> -tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine -> -pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881 -> -> -3) start migration -> -with 1000M NIC, migration will finish within 3 min. -> -> -at source: -> -(qemu) migrate tcp:192.168.2.66:8881 -> -after saving ram complete -> -e9e725df678d392b1a83b3a917f332bb -> -qemu-system-x86_64: end ram md5 -> -(qemu) -> -> -at destination: -> -...skip... -> -Completed load of VM with exit code 0 seq iteration 1264 -> -Completed load of VM with exit code 0 seq iteration 1265 -> -Completed load of VM with exit code 0 seq iteration 1266 -> -qemu-system-x86_64: after loading state section id 2(ram) -> -49c2dac7bde0e5e22db7280dcb3824f9 -> -qemu-system-x86_64: end ram md5 -> -qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init -> -> -49c2dac7bde0e5e22db7280dcb3824f9 -> -qemu-system-x86_64: end ram md5 -> -> -This occurs occasionally and only at tcg machine. It seems that -> -some pages dirtied in source side don't transferred to destination. -> -This problem can be reproduced even if we disable virtio. 
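
The check_host_md5() helper in the debugging patch above amounts to hashing the raw contents of the 'pc.ram' block; the same comparison can be done offline on raw RAM dumps taken on the source and destination (e.g. via the monitor's pmemsave command). A minimal sketch in Python — the dump paths are hypothetical, not part of the report:

```python
import hashlib
import sys

def ram_md5(path: str) -> str:
    """Return the MD5 of a raw RAM dump, mirroring what check_host_md5()
    computes over the 'pc.ram' block inside QEMU."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large dumps don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__" and len(sys.argv) >= 3:
    # Dumps taken on source and destination (paths are hypothetical).
    src, dst = sys.argv[1], sys.argv[2]
    print("source     :", ram_md5(src))
    print("destination:", ram_md5(dst))
```

If the two digests differ after migration has completed, some dirtied pages were not transferred, matching the mismatch shown in the log excerpts above.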
-> -> -Is it OK for some pages that not transferred to destination when do -> -migration ? Or is it a bug? -> -> -Any idea... -Thanks for describing how to reproduce the bug. -If some pages are not transferred to destination then it is a bug, so we -need to know what the problem is, notice that the problem can be that -TCG is not marking dirty some page, that Migration code "forgets" about -that page, or anything eles altogether, that is what we need to find. - -There are more posibilities, I am not sure that memtest is on 32bit -mode, and it is inside posibility that we are missing some state when we -are on real mode. - -Will try to take a look at this. - -THanks, again. - - -> -> -=================md5 check patch============================= -> -> -diff --git a/Makefile.target b/Makefile.target -> -index 962d004..e2cb8e9 100644 -> ---- a/Makefile.target -> -+++ b/Makefile.target -> -@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o -> -obj-y += memory_mapping.o -> -obj-y += dump.o -> -obj-y += migration/ram.o migration/savevm.o -> --LIBS := $(libs_softmmu) $(LIBS) -> -+LIBS := $(libs_softmmu) $(LIBS) -lplumb -> -> -# xen support -> -obj-$(CONFIG_XEN) += xen-common.o -> -diff --git a/migration/ram.c b/migration/ram.c -> -index 1eb155a..3b7a09d 100644 -> ---- a/migration/ram.c -> -+++ b/migration/ram.c -> -@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, -> -int version_id) -> -} -> -> -rcu_read_unlock(); -> -- DPRINTF("Completed load of VM with exit code %d seq iteration " -> -+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration " -> -"%" PRIu64 "\n", ret, seq_iter); -> -return ret; -> -} -> -diff --git a/migration/savevm.c b/migration/savevm.c -> -index 0ad1b93..3feaa61 100644 -> ---- a/migration/savevm.c -> -+++ b/migration/savevm.c -> -@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f) -> -> -} -> -> -+#include "exec/ram_addr.h" -> -+#include "qemu/rcu_queue.h" -> -+#include <clplumbing/md5.h> -> -+#ifndef 
MD5_DIGEST_LENGTH -> -+#define MD5_DIGEST_LENGTH 16 -> -+#endif -> -+ -> -+static void check_host_md5(void) -> -+{ -> -+ int i; -> -+ unsigned char md[MD5_DIGEST_LENGTH]; -> -+ rcu_read_lock(); -> -+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check -> -'pc.ram' block */ -> -+ rcu_read_unlock(); -> -+ -> -+ MD5(block->host, block->used_length, md); -> -+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) { -> -+ fprintf(stderr, "%02x", md[i]); -> -+ } -> -+ fprintf(stderr, "\n"); -> -+ error_report("end ram md5"); -> -+} -> -+ -> -void qemu_savevm_state_begin(QEMUFile *f, -> -const MigrationParams *params) -> -{ -> -@@ -1056,6 +1079,10 @@ void -> -qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only) -> -save_section_header(f, se, QEMU_VM_SECTION_END); -> -> -ret = se->ops->save_live_complete_precopy(f, se->opaque); -> -+ -> -+ fprintf(stderr, "after saving %s complete\n", se->idstr); -> -+ check_host_md5(); -> -+ -> -trace_savevm_section_end(se->idstr, se->section_id, ret); -> -save_section_footer(f, se); -> -if (ret < 0) { -> -@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, -> -MigrationIncomingState *mis) -> -section_id, le->se->idstr); -> -return ret; -> -} -> -+ if (section_type == QEMU_VM_SECTION_END) { -> -+ error_report("after loading state section id %d(%s)", -> -+ section_id, le->se->idstr); -> -+ check_host_md5(); -> -+ } -> -if (!check_section_footer(f, le)) { -> -return -EINVAL; -> -} -> -@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f) -> -} -> -> -cpu_synchronize_all_post_init(); -> -+ error_report("%s: after cpu_synchronize_all_post_init\n", __func__); -> -+ check_host_md5(); -> -> -return ret; -> -} - -> -> -Thanks for describing how to reproduce the bug. 
-> -If some pages are not transferred to destination then it is a bug, so we need -> -to know what the problem is, notice that the problem can be that TCG is not -> -marking dirty some page, that Migration code "forgets" about that page, or -> -anything eles altogether, that is what we need to find. -> -> -There are more posibilities, I am not sure that memtest is on 32bit mode, and -> -it is inside posibility that we are missing some state when we are on real -> -mode. -> -> -Will try to take a look at this. -> -> -THanks, again. -> -Hi Juan & Amit - - Do you think we should add a mechanism to check the data integrity during LM -like Zhijian's patch did? it may be very helpful for developers. - Actually, I did the similar thing before in order to make sure that I did the -right thing we I change the code related to LM. - -Liang - -On (Fri) 04 Dec 2015 [01:43:07], Li, Liang Z wrote: -> -> -> -> Thanks for describing how to reproduce the bug. -> -> If some pages are not transferred to destination then it is a bug, so we -> -> need -> -> to know what the problem is, notice that the problem can be that TCG is not -> -> marking dirty some page, that Migration code "forgets" about that page, or -> -> anything eles altogether, that is what we need to find. -> -> -> -> There are more posibilities, I am not sure that memtest is on 32bit mode, -> -> and -> -> it is inside posibility that we are missing some state when we are on real -> -> mode. -> -> -> -> Will try to take a look at this. -> -> -> -> THanks, again. -> -> -> -> -Hi Juan & Amit -> -> -Do you think we should add a mechanism to check the data integrity during LM -> -like Zhijian's patch did? it may be very helpful for developers. -> -Actually, I did the similar thing before in order to make sure that I did -> -the right thing we I change the code related to LM. -If you mean for debugging, something that's not always on, then I'm -fine with it. 
- -A script that goes along that shows the result of comparison of the -diff will be helpful too, something that shows how many pages are -differnt, how many bytes in a page on average, and so on. - - Amit - diff --git a/results/classifier/02/mistranslation/74545755 b/results/classifier/02/mistranslation/74545755 deleted file mode 100644 index d1694069c..000000000 --- a/results/classifier/02/mistranslation/74545755 +++ /dev/null @@ -1,345 +0,0 @@ -mistranslation: 0.752 -instruction: 0.700 -other: 0.683 -semantic: 0.669 -boot: 0.607 - -[Bug Report][RFC PATCH 0/1] block: fix failing assert on paused VM migration - -There's a bug (failing assert) which is reproduced during migration of -a paused VM. I am able to reproduce it on a stand with 2 nodes and a common -NFS share, with VM's disk on that share. - -root@fedora40-1-vm:~# virsh domblklist alma8-vm - Target Source ------------------------------------------- - sda /mnt/shared/images/alma8.qcow2 - -root@fedora40-1-vm:~# df -Th /mnt/shared -Filesystem Type Size Used Avail Use% Mounted on -127.0.0.1:/srv/nfsd nfs4 63G 16G 48G 25% /mnt/shared - -On the 1st node: - -root@fedora40-1-vm:~# virsh start alma8-vm ; virsh suspend alma8-vm -root@fedora40-1-vm:~# virsh migrate --compressed --p2p --persistent ---undefinesource --live alma8-vm qemu+ssh://fedora40-2-vm/system - -Then on the 2nd node: - -root@fedora40-2-vm:~# virsh migrate --compressed --p2p --persistent ---undefinesource --live alma8-vm qemu+ssh://fedora40-1-vm/system -error: operation failed: domain is not running - -root@fedora40-2-vm:~# tail -3 /var/log/libvirt/qemu/alma8-vm.log -2024-09-19 13:53:33.336+0000: initiating migration -qemu-system-x86_64: ../block.c:6976: int -bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & -BDRV_O_INACTIVE)' failed. 
-2024-09-19 13:53:42.991+0000: shutting down, reason=crashed - -Backtrace: - -(gdb) bt -#0 0x00007f7eaa2f1664 in __pthread_kill_implementation () at /lib64/libc.so.6 -#1 0x00007f7eaa298c4e in raise () at /lib64/libc.so.6 -#2 0x00007f7eaa280902 in abort () at /lib64/libc.so.6 -#3 0x00007f7eaa28081e in __assert_fail_base.cold () at /lib64/libc.so.6 -#4 0x00007f7eaa290d87 in __assert_fail () at /lib64/libc.so.6 -#5 0x0000563c38b95eb8 in bdrv_inactivate_recurse (bs=0x563c3b6c60c0) at -../block.c:6976 -#6 0x0000563c38b95aeb in bdrv_inactivate_all () at ../block.c:7038 -#7 0x0000563c3884d354 in qemu_savevm_state_complete_precopy_non_iterable -(f=0x563c3b700c20, in_postcopy=false, inactivate_disks=true) - at ../migration/savevm.c:1571 -#8 0x0000563c3884dc1a in qemu_savevm_state_complete_precopy (f=0x563c3b700c20, -iterable_only=false, inactivate_disks=true) at ../migration/savevm.c:1631 -#9 0x0000563c3883a340 in migration_completion_precopy (s=0x563c3b4d51f0, -current_active_state=<optimized out>) at ../migration/migration.c:2780 -#10 migration_completion (s=0x563c3b4d51f0) at ../migration/migration.c:2844 -#11 migration_iteration_run (s=0x563c3b4d51f0) at ../migration/migration.c:3270 -#12 migration_thread (opaque=0x563c3b4d51f0) at ../migration/migration.c:3536 -#13 0x0000563c38dbcf14 in qemu_thread_start (args=0x563c3c2d5bf0) at -../util/qemu-thread-posix.c:541 -#14 0x00007f7eaa2ef6d7 in start_thread () at /lib64/libc.so.6 -#15 0x00007f7eaa373414 in clone () at /lib64/libc.so.6 - -What happens here is that after 1st migration BDS related to HDD remains -inactive as VM is still paused. Then when we initiate 2nd migration, -bdrv_inactivate_all() leads to the attempt to set BDRV_O_INACTIVE flag -on that node which is already set, thus assert fails. - -Attached patch which simply skips setting flag if it's already set is more -of a kludge than a clean solution. 
Should we use more sophisticated logic -which allows some of the nodes be in inactive state prior to the migration, -and takes them into account during bdrv_inactivate_all()? Comments would -be appreciated. - -Andrey - -Andrey Drobyshev (1): - block: do not fail when inactivating node which is inactive - - block.c | 10 +++++++++- - 1 file changed, 9 insertions(+), 1 deletion(-) - --- -2.39.3 - -Instead of throwing an assert let's just ignore that flag is already set -and return. We assume that it's going to be safe to ignore. Otherwise -this assert fails when migrating a paused VM back and forth. - -Ideally we'd like to have a more sophisticated solution, e.g. not even -scan the nodes which should be inactive at this point. - -Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> ---- - block.c | 10 +++++++++- - 1 file changed, 9 insertions(+), 1 deletion(-) - -diff --git a/block.c b/block.c -index 7d90007cae..c1dcf906d1 100644 ---- a/block.c -+++ b/block.c -@@ -6973,7 +6973,15 @@ static int GRAPH_RDLOCK -bdrv_inactivate_recurse(BlockDriverState *bs) - return 0; - } - -- assert(!(bs->open_flags & BDRV_O_INACTIVE)); -+ if (bs->open_flags & BDRV_O_INACTIVE) { -+ /* -+ * Return here instead of throwing assert as a workaround to -+ * prevent failure on migrating paused VM. -+ * Here we assume that if we're trying to inactivate BDS that's -+ * already inactive, it's safe to just ignore it. -+ */ -+ return 0; -+ } - - /* Inactivate this node */ - if (bs->drv->bdrv_inactivate) { --- -2.39.3 - -[add migration maintainers] - -On 24.09.24 15:56, Andrey Drobyshev wrote: -Instead of throwing an assert let's just ignore that flag is already set -and return. We assume that it's going to be safe to ignore. Otherwise -this assert fails when migrating a paused VM back and forth. - -Ideally we'd like to have a more sophisticated solution, e.g. not even -scan the nodes which should be inactive at this point. 
- -Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> ---- - block.c | 10 +++++++++- - 1 file changed, 9 insertions(+), 1 deletion(-) - -diff --git a/block.c b/block.c -index 7d90007cae..c1dcf906d1 100644 ---- a/block.c -+++ b/block.c -@@ -6973,7 +6973,15 @@ static int GRAPH_RDLOCK -bdrv_inactivate_recurse(BlockDriverState *bs) - return 0; - } -- assert(!(bs->open_flags & BDRV_O_INACTIVE)); -+ if (bs->open_flags & BDRV_O_INACTIVE) { -+ /* -+ * Return here instead of throwing assert as a workaround to -+ * prevent failure on migrating paused VM. -+ * Here we assume that if we're trying to inactivate BDS that's -+ * already inactive, it's safe to just ignore it. -+ */ -+ return 0; -+ } -/* Inactivate this node */ -if (bs->drv->bdrv_inactivate) { -I doubt that this a correct way to go. - -As far as I understand, "inactive" actually means that "storage is not belong to -qemu, but to someone else (another qemu process for example), and may be changed -transparently". In turn this means that Qemu should do nothing with inactive disks. So the -problem is that nobody called bdrv_activate_all on target, and we shouldn't ignore that. - -Hmm, I see in process_incoming_migration_bh() we do call bdrv_activate_all(), -but only in some scenarios. May be, the condition should be less strict here. - -Why we need any condition here at all? Don't we want to activate block-layer on -target after migration anyway? - --- -Best regards, -Vladimir - -On 9/30/24 12:25 PM, Vladimir Sementsov-Ogievskiy wrote: -> -[add migration maintainers] -> -> -On 24.09.24 15:56, Andrey Drobyshev wrote: -> -> [...] -> -> -I doubt that this a correct way to go. -> -> -As far as I understand, "inactive" actually means that "storage is not -> -belong to qemu, but to someone else (another qemu process for example), -> -and may be changed transparently". In turn this means that Qemu should -> -do nothing with inactive disks. 
So the problem is that nobody called -> -bdrv_activate_all on target, and we shouldn't ignore that. -> -> -Hmm, I see in process_incoming_migration_bh() we do call -> -bdrv_activate_all(), but only in some scenarios. May be, the condition -> -should be less strict here. -> -> -Why we need any condition here at all? Don't we want to activate -> -block-layer on target after migration anyway? -> -Hmm I'm not sure about the unconditional activation, since we at least -have to honor LATE_BLOCK_ACTIVATE cap if it's set (and probably delay it -in such a case). In current libvirt upstream I see such code: - -> -/* Migration capabilities which should always be enabled as long as they -> -> -* are supported by QEMU. If the capability is supposed to be enabled on both -> -> -* sides of migration, it won't be enabled unless both sides support it. -> -> -*/ -> -> -static const qemuMigrationParamsAlwaysOnItem qemuMigrationParamsAlwaysOn[] = -> -{ -> -> -{QEMU_MIGRATION_CAP_PAUSE_BEFORE_SWITCHOVER, -> -> -QEMU_MIGRATION_SOURCE}, -> -> -> -> -{QEMU_MIGRATION_CAP_LATE_BLOCK_ACTIVATE, -> -> -QEMU_MIGRATION_DESTINATION}, -> -> -}; -which means that libvirt always wants LATE_BLOCK_ACTIVATE to be set. - -The code from process_incoming_migration_bh() you're referring to: - -> -/* If capability late_block_activate is set: -> -> -* Only fire up the block code now if we're going to restart the -> -> -* VM, else 'cont' will do it. -> -> -* This causes file locking to happen; so we don't want it to happen -> -> -* unless we really are starting the VM. -> -> -*/ -> -> -if (!migrate_late_block_activate() || -> -> -(autostart && (!global_state_received() || -> -> -runstate_is_live(global_state_get_runstate())))) { -> -> -/* Make sure all file formats throw away their mutable metadata. -> -> -> -* If we get an error here, just don't restart the VM yet. 
*/ -> -> -bdrv_activate_all(&local_err); -> -> -if (local_err) { -> -> -error_report_err(local_err); -> -> -local_err = NULL; -> -> -autostart = false; -> -> -} -> -> -} -It states explicitly that we're either going to start VM right at this -point if (autostart == true), or we wait till "cont" command happens. -None of this is going to happen if we start another migration while -still being in PAUSED state. So I think it seems reasonable to take -such case into account. For instance, this patch does prevent the crash: - -> -diff --git a/migration/migration.c b/migration/migration.c -> -index ae2be31557..3222f6745b 100644 -> ---- a/migration/migration.c -> -+++ b/migration/migration.c -> -@@ -733,7 +733,8 @@ static void process_incoming_migration_bh(void *opaque) -> -*/ -> -if (!migrate_late_block_activate() || -> -(autostart && (!global_state_received() || -> -- runstate_is_live(global_state_get_runstate())))) { -> -+ runstate_is_live(global_state_get_runstate()))) || -> -+ (!autostart && global_state_get_runstate() == RUN_STATE_PAUSED)) { -> -/* Make sure all file formats throw away their mutable metadata. -> -* If we get an error here, just don't restart the VM yet. */ -> -bdrv_activate_all(&local_err); -What are your thoughts on it? - -Andrey - diff --git a/results/classifier/02/mistranslation/80604314 b/results/classifier/02/mistranslation/80604314 deleted file mode 100644 index 8d80d133f..000000000 --- a/results/classifier/02/mistranslation/80604314 +++ /dev/null @@ -1,1481 +0,0 @@ -mistranslation: 0.922 -other: 0.898 -semantic: 0.890 -instruction: 0.877 -boot: 0.860 - -[BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device - -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. 
gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) - -Starting qemu with no additional "-device virtio-net-ccw" (i.e., only -the autogenerated virtio-net-ccw device is present) works. Specifying -several "-device virtio-net-pci" works as well. - -Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net -client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config") -works (in-between state does not compile). - -This is reproducible with tcg as well. Same problem both with ---enable-vhost-vdpa and --disable-vhost-vdpa. - -Have not yet tried to figure out what might be special with -virtio-ccw... anyone have an idea? - -[This should probably be considered a blocker?] - -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> -When I start qemu with a second virtio-net-ccw device (i.e. adding -> --device virtio-net-ccw in addition to the autogenerated device), I get -> -a segfault. gdb points to -> -> -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> -config=0x55d6ad9e3f80 "RT") at -> -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> -(backtrace doesn't go further) -> -> -Starting qemu with no additional "-device virtio-net-ccw" (i.e., only -> -the autogenerated virtio-net-ccw device is present) works. Specifying -> -several "-device virtio-net-pci" works as well. -> -> -Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net -> -client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config") -> -works (in-between state does not compile). -Ouch. I didn't test all in-between states :( -But I wish we had a 0-day instrastructure like kernel has, -that catches things like that. 
- -> -This is reproducible with tcg as well. Same problem both with -> ---enable-vhost-vdpa and --disable-vhost-vdpa. -> -> -Have not yet tried to figure out what might be special with -> -virtio-ccw... anyone have an idea? -> -> -[This should probably be considered a blocker?] - -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S. Tsirkin" <mst@redhat.com> wrote: - -> -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> -> When I start qemu with a second virtio-net-ccw device (i.e. adding -> -> -device virtio-net-ccw in addition to the autogenerated device), I get -> -> a segfault. gdb points to -> -> -> -> #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> -> config=0x55d6ad9e3f80 "RT") at -> -> /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -> 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> -> -> (backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). - -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems. The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. - -> -> -> -> Starting qemu with no additional "-device virtio-net-ccw" (i.e., only -> -> the autogenerated virtio-net-ccw device is present) works. Specifying -> -> several "-device virtio-net-pci" works as well. 
-> -> -> -> Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net -> -> client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config") -> -> works (in-between state does not compile). -> -> -Ouch. I didn't test all in-between states :( -> -But I wish we had a 0-day instrastructure like kernel has, -> -that catches things like that. -Yep, that would be useful... so patchew only builds the complete series? - -> -> -> This is reproducible with tcg as well. Same problem both with -> -> --enable-vhost-vdpa and --disable-vhost-vdpa. -> -> -> -> Have not yet tried to figure out what might be special with -> -> virtio-ccw... anyone have an idea? -> -> -> -> [This should probably be considered a blocker?] -I think so, as it makes s390x unusable with more that one -virtio-net-ccw device, and I don't even see a workaround. - -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> -On Fri, 24 Jul 2020 09:30:58 -0400 -> -"Michael S. Tsirkin" <mst@redhat.com> wrote: -> -> -> On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> -> > When I start qemu with a second virtio-net-ccw device (i.e. adding -> -> > -device virtio-net-ccw in addition to the autogenerated device), I get -> -> > a segfault. gdb points to -> -> > -> -> > #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> -> > config=0x55d6ad9e3f80 "RT") at -> -> > /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -> > 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> > -> -> > (backtrace doesn't go further) -> -> -The core was incomplete, but running under gdb directly shows that it -> -is just a bog-standard config space access (first for that device). -> -> -The cause of the crash is that nc->peer is not set... no idea how that -> -can happen, not that familiar with that part of QEMU. (Should the code -> -check, or is that really something that should not happen?) 
-> -> -What I don't understand is why it is set correctly for the first, -> -autogenerated virtio-net-ccw device, but not for the second one, and -> -why virtio-net-pci doesn't show these problems. The only difference -> -between -ccw and -pci that comes to my mind here is that config space -> -accesses for ccw are done via an asynchronous operation, so timing -> -might be different. -Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? - -> -> > -> -> > Starting qemu with no additional "-device virtio-net-ccw" (i.e., only -> -> > the autogenerated virtio-net-ccw device is present) works. Specifying -> -> > several "-device virtio-net-pci" works as well. -> -> > -> -> > Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net -> -> > client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config") -> -> > works (in-between state does not compile). -> -> -> -> Ouch. I didn't test all in-between states :( -> -> But I wish we had a 0-day instrastructure like kernel has, -> -> that catches things like that. -> -> -Yep, that would be useful... so patchew only builds the complete series? -> -> -> -> -> > This is reproducible with tcg as well. Same problem both with -> -> > --enable-vhost-vdpa and --disable-vhost-vdpa. -> -> > -> -> > Have not yet tried to figure out what might be special with -> -> > virtio-ccw... anyone have an idea? -> -> > -> -> > [This should probably be considered a blocker?] -> -> -I think so, as it makes s390x unusable with more that one -> -virtio-net-ccw device, and I don't even see a workaround. - -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S. Tsirkin" <mst@redhat.com> wrote: - -> -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> -> On Fri, 24 Jul 2020 09:30:58 -0400 -> -> "Michael S. 
Tsirkin" <mst@redhat.com> wrote: -> -> -> -> > On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> -> > > When I start qemu with a second virtio-net-ccw device (i.e. adding -> -> > > -device virtio-net-ccw in addition to the autogenerated device), I get -> -> > > a segfault. gdb points to -> -> > > -> -> > > #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> -> > > config=0x55d6ad9e3f80 "RT") at -> -> > > /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -> > > 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> > > -> -> > > (backtrace doesn't go further) -> -> -> -> The core was incomplete, but running under gdb directly shows that it -> -> is just a bog-standard config space access (first for that device). -> -> -> -> The cause of the crash is that nc->peer is not set... no idea how that -> -> can happen, not that familiar with that part of QEMU. (Should the code -> -> check, or is that really something that should not happen?) -> -> -> -> What I don't understand is why it is set correctly for the first, -> -> autogenerated virtio-net-ccw device, but not for the second one, and -> -> why virtio-net-pci doesn't show these problems. The only difference -> -> between -ccw and -pci that comes to my mind here is that config space -> -> accesses for ccw are done via an asynchronous operation, so timing -> -> might be different. -> -> -Hopefully Jason has an idea. Could you post a full command line -> -please? Do you need a working guest to trigger this? Does this trigger -> -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. 
I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 - --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. - -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) - -> -> -> > > -> -> > > Starting qemu with no additional "-device virtio-net-ccw" (i.e., only -> -> > > the autogenerated virtio-net-ccw device is present) works. Specifying -> -> > > several "-device virtio-net-pci" works as well. -> -> > > -> -> > > Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net -> -> > > client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config") -> -> > > works (in-between state does not compile). -> -> > -> -> > Ouch. I didn't test all in-between states :( -> -> > But I wish we had a 0-day infrastructure like the kernel has, -> -> > that catches things like that. -> -> -> -> Yep, that would be useful... so patchew only builds the complete series? -> -> -> -> > -> -> > > This is reproducible with tcg as well. Same problem both with -> -> > > --enable-vhost-vdpa and --disable-vhost-vdpa. -> -> > > -> -> > > Have not yet tried to figure out what might be special with -> -> > > virtio-ccw... anyone have an idea? -> -> > > -> -> > > [This should probably be considered a blocker?] -> -> -> -> I think so, as it makes s390x unusable with more than one -> -> virtio-net-ccw device, and I don't even see a workaround. -> - -On 2020/7/24 11:34 PM, Cornelia Huck wrote: -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S.
Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). - -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems. The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. -Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. 
I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. - -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) -It looks to me we forgot to check the existence of the peer. - -Please try the attached patch to see if it works. - -Thanks -0001-virtio-net-check-the-existence-of-peer-before-accesi.patch -Description: -Text Data - -On Sat, 25 Jul 2020 08:40:07 +0800 -Jason Wang <jasowang@redhat.com> wrote: - -> -On 2020/7/24 11:34 PM, Cornelia Huck wrote: -> -> On Fri, 24 Jul 2020 11:17:57 -0400 -> -> "Michael S. Tsirkin"<mst@redhat.com> wrote: -> -> -> ->> On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> ->>> On Fri, 24 Jul 2020 09:30:58 -0400 -> ->>> "Michael S. Tsirkin"<mst@redhat.com> wrote: -> ->>> -> ->>>> On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> ->>>>> When I start qemu with a second virtio-net-ccw device (i.e. adding -> ->>>>> -device virtio-net-ccw in addition to the autogenerated device), I get -> ->>>>> a segfault.
gdb points to -> ->>>>> -> ->>>>> #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> ->>>>> config=0x55d6ad9e3f80 "RT") at -> ->>>>> /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> ->>>>> 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> ->>>>> -> ->>>>> (backtrace doesn't go further) -> ->>> The core was incomplete, but running under gdb directly shows that it -> ->>> is just a bog-standard config space access (first for that device). -> ->>> -> ->>> The cause of the crash is that nc->peer is not set... no idea how that -> ->>> can happen, not that familiar with that part of QEMU. (Should the code -> ->>> check, or is that really something that should not happen?) -> ->>> -> ->>> What I don't understand is why it is set correctly for the first, -> ->>> autogenerated virtio-net-ccw device, but not for the second one, and -> ->>> why virtio-net-pci doesn't show these problems. The only difference -> ->>> between -ccw and -pci that comes to my mind here is that config space -> ->>> accesses for ccw are done via an asynchronous operation, so timing -> ->>> might be different. -> ->> Hopefully Jason has an idea. Could you post a full command line -> ->> please? Do you need a working guest to trigger this? Does this trigger -> ->> on an x86 host? -> -> Yes, it does trigger with tcg-on-x86 as well. I've been using -> -> -> -> s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu -> -> qemu,zpci=on -> -> -m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 -> -> -drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 -> -> -device -> -> scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -> -> -device virtio-net-ccw -> -> -> -> It seems it needs the guest actually doing something with the nics; I -> -> cannot reproduce the crash if I use the old advent calendar moon buggy -> -> image and just add a virtio-net-ccw device. 
-> -> -> -> (I don't think it's a problem with my local build, as I see the problem -> -> both on my laptop and on an LPAR.) -> -> -> -It looks to me we forgot to check the existence of the peer. -> -> -Please try the attached patch to see if it works. -Thanks, that patch gets my guest up and running again. So, FWIW, - -Tested-by: Cornelia Huck <cohuck@redhat.com> - -Any idea why this did not hit with virtio-net-pci (or the autogenerated -virtio-net-ccw device)? - -On 2020/7/27 2:43 PM, Cornelia Huck wrote: -On Sat, 25 Jul 2020 08:40:07 +0800 -Jason Wang <jasowang@redhat.com> wrote: -On 2020/7/24 11:34 PM, Cornelia Huck wrote: -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). - -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems.
The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. -Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. - -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) -It looks to me we forgot to check the existence of the peer. - -Please try the attached patch to see if it works. -Thanks, that patch gets my guest up and running again. So, FWIW, - -Tested-by: Cornelia Huck <cohuck@redhat.com> - -Any idea why this did not hit with virtio-net-pci (or the autogenerated -virtio-net-ccw device)? -It can be hit with virtio-net-pci as well (just start without peer). -For the autogenerated virtio-net-ccw, I think the reason is that it has -already had a peer set. -Thanks - -On Mon, 27 Jul 2020 15:38:12 +0800 -Jason Wang <jasowang@redhat.com> wrote: - -> -On 2020/7/27 2:43 PM, Cornelia Huck wrote: -> -> On Sat, 25 Jul 2020 08:40:07 +0800 -> -> Jason Wang <jasowang@redhat.com> wrote: -> -> -> ->> On 2020/7/24 11:34 PM, Cornelia Huck wrote: -> ->>> On Fri, 24 Jul 2020 11:17:57 -0400 -> ->>> "Michael S.
Tsirkin"<mst@redhat.com> wrote: -> ->>> -> ->>>> On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> ->>>>> On Fri, 24 Jul 2020 09:30:58 -0400 -> ->>>>> "Michael S. Tsirkin"<mst@redhat.com> wrote: -> ->>>>> -> ->>>>>> On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> ->>>>>>> When I start qemu with a second virtio-net-ccw device (i.e. adding -> ->>>>>>> -device virtio-net-ccw in addition to the autogenerated device), I get -> ->>>>>>> a segfault. gdb points to -> ->>>>>>> -> ->>>>>>> #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, -> ->>>>>>> config=0x55d6ad9e3f80 "RT") at -> ->>>>>>> /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> ->>>>>>> 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { -> ->>>>>>> -> ->>>>>>> (backtrace doesn't go further) -> ->>>>> The core was incomplete, but running under gdb directly shows that it -> ->>>>> is just a bog-standard config space access (first for that device). -> ->>>>> -> ->>>>> The cause of the crash is that nc->peer is not set... no idea how that -> ->>>>> can happen, not that familiar with that part of QEMU. (Should the code -> ->>>>> check, or is that really something that should not happen?) -> ->>>>> -> ->>>>> What I don't understand is why it is set correctly for the first, -> ->>>>> autogenerated virtio-net-ccw device, but not for the second one, and -> ->>>>> why virtio-net-pci doesn't show these problems. The only difference -> ->>>>> between -ccw and -pci that comes to my mind here is that config space -> ->>>>> accesses for ccw are done via an asynchronous operation, so timing -> ->>>>> might be different. -> ->>>> Hopefully Jason has an idea. Could you post a full command line -> ->>>> please? Do you need a working guest to trigger this? Does this trigger -> ->>>> on an x86 host? -> ->>> Yes, it does trigger with tcg-on-x86 as well. 
I've been using -> ->>> -> ->>> s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu -> ->>> qemu,zpci=on -> ->>> -m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 -> ->>> -drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 -> ->>> -device -> ->>> scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -> ->>> -device virtio-net-ccw -> ->>> -> ->>> It seems it needs the guest actually doing something with the nics; I -> ->>> cannot reproduce the crash if I use the old advent calendar moon buggy -> ->>> image and just add a virtio-net-ccw device. -> ->>> -> ->>> (I don't think it's a problem with my local build, as I see the problem -> ->>> both on my laptop and on an LPAR.) -> ->> -> ->> It looks to me we forgot to check the existence of the peer. -> ->> -> ->> Please try the attached patch to see if it works. -> -> Thanks, that patch gets my guest up and running again. So, FWIW, -> -> -> -> Tested-by: Cornelia Huck <cohuck@redhat.com> -> -> -> -> Any idea why this did not hit with virtio-net-pci (or the autogenerated -> -> virtio-net-ccw device)? -> -> -> -It can be hit with virtio-net-pci as well (just start without peer). -Hm, I had not been able to reproduce the crash with a 'naked' -device -virtio-net-pci. But checking seems to be the right idea anyway. - -> -> -For the autogenerated virtio-net-ccw, I think the reason is that it has -> -already had a peer set. -Ok, that might well be. - -On 2020/7/27 4:41 PM, Cornelia Huck wrote: -On Mon, 27 Jul 2020 15:38:12 +0800 -Jason Wang <jasowang@redhat.com> wrote: -On 2020/7/27 2:43 PM, Cornelia Huck wrote: -On Sat, 25 Jul 2020 08:40:07 +0800 -Jason Wang <jasowang@redhat.com> wrote: -On 2020/7/24 11:34 PM, Cornelia Huck wrote: -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S.
Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). - -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems. The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. -Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. 
- -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) -It looks to me we forgot to check the existence of the peer. - -Please try the attached patch to see if it works. -Thanks, that patch gets my guest up and running again. So, FWIW, - -Tested-by: Cornelia Huck <cohuck@redhat.com> - -Any idea why this did not hit with virtio-net-pci (or the autogenerated -virtio-net-ccw device)? -It can be hit with virtio-net-pci as well (just start without peer). -Hm, I had not been able to reproduce the crash with a 'naked' -device -virtio-net-pci. But checking seems to be the right idea anyway. -Sorry for being unclear; I meant for the networking part, you just need -to start without a peer, and you need a real guest (any Linux) that is trying -to access the config space of virtio-net. -Thanks -For the autogenerated virtio-net-ccw, I think the reason is that it has -already had a peer set. -Ok, that might well be. - -On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote: -> -> -On 2020/7/27 4:41 PM, Cornelia Huck wrote: -> -> On Mon, 27 Jul 2020 15:38:12 +0800 -> -> Jason Wang <jasowang@redhat.com> wrote: -> -> -> -> > On 2020/7/27 2:43 PM, Cornelia Huck wrote: -> -> > > On Sat, 25 Jul 2020 08:40:07 +0800 -> -> > > Jason Wang <jasowang@redhat.com> wrote: -> -> > > > On 2020/7/24 11:34 PM, Cornelia Huck wrote: -> -> > > > > On Fri, 24 Jul 2020 11:17:57 -0400 -> -> > > > > "Michael S. Tsirkin"<mst@redhat.com> wrote: -> -> > > > > > On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> -> > > > > > > On Fri, 24 Jul 2020 09:30:58 -0400 -> -> > > > > > > "Michael S. Tsirkin"<mst@redhat.com> wrote: -> -> > > > > > > > On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -> -> > > > > > > > > When I start qemu with a second virtio-net-ccw device (i.e.
-> -> > > > > > > > > adding -> -> > > > > > > > > -device virtio-net-ccw in addition to the autogenerated -> -> > > > > > > > > device), I get -> -> > > > > > > > > a segfault. gdb points to -> -> > > > > > > > > -> -> > > > > > > > > #0 0x000055d6ab52681d in virtio_net_get_config -> -> > > > > > > > > (vdev=<optimized out>, -> -> > > > > > > > > config=0x55d6ad9e3f80 "RT") at -> -> > > > > > > > > /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -> > > > > > > > > 146 if (nc->peer->info->type == -> -> > > > > > > > > NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> > > > > > > > > -> -> > > > > > > > > (backtrace doesn't go further) -> -> > > > > > > The core was incomplete, but running under gdb directly shows -> -> > > > > > > that it -> -> > > > > > > is just a bog-standard config space access (first for that -> -> > > > > > > device). -> -> > > > > > > -> -> > > > > > > The cause of the crash is that nc->peer is not set... no idea -> -> > > > > > > how that -> -> > > > > > > can happen, not that familiar with that part of QEMU. (Should -> -> > > > > > > the code -> -> > > > > > > check, or is that really something that should not happen?) -> -> > > > > > > -> -> > > > > > > What I don't understand is why it is set correctly for the -> -> > > > > > > first, -> -> > > > > > > autogenerated virtio-net-ccw device, but not for the second -> -> > > > > > > one, and -> -> > > > > > > why virtio-net-pci doesn't show these problems. The only -> -> > > > > > > difference -> -> > > > > > > between -ccw and -pci that comes to my mind here is that config -> -> > > > > > > space -> -> > > > > > > accesses for ccw are done via an asynchronous operation, so -> -> > > > > > > timing -> -> > > > > > > might be different. -> -> > > > > > Hopefully Jason has an idea. Could you post a full command line -> -> > > > > > please? Do you need a working guest to trigger this? Does this -> -> > > > > > trigger -> -> > > > > > on an x86 host? 
-> -> > > > > Yes, it does trigger with tcg-on-x86 as well. I've been using -> -> > > > > -> -> > > > > s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu -> -> > > > > qemu,zpci=on -> -> > > > > -m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 -> -> > > > > -drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 -> -> > > > > -device -> -> > > > > scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -> -> > > > > -device virtio-net-ccw -> -> > > > > -> -> > > > > It seems it needs the guest actually doing something with the nics; -> -> > > > > I -> -> > > > > cannot reproduce the crash if I use the old advent calendar moon -> -> > > > > buggy -> -> > > > > image and just add a virtio-net-ccw device. -> -> > > > > -> -> > > > > (I don't think it's a problem with my local build, as I see the -> -> > > > > problem -> -> > > > > both on my laptop and on an LPAR.) -> -> > > > It looks to me we forget the check the existence of peer. -> -> > > > -> -> > > > Please try the attached patch to see if it works. -> -> > > Thanks, that patch gets my guest up and running again. So, FWIW, -> -> > > -> -> > > Tested-by: Cornelia Huck <cohuck@redhat.com> -> -> > > -> -> > > Any idea why this did not hit with virtio-net-pci (or the autogenerated -> -> > > virtio-net-ccw device)? -> -> > -> -> > It can be hit with virtio-net-pci as well (just start without peer). -> -> Hm, I had not been able to reproduce the crash with a 'naked' -device -> -> virtio-net-pci. But checking seems to be the right idea anyway. -> -> -> -Sorry for being unclear, I meant for networking part, you just need start -> -without peer, and you need a real guest (any Linux) that is trying to access -> -the config space of virtio-net. -> -> -Thanks -A pxe guest will do it, but that doesn't support ccw, right? - -I'm still unclear why this triggers with ccw but not pci - -any idea? 
- -> -> -> -> -> > For autogenerated virtio-net-cww, I think the reason is that it has -> -> > already had a peer set. -> -> Ok, that might well be. -> -> -> -> - -On 2020/7/27 ä¸å7:43, Michael S. Tsirkin wrote: -On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote: -On 2020/7/27 ä¸å4:41, Cornelia Huck wrote: -On Mon, 27 Jul 2020 15:38:12 +0800 -Jason Wang<jasowang@redhat.com> wrote: -On 2020/7/27 ä¸å2:43, Cornelia Huck wrote: -On Sat, 25 Jul 2020 08:40:07 +0800 -Jason Wang<jasowang@redhat.com> wrote: -On 2020/7/24 ä¸å11:34, Cornelia Huck wrote: -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). - -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems. The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. 
-Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. - -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) -It looks to me we forget the check the existence of peer. - -Please try the attached patch to see if it works. -Thanks, that patch gets my guest up and running again. So, FWIW, - -Tested-by: Cornelia Huck<cohuck@redhat.com> - -Any idea why this did not hit with virtio-net-pci (or the autogenerated -virtio-net-ccw device)? -It can be hit with virtio-net-pci as well (just start without peer). -Hm, I had not been able to reproduce the crash with a 'naked' -device -virtio-net-pci. But checking seems to be the right idea anyway. -Sorry for being unclear, I meant for networking part, you just need start -without peer, and you need a real guest (any Linux) that is trying to access -the config space of virtio-net. - -Thanks -A pxe guest will do it, but that doesn't support ccw, right? -Yes, it depends on the cli actually. -I'm still unclear why this triggers with ccw but not pci - -any idea? -I don't test pxe but I can reproduce this with pci (just start a linux -guest without a peer). -Thanks - -On Mon, Jul 27, 2020 at 08:44:09PM +0800, Jason Wang wrote: -> -> -On 2020/7/27 ä¸å7:43, Michael S. 
Tsirkin wrote: -> -> On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote: -> -> > On 2020/7/27 ä¸å4:41, Cornelia Huck wrote: -> -> > > On Mon, 27 Jul 2020 15:38:12 +0800 -> -> > > Jason Wang<jasowang@redhat.com> wrote: -> -> > > -> -> > > > On 2020/7/27 ä¸å2:43, Cornelia Huck wrote: -> -> > > > > On Sat, 25 Jul 2020 08:40:07 +0800 -> -> > > > > Jason Wang<jasowang@redhat.com> wrote: -> -> > > > > > On 2020/7/24 ä¸å11:34, Cornelia Huck wrote: -> -> > > > > > > On Fri, 24 Jul 2020 11:17:57 -0400 -> -> > > > > > > "Michael S. Tsirkin"<mst@redhat.com> wrote: -> -> > > > > > > > On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -> -> > > > > > > > > On Fri, 24 Jul 2020 09:30:58 -0400 -> -> > > > > > > > > "Michael S. Tsirkin"<mst@redhat.com> wrote: -> -> > > > > > > > > > On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck -> -> > > > > > > > > > wrote: -> -> > > > > > > > > > > When I start qemu with a second virtio-net-ccw device -> -> > > > > > > > > > > (i.e. adding -> -> > > > > > > > > > > -device virtio-net-ccw in addition to the autogenerated -> -> > > > > > > > > > > device), I get -> -> > > > > > > > > > > a segfault. gdb points to -> -> > > > > > > > > > > -> -> > > > > > > > > > > #0 0x000055d6ab52681d in virtio_net_get_config -> -> > > > > > > > > > > (vdev=<optimized out>, -> -> > > > > > > > > > > config=0x55d6ad9e3f80 "RT") at -> -> > > > > > > > > > > /home/cohuck/git/qemu/hw/net/virtio-net.c:146 -> -> > > > > > > > > > > 146 if (nc->peer->info->type == -> -> > > > > > > > > > > NET_CLIENT_DRIVER_VHOST_VDPA) { -> -> > > > > > > > > > > -> -> > > > > > > > > > > (backtrace doesn't go further) -> -> > > > > > > > > The core was incomplete, but running under gdb directly -> -> > > > > > > > > shows that it -> -> > > > > > > > > is just a bog-standard config space access (first for that -> -> > > > > > > > > device). -> -> > > > > > > > > -> -> > > > > > > > > The cause of the crash is that nc->peer is not set... 
no -> -> > > > > > > > > idea how that -> -> > > > > > > > > can happen, not that familiar with that part of QEMU. -> -> > > > > > > > > (Should the code -> -> > > > > > > > > check, or is that really something that should not happen?) -> -> > > > > > > > > -> -> > > > > > > > > What I don't understand is why it is set correctly for the -> -> > > > > > > > > first, -> -> > > > > > > > > autogenerated virtio-net-ccw device, but not for the second -> -> > > > > > > > > one, and -> -> > > > > > > > > why virtio-net-pci doesn't show these problems. The only -> -> > > > > > > > > difference -> -> > > > > > > > > between -ccw and -pci that comes to my mind here is that -> -> > > > > > > > > config space -> -> > > > > > > > > accesses for ccw are done via an asynchronous operation, so -> -> > > > > > > > > timing -> -> > > > > > > > > might be different. -> -> > > > > > > > Hopefully Jason has an idea. Could you post a full command -> -> > > > > > > > line -> -> > > > > > > > please? Do you need a working guest to trigger this? Does -> -> > > > > > > > this trigger -> -> > > > > > > > on an x86 host? -> -> > > > > > > Yes, it does trigger with tcg-on-x86 as well. 
I've been using -> -> > > > > > > -> -> > > > > > > s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -> -> > > > > > > -cpu qemu,zpci=on -> -> > > > > > > -m 1024 -nographic -device -> -> > > > > > > virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 -> -> > > > > > > -drive -> -> > > > > > > file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 -> -> > > > > > > -device -> -> > > > > > > scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -> -> > > > > > > -device virtio-net-ccw -> -> > > > > > > -> -> > > > > > > It seems it needs the guest actually doing something with the -> -> > > > > > > nics; I -> -> > > > > > > cannot reproduce the crash if I use the old advent calendar -> -> > > > > > > moon buggy -> -> > > > > > > image and just add a virtio-net-ccw device. -> -> > > > > > > -> -> > > > > > > (I don't think it's a problem with my local build, as I see the -> -> > > > > > > problem -> -> > > > > > > both on my laptop and on an LPAR.) -> -> > > > > > It looks to me we forget the check the existence of peer. -> -> > > > > > -> -> > > > > > Please try the attached patch to see if it works. -> -> > > > > Thanks, that patch gets my guest up and running again. So, FWIW, -> -> > > > > -> -> > > > > Tested-by: Cornelia Huck<cohuck@redhat.com> -> -> > > > > -> -> > > > > Any idea why this did not hit with virtio-net-pci (or the -> -> > > > > autogenerated -> -> > > > > virtio-net-ccw device)? -> -> > > > It can be hit with virtio-net-pci as well (just start without peer). -> -> > > Hm, I had not been able to reproduce the crash with a 'naked' -device -> -> > > virtio-net-pci. But checking seems to be the right idea anyway. -> -> > Sorry for being unclear, I meant for networking part, you just need start -> -> > without peer, and you need a real guest (any Linux) that is trying to -> -> > access -> -> > the config space of virtio-net. 
-> -> > -> -> > Thanks -> -> A pxe guest will do it, but that doesn't support ccw, right? -> -> -> -Yes, it depends on the cli actually. -> -> -> -> -> -> I'm still unclear why this triggers with ccw but not pci - -> -> any idea? -> -> -> -I don't test pxe but I can reproduce this with pci (just start a linux guest -> -without a peer). -> -> -Thanks -> -Might be a good addition to a unit test. Not sure what would the -test do exactly: just make sure guest runs? Looks like a lot of work -for an empty test ... maybe we can poke at the guest config with -qtest commands at least. - --- -MST - -On 2020/7/27 ä¸å9:16, Michael S. Tsirkin wrote: -On Mon, Jul 27, 2020 at 08:44:09PM +0800, Jason Wang wrote: -On 2020/7/27 ä¸å7:43, Michael S. Tsirkin wrote: -On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote: -On 2020/7/27 ä¸å4:41, Cornelia Huck wrote: -On Mon, 27 Jul 2020 15:38:12 +0800 -Jason Wang<jasowang@redhat.com> wrote: -On 2020/7/27 ä¸å2:43, Cornelia Huck wrote: -On Sat, 25 Jul 2020 08:40:07 +0800 -Jason Wang<jasowang@redhat.com> wrote: -On 2020/7/24 ä¸å11:34, Cornelia Huck wrote: -On Fri, 24 Jul 2020 11:17:57 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote: -On Fri, 24 Jul 2020 09:30:58 -0400 -"Michael S. Tsirkin"<mst@redhat.com> wrote: -On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote: -When I start qemu with a second virtio-net-ccw device (i.e. adding --device virtio-net-ccw in addition to the autogenerated device), I get -a segfault. gdb points to - -#0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>, - config=0x55d6ad9e3f80 "RT") at -/home/cohuck/git/qemu/hw/net/virtio-net.c:146 -146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { - -(backtrace doesn't go further) -The core was incomplete, but running under gdb directly shows that it -is just a bog-standard config space access (first for that device). 
- -The cause of the crash is that nc->peer is not set... no idea how that -can happen, not that familiar with that part of QEMU. (Should the code -check, or is that really something that should not happen?) - -What I don't understand is why it is set correctly for the first, -autogenerated virtio-net-ccw device, but not for the second one, and -why virtio-net-pci doesn't show these problems. The only difference -between -ccw and -pci that comes to my mind here is that config space -accesses for ccw are done via an asynchronous operation, so timing -might be different. -Hopefully Jason has an idea. Could you post a full command line -please? Do you need a working guest to trigger this? Does this trigger -on an x86 host? -Yes, it does trigger with tcg-on-x86 as well. I've been using - -s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on --m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001 --drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0 --device -scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 --device virtio-net-ccw - -It seems it needs the guest actually doing something with the nics; I -cannot reproduce the crash if I use the old advent calendar moon buggy -image and just add a virtio-net-ccw device. - -(I don't think it's a problem with my local build, as I see the problem -both on my laptop and on an LPAR.) -It looks to me we forget the check the existence of peer. - -Please try the attached patch to see if it works. -Thanks, that patch gets my guest up and running again. So, FWIW, - -Tested-by: Cornelia Huck<cohuck@redhat.com> - -Any idea why this did not hit with virtio-net-pci (or the autogenerated -virtio-net-ccw device)? -It can be hit with virtio-net-pci as well (just start without peer). -Hm, I had not been able to reproduce the crash with a 'naked' -device -virtio-net-pci. But checking seems to be the right idea anyway. 
-Sorry for being unclear, I meant for networking part, you just need start -without peer, and you need a real guest (any Linux) that is trying to access -the config space of virtio-net. - -Thanks -A pxe guest will do it, but that doesn't support ccw, right? -Yes, it depends on the cli actually. -I'm still unclear why this triggers with ccw but not pci - -any idea? -I don't test pxe but I can reproduce this with pci (just start a linux guest -without a peer). - -Thanks -Might be a good addition to a unit test. Not sure what would the -test do exactly: just make sure guest runs? Looks like a lot of work -for an empty test ... maybe we can poke at the guest config with -qtest commands at least. -That should work or we can simply extend the exist virtio-net qtest to -do that. -Thanks - diff --git a/results/classifier/02/mistranslation/80615920 b/results/classifier/02/mistranslation/80615920 deleted file mode 100644 index ec3473768..000000000 --- a/results/classifier/02/mistranslation/80615920 +++ /dev/null @@ -1,349 +0,0 @@ -mistranslation: 0.800 -other: 0.786 -instruction: 0.751 -boot: 0.750 -semantic: 0.737 - -[BUG] accel/tcg: cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu) - -It seems there is a bug in SIGALRM handling when 486 system emulates x86_64 -code. 
-
-This code:
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <pthread.h>
-#include <signal.h>
-#include <unistd.h>
-
-pthread_t thread1, thread2;
-
-// Signal handler for SIGALRM
-void alarm_handler(int sig) {
-    // Do nothing, just wake up the other thread
-}
-
-// Thread 1 function
-void* thread1_func(void* arg) {
-    // Set up the signal handler for SIGALRM
-    signal(SIGALRM, alarm_handler);
-
-    // Wait for 1 second
-    sleep(1);
-
-    // Send SIGALRM signal to thread 2
-    pthread_kill(thread2, SIGALRM);
-
-    return NULL;
-}
-
-// Thread 2 function
-void* thread2_func(void* arg) {
-    // Wait for the SIGALRM signal
-    pause();
-
-    printf("Thread 2 woke up!\n");
-
-    return NULL;
-}
-
-int main() {
-    // Create thread 1
-    if (pthread_create(&thread1, NULL, thread1_func, NULL) != 0) {
-        fprintf(stderr, "Failed to create thread 1\n");
-        return 1;
-    }
-
-    // Create thread 2
-    if (pthread_create(&thread2, NULL, thread2_func, NULL) != 0) {
-        fprintf(stderr, "Failed to create thread 2\n");
-        return 1;
-    }
-
-    // Wait for both threads to finish
-    pthread_join(thread1, NULL);
-    pthread_join(thread2, NULL);
-
-    return 0;
-}
-
-
-Fails with this -strace log (there are also unsupported syscalls 334 and 435,
-but it seems they don't affect the code much):
-
-...
-736 rt_sigaction(SIGALRM,0x000000001123ec20,0x000000001123ecc0) = 0
-736 clock_nanosleep(CLOCK_REALTIME,0,{tv_sec = 1,tv_nsec = 0},{tv_sec = 1,tv_nsec = 0})
-736 rt_sigprocmask(SIG_BLOCK,0x00000000109fad20,0x0000000010800b38,8) = 0
-736 Unknown syscall 435
-736 clone(CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|
- ...
-736 rt_sigprocmask(SIG_SETMASK,0x0000000010800b38,NULL,8) -736 set_robust_list(0x11a419a0,0) = -1 errno=38 (Function not implemented) -736 rt_sigprocmask(SIG_SETMASK,0x0000000011a41fb0,NULL,8) = 0 - = 0 -736 pause(0,0,2,277186368,0,295966400) -736 -futex(0x000000001123f990,FUTEX_CLOCK_REALTIME|FUTEX_WAIT_BITSET,738,NULL,NULL,0) - = 0 -736 rt_sigprocmask(SIG_BLOCK,0x00000000109fad20,0x000000001123ee88,8) = 0 -736 getpid() = 736 -736 tgkill(736,739,SIGALRM) = 0 - = -1 errno=4 (Interrupted system call) ---- SIGALRM {si_signo=SIGALRM, si_code=SI_TKILL, si_pid=736, si_uid=0} --- -0x48874a != 0x3c69e10 -736 rt_sigprocmask(SIG_SETMASK,0x000000001123ee88,NULL,8) = 0 -** -ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: -(cpu == current_cpu) -Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -failed: (cpu == current_cpu) -0x48874a != 0x3c69e10 -** -ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: -(cpu == current_cpu) -Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -failed: (cpu == current_cpu) -# - -The code fails either with or without -singlestep, the command line: - -/usr/bin/qemu-x86_64 -L /opt/x86_64 -strace -singlestep /opt/x86_64/alarm.bin - -Source code of QEMU 8.1.1 was modified with patch "[PATCH] qemu/timer: Don't -use RDTSC on i486" [1], -with added few ioctls (not relevant) and cpu_exec_longjmp_cleanup() now prints -current pointers of -cpu and current_cpu (line "0x48874a != 0x3c69e10"). 
- -config.log (built as a part of buildroot, basically the minimal possible -configuration for running x86_64 on 486): - -# Configured with: -'/mnt/hd_8tb_p1/p1/home/crossgen/buildroot_486_2/output/build/qemu-8.1.1/configure' - '--prefix=/usr' -'--cross-prefix=/mnt/hd_8tb_p1/p1/home/crossgen/buildroot_486_2/output/host/bin/i486-buildroot-linux-gnu-' - '--audio-drv-list=' -'--python=/mnt/hd_8tb_p1/p1/home/crossgen/buildroot_486_2/output/host/bin/python3' - -'--ninja=/mnt/hd_8tb_p1/p1/home/crossgen/buildroot_486_2/output/host/bin/ninja' -'--disable-alsa' '--disable-bpf' '--disable-brlapi' '--disable-bsd-user' -'--disable-cap-ng' '--disable-capstone' '--disable-containers' -'--disable-coreaudio' '--disable-curl' '--disable-curses' -'--disable-dbus-display' '--disable-docs' '--disable-dsound' '--disable-hvf' -'--disable-jack' '--disable-libiscsi' '--disable-linux-aio' -'--disable-linux-io-uring' '--disable-malloc-trim' '--disable-membarrier' -'--disable-mpath' '--disable-netmap' '--disable-opengl' '--disable-oss' -'--disable-pa' '--disable-rbd' '--disable-sanitizers' '--disable-selinux' -'--disable-sparse' '--disable-strip' '--disable-vde' '--disable-vhost-crypto' -'--disable-vhost-user-blk-server' '--disable-virtfs' '--disable-whpx' -'--disable-xen' '--disable-attr' '--disable-kvm' '--disable-vhost-net' -'--disable-download' '--disable-hexagon-idef-parser' '--disable-system' -'--enable-linux-user' '--target-list=x86_64-linux-user' '--disable-vhost-user' -'--disable-slirp' '--disable-sdl' '--disable-fdt' '--enable-trace-backends=nop' -'--disable-tools' '--disable-guest-agent' '--disable-fuse' -'--disable-fuse-lseek' '--disable-seccomp' '--disable-libssh' -'--disable-libusb' '--disable-vnc' '--disable-nettle' '--disable-numa' -'--disable-pipewire' '--disable-spice' '--disable-usb-redir' -'--disable-install-blobs' - -Emulation of the same x86_64 code with qemu 6.2.0 installed on another x86_64 -native machine works fine. 
- -[1] -https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg05387.html -Best regards, -Petr - -On Sat, 25 Nov 2023 at 13:09, Petr Cvek <petrcvekcz@gmail.com> wrote: -> -> -It seems there is a bug in SIGALRM handling when 486 system emulates x86_64 -> -code. -486 host is pretty well out of support currently. Can you reproduce -this on a less ancient host CPU type ? - -> -ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: -> -(cpu == current_cpu) -> -Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: -> -assertion failed: (cpu == current_cpu) -> -0x48874a != 0x3c69e10 -> -** -> -ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: -> -(cpu == current_cpu) -> -Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: -> -assertion failed: (cpu == current_cpu) -What compiler version do you build QEMU with? That -assert is there because we have seen some buggy compilers -in the past which don't correctly preserve the variable -value as the setjmp/longjmp spec requires them to. - -thanks --- PMM - -Dne 27. 11. 23 v 10:37 Peter Maydell napsal(a): -> -On Sat, 25 Nov 2023 at 13:09, Petr Cvek <petrcvekcz@gmail.com> wrote: -> -> -> -> It seems there is a bug in SIGALRM handling when 486 system emulates x86_64 -> -> code. -> -> -486 host is pretty well out of support currently. Can you reproduce -> -this on a less ancient host CPU type ? -> -It seems it only fails when the code is compiled for i486. QEMU built with the -same compiler with -march=i586 and above runs on the same physical hardware -without a problem. All -march= variants were executed on ryzen 3600. - -> -> ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -> -> failed: (cpu == current_cpu) -> -> Bail out! 
ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: -> -> assertion failed: (cpu == current_cpu) -> -> 0x48874a != 0x3c69e10 -> -> ** -> -> ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -> -> failed: (cpu == current_cpu) -> -> Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: -> -> assertion failed: (cpu == current_cpu) -> -> -What compiler version do you build QEMU with? That -> -assert is there because we have seen some buggy compilers -> -in the past which don't correctly preserve the variable -> -value as the setjmp/longjmp spec requires them to. -> -i486 and i586+ code variants were compiled with GCC 13.2.0 (more exactly, -slackware64 current multilib distribution). - -i486 binary which runs on the real 486 is also GCC 13.2.0 and installed as a -part of the buildroot crosscompiler (about two week old git snapshot). - -> -thanks -> --- PMM -best regards, -Petr - -On 11/25/23 07:08, Petr Cvek wrote: -ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: -(cpu == current_cpu) -Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -failed: (cpu == current_cpu) -# - -The code fails either with or without -singlestep, the command line: - -/usr/bin/qemu-x86_64 -L /opt/x86_64 -strace -singlestep /opt/x86_64/alarm.bin - -Source code of QEMU 8.1.1 was modified with patch "[PATCH] qemu/timer: Don't use -RDTSC on i486" [1], -with added few ioctls (not relevant) and cpu_exec_longjmp_cleanup() now prints -current pointers of -cpu and current_cpu (line "0x48874a != 0x3c69e10"). -If you try this again with 8.2-rc2, you should not see an assertion failure. -You should see instead - -QEMU internal SIGILL {code=ILLOPC, addr=0x12345678} -which I think more accurately summarizes the situation of attempting RDTSC on hardware -that does not support it. -r~ - -Dne 29. 11. 
23 v 15:25 Richard Henderson napsal(a): -> -On 11/25/23 07:08, Petr Cvek wrote: -> -> ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion -> -> failed: (cpu == current_cpu) -> -> Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: -> -> assertion failed: (cpu == current_cpu) -> -> # -> -> -> -> The code fails either with or without -singlestep, the command line: -> -> -> -> /usr/bin/qemu-x86_64 -L /opt/x86_64 -strace -singlestep -> -> /opt/x86_64/alarm.bin -> -> -> -> Source code of QEMU 8.1.1 was modified with patch "[PATCH] qemu/timer: Don't -> -> use RDTSC on i486" [1], -> -> with added few ioctls (not relevant) and cpu_exec_longjmp_cleanup() now -> -> prints current pointers of -> -> cpu and current_cpu (line "0x48874a != 0x3c69e10"). -> -> -> -If you try this again with 8.2-rc2, you should not see an assertion failure. -> -You should see instead -> -> -QEMU internal SIGILL {code=ILLOPC, addr=0x12345678} -> -> -which I think more accurately summarizes the situation of attempting RDTSC on -> -hardware that does not support it. -> -> -Compilation of vanilla qemu v8.2.0-rc2 with -march=i486 by GCC 13.2.0 and -running the resulting binary on ryzen still leads to: - -** -ERROR:../accel/tcg/cpu-exec.c:533:cpu_exec_longjmp_cleanup: assertion failed: -(cpu == current_cpu) -Bail out! 
ERROR:../accel/tcg/cpu-exec.c:533:cpu_exec_longjmp_cleanup: assertion -failed: (cpu == current_cpu) -Aborted - -> -> -r~ -Petr - diff --git a/results/classifier/02/other/02364653 b/results/classifier/02/other/02364653 deleted file mode 100644 index 00b205295..000000000 --- a/results/classifier/02/other/02364653 +++ /dev/null @@ -1,364 +0,0 @@ -other: 0.956 -semantic: 0.942 -instruction: 0.927 -boot: 0.925 -mistranslation: 0.912 - -[Qemu-devel] [BUG] Inappropriate size of target_sigset_t - -Hello, Peter, Laurent, - -While working on another problem yesterday, I think I discovered a -long-standing bug in QEMU Linux user mode: our target_sigset_t structure is -eight times smaller as it should be! - -In this code segment from syscalls_def.h: - -#ifdef TARGET_MIPS -#define TARGET_NSIG 128 -#else -#define TARGET_NSIG 64 -#endif -#define TARGET_NSIG_BPW TARGET_ABI_BITS -#define TARGET_NSIG_WORDS (TARGET_NSIG / TARGET_NSIG_BPW) - -typedef struct { - abi_ulong sig[TARGET_NSIG_WORDS]; -} target_sigset_t; - -... TARGET_ABI_BITS should be replaced by eight times smaller constant (in -fact, semantically, we need TARGET_ABI_BYTES, but it is not defined) (what is -needed is actually "a byte per signal" in target_sigset_t, and we allow "a bit -per signal"). - -All this probably sounds to you like something impossible, since this code is -in QEMU "since forever", but I checked everything, and the bug seems real. I -wish you can prove me wrong. - -I just wanted to let you know about this, given the sensitive timing of current -softfreeze, and the fact that I won't be able to do more investigation on this -in coming weeks, since I am busy with other tasks, but perhaps you can analyze -and do something which you consider appropriate. 
- -Yours, -Aleksandar - -Le 03/07/2019 à 21:46, Aleksandar Markovic a écrit : -> -Hello, Peter, Laurent, -> -> -While working on another problem yesterday, I think I discovered a -> -long-standing bug in QEMU Linux user mode: our target_sigset_t structure is -> -eight times smaller as it should be! -> -> -In this code segment from syscalls_def.h: -> -> -#ifdef TARGET_MIPS -> -#define TARGET_NSIG 128 -> -#else -> -#define TARGET_NSIG 64 -> -#endif -> -#define TARGET_NSIG_BPW TARGET_ABI_BITS -> -#define TARGET_NSIG_WORDS (TARGET_NSIG / TARGET_NSIG_BPW) -> -> -typedef struct { -> -abi_ulong sig[TARGET_NSIG_WORDS]; -> -} target_sigset_t; -> -> -... TARGET_ABI_BITS should be replaced by eight times smaller constant (in -> -fact, semantically, we need TARGET_ABI_BYTES, but it is not defined) (what is -> -needed is actually "a byte per signal" in target_sigset_t, and we allow "a -> -bit per signal"). -TARGET_NSIG is divided by TARGET_ABI_BITS which gives you the number of -abi_ulong words we need in target_sigset_t. - -> -All this probably sounds to you like something impossible, since this code is -> -in QEMU "since forever", but I checked everything, and the bug seems real. I -> -wish you can prove me wrong. -> -> -I just wanted to let you know about this, given the sensitive timing of -> -current softfreeze, and the fact that I won't be able to do more -> -investigation on this in coming weeks, since I am busy with other tasks, but -> -perhaps you can analyze and do something which you consider appropriate. -If I compare with kernel, it looks good: - -In Linux: - - arch/mips/include/uapi/asm/signal.h - - #define _NSIG 128 - #define _NSIG_BPW (sizeof(unsigned long) * 8) - #define _NSIG_WORDS (_NSIG / _NSIG_BPW) - - typedef struct { - unsigned long sig[_NSIG_WORDS]; - } sigset_t; - -_NSIG_BPW is 8 * 8 = 64 on MIPS64 or 4 * 8 = 32 on MIPS - -In QEMU: - -TARGET_NSIG_BPW is TARGET_ABI_BITS which is TARGET_LONG_BITS which is -64 on MIPS64 and 32 on MIPS. 
- -I think there is no problem. - -Thanks, -Laurent - -From: Laurent Vivier <address@hidden> -> -If I compare with kernel, it looks good: -> -... -> -I think there is no problem. -Sure, thanks for such fast response - again, I am glad if you are right. -However, for some reason, glibc (and musl too) define sigset_t differently than -kernel. Please take a look. I am not sure if this is covered fine in our code. - -Yours, -Aleksandar - -> -Thanks, -> -Laurent - -On Wed, 3 Jul 2019 at 21:20, Aleksandar Markovic <address@hidden> wrote: -> -> -From: Laurent Vivier <address@hidden> -> -> If I compare with kernel, it looks good: -> -> ... -> -> I think there is no problem. -> -> -Sure, thanks for such fast response - again, I am glad if you are right. -> -However, for some reason, glibc (and musl too) define sigset_t differently -> -than kernel. Please take a look. I am not sure if this is covered fine in our -> -code. -Yeah, the libc definitions of sigset_t don't match the -kernel ones (this is for obscure historical reasons IIRC). -We're providing implementations of the target -syscall interface, so our target_sigset_t should be the -target kernel's version (and the target libc's version doesn't -matter to us). On the other hand we will be using the -host libc version, I think, so a little caution is required -and it's possible we have some bugs in our code. - -thanks --- PMM - -> -From: Peter Maydell <address@hidden> -> -> -On Wed, 3 Jul 2019 at 21:20, Aleksandar Markovic <address@hidden> wrote: -> -> -> -> From: Laurent Vivier <address@hidden> -> -> > If I compare with kernel, it looks good: -> -> > ... -> -> > I think there is no problem. -> -> -> -> Sure, thanks for such fast response - again, I am glad if you are right. -> -> However, for some reason, glibc (and musl too) define sigset_t differently -> -> than kernel. Please take a look. I am not sure if this is covered fine in -> -> our code. 
-> -> -Yeah, the libc definitions of sigset_t don't match the -> -kernel ones (this is for obscure historical reasons IIRC). -> -We're providing implementations of the target -> -syscall interface, so our target_sigset_t should be the -> -target kernel's version (and the target libc's version doesn't -> -matter to us). On the other hand we will be using the -> -host libc version, I think, so a little caution is required -> -and it's possible we have some bugs in our code. -OK, I gather than this is not something that requires our immediate attention -(for 4.1), but we can analyze it later on. - -Thanks for response!! - -Sincerely, -Aleksandar - -> -thanks -> --- PMM - -Le 03/07/2019 à 22:28, Peter Maydell a écrit : -> -On Wed, 3 Jul 2019 at 21:20, Aleksandar Markovic <address@hidden> wrote: -> -> -> -> From: Laurent Vivier <address@hidden> -> ->> If I compare with kernel, it looks good: -> ->> ... -> ->> I think there is no problem. -> -> -> -> Sure, thanks for such fast response - again, I am glad if you are right. -> -> However, for some reason, glibc (and musl too) define sigset_t differently -> -> than kernel. Please take a look. I am not sure if this is covered fine in -> -> our code. -> -> -Yeah, the libc definitions of sigset_t don't match the -> -kernel ones (this is for obscure historical reasons IIRC). -> -We're providing implementations of the target -> -syscall interface, so our target_sigset_t should be the -> -target kernel's version (and the target libc's version doesn't -> -matter to us). On the other hand we will be using the -> -host libc version, I think, so a little caution is required -> -and it's possible we have some bugs in our code. -It's why we need host_to_target_sigset_internal() and -target_to_host_sigset_internal() that translates bits and bytes between -guest kernel interface and host libc interface. 
- -void host_to_target_sigset_internal(target_sigset_t *d, - const sigset_t *s) -{ - int i; - target_sigemptyset(d); - for (i = 1; i <= TARGET_NSIG; i++) { - if (sigismember(s, i)) { - target_sigaddset(d, host_to_target_signal(i)); - } - } -} - -void target_to_host_sigset_internal(sigset_t *d, - const target_sigset_t *s) -{ - int i; - sigemptyset(d); - for (i = 1; i <= TARGET_NSIG; i++) { - if (target_sigismember(s, i)) { - sigaddset(d, target_to_host_signal(i)); - } - } -} - -Thanks, -Laurent - -Hi Aleksandar, - -On Wed, Jul 3, 2019 at 12:48 PM Aleksandar Markovic -<address@hidden> wrote: -> -#define TARGET_NSIG_BPW TARGET_ABI_BITS -> -#define TARGET_NSIG_WORDS (TARGET_NSIG / TARGET_NSIG_BPW) -> -> -typedef struct { -> -abi_ulong sig[TARGET_NSIG_WORDS]; -> -} target_sigset_t; -> -> -... TARGET_ABI_BITS should be replaced by eight times smaller constant (in -> -fact, -> -semantically, we need TARGET_ABI_BYTES, but it is not defined) (what is needed -> -is actually "a byte per signal" in target_sigset_t, and we allow "a bit per -> -signal"). -Why do we need a byte per target signal, if the functions in linux-user/signal.c -operate with bits? - --- -Thanks. --- Max - -> -Why do we need a byte per target signal, if the functions in -> -linux-user/signal.c -> -operate with bits? -Max, - -I did not base my findings on code analysis, but on dumping size/offsets of -elements of some structures, as they are emulated in QEMU, and in real systems. -So, I can't really answer your question. - -Yours, -Aleksandar - -> --- -> -Thanks. -> --- Max - diff --git a/results/classifier/02/other/02572177 b/results/classifier/02/other/02572177 deleted file mode 100644 index f21e4bbc0..000000000 --- a/results/classifier/02/other/02572177 +++ /dev/null @@ -1,422 +0,0 @@ -other: 0.869 -instruction: 0.794 -semantic: 0.770 -mistranslation: 0.693 -boot: 0.658 - -[Qemu-devel] 答复: Re: [BUG]COLO failover hang - -hi. - - -I test the git qemu master have the same problem. 
-
-(gdb) bt
-#0  qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
-#1  0x00007f658e4aa0c2 in qio_channel_read (address@hidden, address@hidden "", address@hidden, address@hidden) at io/channel.c:114
-#2  0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>, buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at migration/qemu-file-channel.c:78
-#3  0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at migration/qemu-file.c:295
-#4  0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, address@hidden) at migration/qemu-file.c:555
-#5  0x00007f658e3ea34b in qemu_get_byte (address@hidden) at migration/qemu-file.c:568
-#6  0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at migration/qemu-file.c:648
-#7  0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, address@hidden) at migration/colo.c:244
-#8  0x00007f658e3e681e in colo_receive_check_message (f=<optimized out>, address@hidden, address@hidden) at migration/colo.c:264
-#9  0x00007f658e3e740e in colo_process_incoming_thread (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577
-#10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
-#11 0x00007f65881983ed in clone () from /lib64/libc.so.6
-
-(gdb) p ioc->name
-$2 = 0x7f658ff7d5c0 "migration-socket-incoming"
-(gdb) p ioc->features    (does not support QIO_CHANNEL_FEATURE_SHUTDOWN)
-$3 = 0
-
-(gdb) bt
-#0  socket_accept_incoming_migration (ioc=0x7fdcceeafa90, condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
-#1  0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at gmain.c:3054
-#2  g_main_context_dispatch (context=<optimized out>, address@hidden) at gmain.c:3630
-#3  0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
-#4  os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:258
-#5  main_loop_wait (address@hidden) at util/main-loop.c:506
-#6  0x00007fdccb526187 in main_loop () at vl.c:1898
-#7  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4709
-
-(gdb) p ioc->features
-$1 = 6
-(gdb) p ioc->name
-$2 = 0x7fdcce1b1ab0 "migration-socket-listener"
-
-Maybe socket_accept_incoming_migration should call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)??
-
-thank you.
-
-
-Original mail
-
-From: address@hidden
-To: wangguang 10165992 address@hidden
-Cc: address@hidden address@hidden
-Date: 2017-03-16 14:46
-Subject: Re: [Qemu-devel] COLO failover hang
-
-On 03/15/2017 05:06 PM, wangguang wrote:
-> I am testing QEMU COLO feature described here [QEMU
-> Wiki](http://wiki.qemu-project.org/Features/COLO).
->
-> When the Primary Node panic,the Secondary Node qemu hang.
-> hang at recvmsg in qio_channel_socket_readv.
-> And I run { 'execute': 'nbd-server-stop' } and { "execute":
-> "x-colo-lost-heartbeat" } in Secondary VM's
-> monitor,the Secondary Node qemu still hang at recvmsg .
->
-> I found that the colo in qemu is not complete yet.
-> Do the colo have any plan for development?
-
-Yes, We are developing. You can see some of patch we pushing.
-
-> Has anyone ever run it successfully? Any help is appreciated!
-
-In our internal version can run it successfully,
-The failover detail you can ask Zhanghailiang for help.
-Next time if you have some question about COLO,
-please cc me and zhanghailiang address@hidden
-
-
-Thanks
-Zhang Chen
-
-
->
->
->
-> centos7.2+qemu2.7.50
-> (gdb) bt
-> #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0
-> #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized out>,
-> iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, errp=0x0) at
-> io/channel-socket.c:497
-> #2 0x00007f3e03329472 in qio_channel_read (address@hidden,
-> address@hidden "", address@hidden,
-> address@hidden) at io/channel.c:97
-> #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>,
-> buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at
-> migration/qemu-file-channel.c:78
-> #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at
-> migration/qemu-file.c:257
-> #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden,
-> address@hidden) at migration/qemu-file.c:510
-> #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at
-> migration/qemu-file.c:523
-> #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at
-> migration/qemu-file.c:603
-> #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00,
-> address@hidden) at migration/colo.c:215
-> #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48,
-> checkpoint_request=<synthetic pointer>, f=<optimized out>) at
-> migration/colo.c:546
-> #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at
-> migration/colo.c:649
-> #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0
-> #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6
->
->
->
->
->
-> --
-> View this message in context: http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> Sent from the Developer mailing list archive at Nabble.com.
->
->
->
->
-
---
-Thanks
-Zhang Chen
-
-Hi,Wang.
You can test this branch:
-https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
-and please follow the wiki to ensure your own configuration is correct.
-http://wiki.qemu-project.org/Features/COLO
-Thanks
-
-Zhang Chen
-
-
-On 03/21/2017 03:27 PM, address@hidden wrote:
-hi.
-
-I test the git qemu master have the same problem.
-
-(gdb) bt
-#0  qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
-#1  0x00007f658e4aa0c2 in qio_channel_read (address@hidden, address@hidden "", address@hidden, address@hidden) at io/channel.c:114
-#2  0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>, buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at migration/qemu-file-channel.c:78
-#3  0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at migration/qemu-file.c:295
-#4  0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, address@hidden) at migration/qemu-file.c:555
-#5  0x00007f658e3ea34b in qemu_get_byte (address@hidden) at migration/qemu-file.c:568
-#6  0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at migration/qemu-file.c:648
-#7  0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, address@hidden) at migration/colo.c:244
-#8  0x00007f658e3e681e in colo_receive_check_message (f=<optimized out>, address@hidden, address@hidden) at migration/colo.c:264
-#9  0x00007f658e3e740e in colo_process_incoming_thread (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577
-#10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
-#11 0x00007f65881983ed in clone () from /lib64/libc.so.6
-
-(gdb) p ioc->name
-$2 = 0x7f658ff7d5c0 "migration-socket-incoming"
-(gdb) p ioc->features    (does not support QIO_CHANNEL_FEATURE_SHUTDOWN)
-$3 = 0
-
-(gdb) bt
-#0  socket_accept_incoming_migration (ioc=0x7fdcceeafa90, condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
-#1  0x00007fdcc6966350 in
g_main_dispatch (context=<optimized out>) at
-gmain.c:3054
-#2 g_main_context_dispatch (context=<optimized out>,
-address@hidden) at gmain.c:3630
-#3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
-#4 os_host_main_loop_wait (timeout=<optimized out>) at
-util/main-loop.c:258
-#5 main_loop_wait (address@hidden) at
-util/main-loop.c:506
-#6 0x00007fdccb526187 in main_loop () at vl.c:1898
-#7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
-out>) at vl.c:4709
-(gdb) p ioc->features
-
-$1 = 6
-
-(gdb) p ioc->name
-
-$2 = 0x7fdcce1b1ab0 "migration-socket-listener"
-Maybe socket_accept_incoming_migration should
-call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?
-Thank you.
-
-
-
-
-
-Original Message
-address@hidden;
-*To:* Wang Guang 10165992; address@hidden;
-address@hidden; address@hidden;
-*Date:* 2017-03-16 14:46
-*Subject:* *Re: [Qemu-devel] COLO failover hang*
-
-
-
-
-On 03/15/2017 05:06 PM, wangguang wrote:
-> I am testing the QEMU COLO feature described here [QEMU
-> Wiki](
-http://wiki.qemu-project.org/Features/COLO
-).
->
-> When the Primary Node panics, the Secondary Node's qemu hangs,
-> hanging at recvmsg in qio_channel_socket_readv.
-> And when I run { 'execute': 'nbd-server-stop' } and { "execute":
-> "x-colo-lost-heartbeat" } in the Secondary VM's
-> monitor, the Secondary Node's qemu still hangs at recvmsg.
->
-> I found that COLO in qemu is not complete yet.
-> Does COLO have any development plan?
-
-Yes, we are developing it. You can see some of the patches we are pushing.
-
-> Has anyone ever run it successfully? Any help is appreciated!
-
-Our internal version can run it successfully.
-For failover details, you can ask Zhanghailiang for help.
->
->
->
->
-
---
-Thanks
-Zhang Chen
---
-Thanks
-Zhang Chen
-
diff --git a/results/classifier/02/other/04472277 b/results/classifier/02/other/04472277
deleted file mode 100644
index d6400525a..000000000
--- a/results/classifier/02/other/04472277
+++ /dev/null
@@ -1,577 +0,0 @@
-other: 0.846
-instruction: 0.845
-boot: 0.831
-mistranslation: 0.817
-semantic: 0.815
-
-[BUG][KVM_SET_USER_MEMORY_REGION] KVM_SET_USER_MEMORY_REGION failed
-
-Hi all,
-I started a VM in OpenStack (OpenStack uses libvirt to start the QEMU VM), but the log now shows this ERROR.
-Does anyone know about this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log
-```
-2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-14 10:09:18.198+0000: shutting down, reason=crashed
-```
-The xml file
-```
-root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml
-<!--
-WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
-OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
- virsh edit instance-0000000e
-or other application using the libvirt API.
---> -<domain type='kvm'> - <name>instance-0000000e</name> - <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid> - <metadata> -  <nova:instance xmlns:nova=" -http://openstack.org/xmlns/libvirt/nova/1.1 -"> -   <nova:package version="25.1.0"/> -   <nova:name>provider-instance</nova:name> -   <nova:creationTime>2023-03-14 10:09:13</nova:creationTime> -   <nova:flavor name="cirros-os-dpu-test-1"> -    <nova:memory>64</nova:memory> -    <nova:disk>1</nova:disk> -    <nova:swap>0</nova:swap> -    <nova:ephemeral>0</nova:ephemeral> -    <nova:vcpus>1</nova:vcpus> -   </nova:flavor> -   <nova:owner> -    <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user> -    <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project> -   </nova:owner> -   <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/> -   <nova:ports> -    <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340"> -     <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/> -    </nova:port> -   </nova:ports> -  </nova:instance> - </metadata> - <memory unit='KiB'>65536</memory> - <currentMemory unit='KiB'>65536</currentMemory> - <vcpu placement='static'>1</vcpu> - <sysinfo type='smbios'> -  <system> -   <entry name='manufacturer'>OpenStack Foundation</entry> -   <entry name='product'>OpenStack Nova</entry> -   <entry name='version'>25.1.0</entry> -   <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='family'>Virtual Machine</entry> -  </system> - </sysinfo> - <os> -  <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type> -  <boot dev='hd'/> -  <smbios mode='sysinfo'/> - </os> - <features> -  <acpi/> -  <apic/> -  <vmcoreinfo state='on'/> - </features> - <cpu mode='host-model' check='partial'> -  <topology sockets='1' dies='1' cores='1' threads='1'/> - </cpu> - <clock offset='utc'> -  <timer name='pit' tickpolicy='delay'/> -  <timer name='rtc' tickpolicy='catchup'/> -  
<timer name='hpet' present='no'/> - </clock> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> -  <emulator>/usr/bin/qemu-system-x86_64</emulator> -  <disk type='file' device='disk'> -   <driver name='qemu' type='qcow2' cache='none'/> -   <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/> -   <target dev='vda' bus='virtio'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> -  </disk> -  <controller type='usb' index='0' model='piix3-uhci'> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> -  </controller> -  <controller type='pci' index='0' model='pci-root'/> -  <interface type='hostdev' managed='yes'> -   <mac address='fa:16:3e:aa:d9:23'/> -   <source> -    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/> -   </source> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> -  </interface> -  <serial type='pty'> -   <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/> -   <target type='isa-serial' port='0'> -    <model name='isa-serial'/> -   </target> -  </serial> -  <console type='pty'> -   <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/> -   <target type='serial' port='0'/> -  </console> -  <input type='tablet' bus='usb'> -   <address type='usb' bus='0' port='1'/> -  </input> -  <input type='mouse' bus='ps2'/> -  <input type='keyboard' bus='ps2'/> -  <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'> -   <listen type='address' address='0.0.0.0'/> -  </graphics> -  <audio id='1' type='none'/> -  <video> -   <model type='virtio' heads='1' primary='yes'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> -  </video> -  <hostdev mode='subsystem' type='pci' managed='yes'> -   <source> -    <address domain='0x0000' bus='0x01' 
slot='0x00' function='0x6'/>
-   </source>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
-  </hostdev>
-  <memballoon model='virtio'>
-   <stats period='10'/>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-  </memballoon>
-  <rng model='virtio'>
-   <backend model='random'>/dev/urandom</backend>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
-  </rng>
- </devices>
-</domain>
-```
-----
-Simon Jones
-
-This happened on Ubuntu 22.04.
-QEMU was installed via apt like this:
-apt install -y qemu qemu-kvm qemu-system
-and the QEMU version is 6.2.0
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Tue, Mar 21, 2023 at 08:40:
-Hi all,
-I started a VM in OpenStack (OpenStack uses libvirt to start the QEMU VM), but the log now shows this ERROR.
-Does anyone know about this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log
-```
-2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-14 10:09:18.198+0000: shutting down, reason=crashed
-```
-The xml file
-```
-root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml
-<!--
-WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
-OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
- virsh edit instance-0000000e
-or other application using the libvirt API.
---> -<domain type='kvm'> - <name>instance-0000000e</name> - <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid> - <metadata> -  <nova:instance xmlns:nova=" -http://openstack.org/xmlns/libvirt/nova/1.1 -"> -   <nova:package version="25.1.0"/> -   <nova:name>provider-instance</nova:name> -   <nova:creationTime>2023-03-14 10:09:13</nova:creationTime> -   <nova:flavor name="cirros-os-dpu-test-1"> -    <nova:memory>64</nova:memory> -    <nova:disk>1</nova:disk> -    <nova:swap>0</nova:swap> -    <nova:ephemeral>0</nova:ephemeral> -    <nova:vcpus>1</nova:vcpus> -   </nova:flavor> -   <nova:owner> -    <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user> -    <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project> -   </nova:owner> -   <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/> -   <nova:ports> -    <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340"> -     <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/> -    </nova:port> -   </nova:ports> -  </nova:instance> - </metadata> - <memory unit='KiB'>65536</memory> - <currentMemory unit='KiB'>65536</currentMemory> - <vcpu placement='static'>1</vcpu> - <sysinfo type='smbios'> -  <system> -   <entry name='manufacturer'>OpenStack Foundation</entry> -   <entry name='product'>OpenStack Nova</entry> -   <entry name='version'>25.1.0</entry> -   <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='family'>Virtual Machine</entry> -  </system> - </sysinfo> - <os> -  <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type> -  <boot dev='hd'/> -  <smbios mode='sysinfo'/> - </os> - <features> -  <acpi/> -  <apic/> -  <vmcoreinfo state='on'/> - </features> - <cpu mode='host-model' check='partial'> -  <topology sockets='1' dies='1' cores='1' threads='1'/> - </cpu> - <clock offset='utc'> -  <timer name='pit' tickpolicy='delay'/> -  <timer name='rtc' tickpolicy='catchup'/> -  
<timer name='hpet' present='no'/> - </clock> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> -  <emulator>/usr/bin/qemu-system-x86_64</emulator> -  <disk type='file' device='disk'> -   <driver name='qemu' type='qcow2' cache='none'/> -   <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/> -   <target dev='vda' bus='virtio'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> -  </disk> -  <controller type='usb' index='0' model='piix3-uhci'> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> -  </controller> -  <controller type='pci' index='0' model='pci-root'/> -  <interface type='hostdev' managed='yes'> -   <mac address='fa:16:3e:aa:d9:23'/> -   <source> -    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/> -   </source> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> -  </interface> -  <serial type='pty'> -   <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/> -   <target type='isa-serial' port='0'> -    <model name='isa-serial'/> -   </target> -  </serial> -  <console type='pty'> -   <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/> -   <target type='serial' port='0'/> -  </console> -  <input type='tablet' bus='usb'> -   <address type='usb' bus='0' port='1'/> -  </input> -  <input type='mouse' bus='ps2'/> -  <input type='keyboard' bus='ps2'/> -  <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'> -   <listen type='address' address='0.0.0.0'/> -  </graphics> -  <audio id='1' type='none'/> -  <video> -   <model type='virtio' heads='1' primary='yes'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> -  </video> -  <hostdev mode='subsystem' type='pci' managed='yes'> -   <source> -    <address domain='0x0000' bus='0x01' 
slot='0x00' function='0x6'/> -   </source> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> -  </hostdev> -  <memballoon model='virtio'> -   <stats period='10'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> -  </memballoon> -  <rng model='virtio'> -   <backend model='random'>/dev/urandom</backend> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> -  </rng> - </devices> -</domain> -``` ----- -Simon Jones - -This is full ERROR log -2023-03-23 08:00:52.362+0000: starting up libvirt version: 8.0.0, package: 1ubuntu7.4 (Christian Ehrhardt < -christian.ehrhardt@canonical.com -> Tue, 22 Nov 2022 15:59:28 +0100), qemu version: 6.2.0Debian 1:6.2+dfsg-2ubuntu6.6, kernel: 5.19.0-35-generic, hostname: c1c2 -LC_ALL=C \ -PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin \ -HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e \ -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.local/share \ -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.cache \ -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.config \ -/usr/bin/qemu-system-x86_64 \ --name guest=instance-0000000e,debug-threads=on \ --S \ --object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-instance-0000000e/master-key.aes"}' \ --machine pc-i440fx-6.2,usb=off,dump-guest-core=off,memory-backend=pc.ram \ --accel kvm \ --cpu Cooperlake,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,sha-ni=on,umip=on,waitpkg=on,gfni=on,vaes=on,vpclmulqdq=on,rdpid=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,avx-vnni=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,hle=off,rtm=off,avx512f=off,avx512dq=off,avx512cd=off,avx512bw=off,avx512vl=off,avx512vnni=off,avx512-bf16=off,taa-no=off \ --m 64 \ --object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":67108864}' \ --overcommit mem-lock=off \ --smp 
1,sockets=1,dies=1,cores=1,threads=1 \ --uuid ff91d2dc-69a1-43ef-abde-c9e4e9a0305b \ --smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=25.1.0,serial=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,uuid=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,family=Virtual Machine' \ --no-user-config \ --nodefaults \ --chardev socket,id=charmonitor,fd=33,server=on,wait=off \ --mon chardev=charmonitor,id=monitor,mode=control \ --rtc base=utc,driftfix=slew \ --global kvm-pit.lost_tick_policy=delay \ --no-hpet \ --no-shutdown \ --boot strict=on \ --device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ --blockdev '{"driver":"file","filename":"/var/lib/nova/instances/_base/8b58db82a488248e7c5e769599954adaa47a5314","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ --blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \ --blockdev '{"driver":"file","filename":"/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ --blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \ --device virtio-blk-pci,bus=pci.0,addr=0x3,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \ --add-fd set=1,fd=34 \ --chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on \ --device isa-serial,chardev=charserial0,id=serial0 \ --device usb-tablet,id=input0,bus=usb.0,port=1 \ --audiodev '{"id":"audio1","driver":"none"}' \ --vnc -0.0.0.0:0 -,audiodev=audio1 \ --device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x2 \ --device vfio-pci,host=0000:01:00.5,id=hostdev0,bus=pci.0,addr=0x4 \ --device vfio-pci,host=0000:01:00.6,id=hostdev1,bus=pci.0,addr=0x5 \ --device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \ --object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \ --device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 \ --device vmcoreinfo \ --sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ --msg timestamp=on -char device redirected to /dev/pts/3 (label charserial0) -2023-03-23T08:00:53.728550Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument -kvm_set_phys_mem: error registering slot: Invalid argument -2023-03-23 08:00:54.201+0000: shutting down, reason=crashed -2023-03-23 08:54:43.468+0000: starting up libvirt version: 8.0.0, package: 1ubuntu7.4 (Christian Ehrhardt < -christian.ehrhardt@canonical.com -> Tue, 22 Nov 2022 15:59:28 +0100), qemu version: 6.2.0Debian 1:6.2+dfsg-2ubuntu6.6, kernel: 5.19.0-35-generic, hostname: c1c2 -LC_ALL=C \ -PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin \ -HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e \ -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.local/share \ -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.cache \ -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.config \ -/usr/bin/qemu-system-x86_64 \ --name guest=instance-0000000e,debug-threads=on \ --S \ --object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-5-instance-0000000e/master-key.aes"}' \ --machine pc-i440fx-6.2,usb=off,dump-guest-core=off,memory-backend=pc.ram \ --accel kvm \ --cpu 
Cooperlake,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,sha-ni=on,umip=on,waitpkg=on,gfni=on,vaes=on,vpclmulqdq=on,rdpid=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,avx-vnni=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,hle=off,rtm=off,avx512f=off,avx512dq=off,avx512cd=off,avx512bw=off,avx512vl=off,avx512vnni=off,avx512-bf16=off,taa-no=off \ --m 64 \ --object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":67108864}' \ --overcommit mem-lock=off \ --smp 1,sockets=1,dies=1,cores=1,threads=1 \ --uuid ff91d2dc-69a1-43ef-abde-c9e4e9a0305b \ --smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=25.1.0,serial=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,uuid=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,family=Virtual Machine' \ --no-user-config \ --nodefaults \ --chardev socket,id=charmonitor,fd=33,server=on,wait=off \ --mon chardev=charmonitor,id=monitor,mode=control \ --rtc base=utc,driftfix=slew \ --global kvm-pit.lost_tick_policy=delay \ --no-hpet \ --no-shutdown \ --boot strict=on \ --device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ --blockdev '{"driver":"file","filename":"/var/lib/nova/instances/_base/8b58db82a488248e7c5e769599954adaa47a5314","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ --blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \ --blockdev '{"driver":"file","filename":"/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ --blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \ --device virtio-blk-pci,bus=pci.0,addr=0x3,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \ --add-fd 
set=1,fd=34 \
--chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on \
--device isa-serial,chardev=charserial0,id=serial0 \
--device usb-tablet,id=input0,bus=usb.0,port=1 \
--audiodev '{"id":"audio1","driver":"none"}' \
--vnc 0.0.0.0:0,audiodev=audio1 \
--device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x2 \
--device vfio-pci,host=0000:01:00.5,id=hostdev0,bus=pci.0,addr=0x4 \
--device vfio-pci,host=0000:01:00.6,id=hostdev1,bus=pci.0,addr=0x5 \
--device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
--object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
--device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 \
--device vmcoreinfo \
--sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
--msg timestamp=on
-char device redirected to /dev/pts/3 (label charserial0)
-2023-03-23T08:54:44.755039Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-23 08:54:45.230+0000: shutting down, reason=crashed
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Thu, Mar 23, 2023 at 05:49:
-This happened on Ubuntu 22.04.
-QEMU was installed via apt like this:
-apt install -y qemu qemu-kvm qemu-system
-and the QEMU version is 6.2.0
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Tue, Mar 21, 2023 at 08:40:
-Hi all,
-I started a VM in OpenStack (OpenStack uses libvirt to start the QEMU VM), but the log now shows this ERROR.
-Does anyone know about this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log -``` -2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument -kvm_set_phys_mem: error registering slot: Invalid argument -2023-03-14 10:09:18.198+0000: shutting down, reason=crashed -``` -The xml file -``` -root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml -<!-- -WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE -OVERWRITTEN AND LOST. Changes to this xml configuration should be made using: - virsh edit instance-0000000e -or other application using the libvirt API. ---> -<domain type='kvm'> - <name>instance-0000000e</name> - <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid> - <metadata> -  <nova:instance xmlns:nova=" -http://openstack.org/xmlns/libvirt/nova/1.1 -"> -   <nova:package version="25.1.0"/> -   <nova:name>provider-instance</nova:name> -   <nova:creationTime>2023-03-14 10:09:13</nova:creationTime> -   <nova:flavor name="cirros-os-dpu-test-1"> -    <nova:memory>64</nova:memory> -    <nova:disk>1</nova:disk> -    <nova:swap>0</nova:swap> -    <nova:ephemeral>0</nova:ephemeral> -    <nova:vcpus>1</nova:vcpus> -   </nova:flavor> -   <nova:owner> -    <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user> -    <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project> -   </nova:owner> -   <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/> -   <nova:ports> -    <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340"> -     <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/> -    </nova:port> -   </nova:ports> -  </nova:instance> - </metadata> - <memory unit='KiB'>65536</memory> - <currentMemory unit='KiB'>65536</currentMemory> - <vcpu placement='static'>1</vcpu> - <sysinfo type='smbios'> -  <system> -   <entry name='manufacturer'>OpenStack Foundation</entry> -   <entry name='product'>OpenStack 
Nova</entry> -   <entry name='version'>25.1.0</entry> -   <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry> -   <entry name='family'>Virtual Machine</entry> -  </system> - </sysinfo> - <os> -  <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type> -  <boot dev='hd'/> -  <smbios mode='sysinfo'/> - </os> - <features> -  <acpi/> -  <apic/> -  <vmcoreinfo state='on'/> - </features> - <cpu mode='host-model' check='partial'> -  <topology sockets='1' dies='1' cores='1' threads='1'/> - </cpu> - <clock offset='utc'> -  <timer name='pit' tickpolicy='delay'/> -  <timer name='rtc' tickpolicy='catchup'/> -  <timer name='hpet' present='no'/> - </clock> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> -  <emulator>/usr/bin/qemu-system-x86_64</emulator> -  <disk type='file' device='disk'> -   <driver name='qemu' type='qcow2' cache='none'/> -   <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/> -   <target dev='vda' bus='virtio'/> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> -  </disk> -  <controller type='usb' index='0' model='piix3-uhci'> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> -  </controller> -  <controller type='pci' index='0' model='pci-root'/> -  <interface type='hostdev' managed='yes'> -   <mac address='fa:16:3e:aa:d9:23'/> -   <source> -    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/> -   </source> -   <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> -  </interface> -  <serial type='pty'> -   <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/> -   <target type='isa-serial' port='0'> -    <model name='isa-serial'/> -   </target> -  </serial> -  <console type='pty'> -   <log 
file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-   <target type='serial' port='0'/>
-  </console>
-  <input type='tablet' bus='usb'>
-   <address type='usb' bus='0' port='1'/>
-  </input>
-  <input type='mouse' bus='ps2'/>
-  <input type='keyboard' bus='ps2'/>
-  <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
-   <listen type='address' address='0.0.0.0'/>
-  </graphics>
-  <audio id='1' type='none'/>
-  <video>
-   <model type='virtio' heads='1' primary='yes'/>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
-  </video>
-  <hostdev mode='subsystem' type='pci' managed='yes'>
-   <source>
-    <address domain='0x0000' bus='0x01' slot='0x00' function='0x6'/>
-   </source>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
-  </hostdev>
-  <memballoon model='virtio'>
-   <stats period='10'/>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-  </memballoon>
-  <rng model='virtio'>
-   <backend model='random'>/dev/urandom</backend>
-   <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
-  </rng>
- </devices>
-</domain>
-```
-----
-Simon Jones
-
diff --git a/results/classifier/02/other/12869209 b/results/classifier/02/other/12869209
deleted file mode 100644
index c8aa623f4..000000000
--- a/results/classifier/02/other/12869209
+++ /dev/null
@@ -1,89 +0,0 @@
-other: 0.964
-mistranslation: 0.935
-instruction: 0.919
-semantic: 0.891
-boot: 0.837
-
-[BUG FIX][PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization
-
-This is a series fixing a bug in vhost-user-blk.
-Is there any chance for it to be considered for the next rc?
-Thanks!
-Denis
-On 29.03.2021 16:44, Denis Plotnikov wrote:
-ping!
-On 25.03.2021 18:12, Denis Plotnikov wrote:
-v3:
- * 0003: a new patch added fixing the problem on vm shutdown
- I stumbled on this bug after v2 sending.
- * 0001: grammar fixes (Raphael)
- * 0002: commit message fixes (Raphael)
-
-v2:
- * split the initial patch into two (Raphael)
- * rename init to realized (Raphael)
- * remove unrelated comment (Raphael)
-
-When the vhost-user-blk device loses the connection to the daemon during
-the initialization phase, it kills qemu because of the assert in the code.
-The series fixes the bug.
-
-0001 is preparation for the fix
-0002 fixes the bug; the patch description has the full motivation for the series
-0003 (added in v3) fixes a bug on vm shutdown
-
-Denis Plotnikov (3):
- vhost-user-blk: use different event handlers on initialization
- vhost-user-blk: perform immediate cleanup if disconnect on
- initialization
- vhost-user-blk: add immediate cleanup on shutdown
-
- hw/block/vhost-user-blk.c | 79 ++++++++++++++++++++++++---------------
- 1 file changed, 48 insertions(+), 31 deletions(-)
-
-On 01.04.2021 14:21, Denis Plotnikov wrote:
-This is a series fixing a bug in vhost-user-blk.
-More specifically, it's not just a bug but a crasher.
-
-Valentine
-Is there any chance for it to be considered for the next rc?
-
-Thanks!
-
-Denis
-
-On 29.03.2021 16:44, Denis Plotnikov wrote:
-ping!
-
-On 25.03.2021 18:12, Denis Plotnikov wrote:
-v3:
- * 0003: a new patch added fixing the problem on vm shutdown
- I stumbled on this bug after v2 sending.
- -0001 is preparation for the fix -0002 fixes the bug, patch description has the full motivation for the series -0003 (added in v3) fix bug on vm shutdown - -Denis Plotnikov (3): - vhost-user-blk: use different event handlers on initialization - vhost-user-blk: perform immediate cleanup if disconnect on - initialization - vhost-user-blk: add immediate cleanup on shutdown - - hw/block/vhost-user-blk.c | 79 ++++++++++++++++++++++++--------------- - 1 file changed, 48 insertions(+), 31 deletions(-) - diff --git a/results/classifier/02/other/13442371 b/results/classifier/02/other/13442371 deleted file mode 100644 index 491193607..000000000 --- a/results/classifier/02/other/13442371 +++ /dev/null @@ -1,370 +0,0 @@ -other: 0.886 -instruction: 0.861 -mistranslation: 0.859 -semantic: 0.850 -boot: 0.815 - -[Qemu-devel] [BUG] nanoMIPS support problem related to extract2 support for i386 TCG target - -Hello, Richard, Peter, and others. - -As a part of activities before 4.1 release, I tested nanoMIPS support -in QEMU (which was officially fully integrated in 4.0, is currently -limited to system mode only, and was tested in a similar fashion right -prior to 4.0). - -This support appears to be broken now. 
Following command line works in -4.0, but results in kernel panic for the current tip of the tree: - -~/Build/qemu-test-revert-c6fb8c0cf704/mipsel-softmmu/qemu-system-mipsel --cpu I7200 -kernel generic_nano32r6el_page4k -M malta -serial stdio -m -1G -hda nanomips32r6_le_sf_2017.05-03-59-gf5595d6.ext4 -append -"mem=256m@0x0 rw console=ttyS0 vga=cirrus vesa=0x111 root=/dev/sda" - -(kernel and rootfs image files used in this commend line can be -downloaded from the locations mentioned in our user guide) - -The quick bisect points to the commit: - -commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab -Author: Richard Henderson <address@hidden> -Date: Mon Feb 25 11:42:35 2019 -0800 - - tcg/i386: Support INDEX_op_extract2_{i32,i64} - - Signed-off-by: Richard Henderson <address@hidden> - -Please advise on further actions. - -Yours, -Aleksandar - -On Fri, Jul 12, 2019 at 8:09 PM Aleksandar Markovic -<address@hidden> wrote: -> -> -Hello, Richard, Peter, and others. -> -> -As a part of activities before 4.1 release, I tested nanoMIPS support -> -in QEMU (which was officially fully integrated in 4.0, is currently -> -limited to system mode only, and was tested in a similar fashion right -> -prior to 4.0). -> -> -This support appears to be broken now. 
Following command line works in -> -4.0, but results in kernel panic for the current tip of the tree: -> -> -~/Build/qemu-test-revert-c6fb8c0cf704/mipsel-softmmu/qemu-system-mipsel -> --cpu I7200 -kernel generic_nano32r6el_page4k -M malta -serial stdio -m -> -1G -hda nanomips32r6_le_sf_2017.05-03-59-gf5595d6.ext4 -append -> -"mem=256m@0x0 rw console=ttyS0 vga=cirrus vesa=0x111 root=/dev/sda" -> -> -(kernel and rootfs image files used in this commend line can be -> -downloaded from the locations mentioned in our user guide) -> -> -The quick bisect points to the commit: -> -> -commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab -> -Author: Richard Henderson <address@hidden> -> -Date: Mon Feb 25 11:42:35 2019 -0800 -> -> -tcg/i386: Support INDEX_op_extract2_{i32,i64} -> -> -Signed-off-by: Richard Henderson <address@hidden> -> -> -Please advise on further actions. -> -Just to add a data point: - -If the following change is applied: - -diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h -index 928e8b8..b6a4cf2 100644 ---- a/tcg/i386/tcg-target.h -+++ b/tcg/i386/tcg-target.h -@@ -124,7 +124,7 @@ extern bool have_avx2; - #define TCG_TARGET_HAS_deposit_i32 1 - #define TCG_TARGET_HAS_extract_i32 1 - #define TCG_TARGET_HAS_sextract_i32 1 --#define TCG_TARGET_HAS_extract2_i32 1 -+#define TCG_TARGET_HAS_extract2_i32 0 - #define TCG_TARGET_HAS_movcond_i32 1 - #define TCG_TARGET_HAS_add2_i32 1 - #define TCG_TARGET_HAS_sub2_i32 1 -@@ -163,7 +163,7 @@ extern bool have_avx2; - #define TCG_TARGET_HAS_deposit_i64 1 - #define TCG_TARGET_HAS_extract_i64 1 - #define TCG_TARGET_HAS_sextract_i64 0 --#define TCG_TARGET_HAS_extract2_i64 1 -+#define TCG_TARGET_HAS_extract2_i64 0 - #define TCG_TARGET_HAS_movcond_i64 1 - #define TCG_TARGET_HAS_add2_i64 1 - #define TCG_TARGET_HAS_sub2_i64 1 - -... the problem disappears. 
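For readers unfamiliar with the op being toggled off above: `tcg_gen_extract2_i32` extracts a 32-bit field from the concatenation of two 32-bit values, and the deposit fast path later identified in this thread combines it with a rotate. The following is a rough standalone Python model of that semantics, written for illustration only (it is my reading of the op, not QEMU code; `extract2_i32`, `rotl32`, and `deposit_ofs0` are invented helper names):

```python
# Toy model of TCG's extract2_i32 semantics (illustration only, not QEMU
# source): take the 64-bit concatenation hi:lo and extract the 32 bits
# starting at bit offset `ofs`.
MASK32 = 0xFFFFFFFF

def extract2_i32(lo, hi, ofs):
    assert 0 <= ofs <= 32
    return (((hi << 32) | lo) >> ofs) & MASK32

def rotl32(x, n):
    # 32-bit rotate left; n == 0 must be handled to avoid a 32-bit shift.
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32 if n else x

def deposit_ofs0(arg1, arg2, length):
    # Fast path for deposit at offset 0: replace the low `length` bits of
    # arg1 with the low `length` bits of arg2, via extract2 + rotate.
    t = extract2_i32(arg1, arg2, length)  # drop arg1's low bits, pull in arg2's
    return rotl32(t, length)

# Depositing the low 8 bits of arg2 into arg1:
assert deposit_ofs0(0xAABBCCDD, 0x11223344, 8) == 0xAABBCC44
```

This is only meant to make the later discussion of the deposit fast path easier to follow; the actual bug turned out to be in constant folding of the op, as the fix referenced at the end of the thread shows.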
- - -> -Yours, -> -Aleksandar - -On Fri, Jul 12, 2019 at 8:19 PM Aleksandar Markovic -<address@hidden> wrote: -> -> -On Fri, Jul 12, 2019 at 8:09 PM Aleksandar Markovic -> -<address@hidden> wrote: -> -> -> -> Hello, Richard, Peter, and others. -> -> -> -> As a part of activities before 4.1 release, I tested nanoMIPS support -> -> in QEMU (which was officially fully integrated in 4.0, is currently -> -> limited to system mode only, and was tested in a similar fashion right -> -> prior to 4.0). -> -> -> -> This support appears to be broken now. Following command line works in -> -> 4.0, but results in kernel panic for the current tip of the tree: -> -> -> -> ~/Build/qemu-test-revert-c6fb8c0cf704/mipsel-softmmu/qemu-system-mipsel -> -> -cpu I7200 -kernel generic_nano32r6el_page4k -M malta -serial stdio -m -> -> 1G -hda nanomips32r6_le_sf_2017.05-03-59-gf5595d6.ext4 -append -> -> "mem=256m@0x0 rw console=ttyS0 vga=cirrus vesa=0x111 root=/dev/sda" -> -> -> -> (kernel and rootfs image files used in this commend line can be -> -> downloaded from the locations mentioned in our user guide) -> -> -> -> The quick bisect points to the commit: -> -> -> -> commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab -> -> Author: Richard Henderson <address@hidden> -> -> Date: Mon Feb 25 11:42:35 2019 -0800 -> -> -> -> tcg/i386: Support INDEX_op_extract2_{i32,i64} -> -> -> -> Signed-off-by: Richard Henderson <address@hidden> -> -> -> -> Please advise on further actions. 
-> -> -> -> -Just to add a data point: -> -> -If the following change is applied: -> -> -diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h -> -index 928e8b8..b6a4cf2 100644 -> ---- a/tcg/i386/tcg-target.h -> -+++ b/tcg/i386/tcg-target.h -> -@@ -124,7 +124,7 @@ extern bool have_avx2; -> -#define TCG_TARGET_HAS_deposit_i32 1 -> -#define TCG_TARGET_HAS_extract_i32 1 -> -#define TCG_TARGET_HAS_sextract_i32 1 -> --#define TCG_TARGET_HAS_extract2_i32 1 -> -+#define TCG_TARGET_HAS_extract2_i32 0 -> -#define TCG_TARGET_HAS_movcond_i32 1 -> -#define TCG_TARGET_HAS_add2_i32 1 -> -#define TCG_TARGET_HAS_sub2_i32 1 -> -@@ -163,7 +163,7 @@ extern bool have_avx2; -> -#define TCG_TARGET_HAS_deposit_i64 1 -> -#define TCG_TARGET_HAS_extract_i64 1 -> -#define TCG_TARGET_HAS_sextract_i64 0 -> --#define TCG_TARGET_HAS_extract2_i64 1 -> -+#define TCG_TARGET_HAS_extract2_i64 0 -> -#define TCG_TARGET_HAS_movcond_i64 1 -> -#define TCG_TARGET_HAS_add2_i64 1 -> -#define TCG_TARGET_HAS_sub2_i64 1 -> -> -... the problem disappears. -> -It looks the problem is in this code segment in of tcg_gen_deposit_i32(): - - if (ofs == 0) { - tcg_gen_extract2_i32(ret, arg1, arg2, len); - tcg_gen_rotli_i32(ret, ret, len); - goto done; - } - -) - -If that code segment is deleted altogether (which effectively forces -usage of "fallback" part of tcg_gen_deposit_i32()), the problem also -vanishes (without changes from my previous mail). - -> -> -> Yours, -> -> Aleksandar - -Aleksandar Markovic <address@hidden> writes: - -> -Hello, Richard, Peter, and others. -> -> -As a part of activities before 4.1 release, I tested nanoMIPS support -> -in QEMU (which was officially fully integrated in 4.0, is currently -> -limited to system mode only, and was tested in a similar fashion right -> -prior to 4.0). -> -> -This support appears to be broken now. 
Following command line works in -> -4.0, but results in kernel panic for the current tip of the tree: -> -> -~/Build/qemu-test-revert-c6fb8c0cf704/mipsel-softmmu/qemu-system-mipsel -> --cpu I7200 -kernel generic_nano32r6el_page4k -M malta -serial stdio -m -> -1G -hda nanomips32r6_le_sf_2017.05-03-59-gf5595d6.ext4 -append -> -"mem=256m@0x0 rw console=ttyS0 vga=cirrus vesa=0x111 root=/dev/sda" -> -> -(kernel and rootfs image files used in this commend line can be -> -downloaded from the locations mentioned in our user guide) -> -> -The quick bisect points to the commit: -> -> -commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab -> -Author: Richard Henderson <address@hidden> -> -Date: Mon Feb 25 11:42:35 2019 -0800 -> -> -tcg/i386: Support INDEX_op_extract2_{i32,i64} -> -> -Signed-off-by: Richard Henderson <address@hidden> -> -> -Please advise on further actions. -Please see the fix: - - Subject: [PATCH for-4.1] tcg: Fix constant folding of INDEX_op_extract2_i32 - Date: Tue, 9 Jul 2019 14:19:00 +0200 - Message-Id: <address@hidden> - -> -> -Yours, -> -Aleksandar --- -Alex Bennée - -On Sat, Jul 13, 2019 at 9:21 AM Alex Bennée <address@hidden> wrote: -> -> -Please see the fix: -> -> -Subject: [PATCH for-4.1] tcg: Fix constant folding of INDEX_op_extract2_i32 -> -Date: Tue, 9 Jul 2019 14:19:00 +0200 -> -Message-Id: <address@hidden> -> -Thanks, this fixed the behavior. - -Sincerely, -Aleksandar - -> -> -> -> -> Yours, -> -> Aleksandar -> -> -> --- -> -Alex Bennée -> - diff --git a/results/classifier/02/other/14488057 b/results/classifier/02/other/14488057 deleted file mode 100644 index b00b838c4..000000000 --- a/results/classifier/02/other/14488057 +++ /dev/null @@ -1,712 +0,0 @@ -other: 0.922 -instruction: 0.908 -semantic: 0.905 -boot: 0.892 -mistranslation: 0.885 - -[Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching - -This is an issue in QEMU's system emulation for X86 in TCG mode. 
-The issue permits an attacker who can execute code in guest ring 3 -with normal user privileges to inject code into other processes that -are running in guest ring 3, in particular root-owned processes. - -== reproduction steps == - - - Create an x86-64 VM and install Debian Jessie in it. The following - steps should all be executed inside the VM. - - Verify that procmail is installed and the correct version: - address@hidden:~# apt-cache show procmail | egrep 'Version|SHA' - Version: 3.22-24 - SHA1: 54ed2d51db0e76f027f06068ab5371048c13434c - SHA256: 4488cf6975af9134a9b5238d5d70e8be277f70caa45a840dfbefd2dc444bfe7f - - Install build-essential and nasm ("apt install build-essential nasm"). - - Unpack the exploit, compile it and run it: - address@hidden:~$ tar xvf procmail_cache_attack.tar - procmail_cache_attack/ - procmail_cache_attack/shellcode.asm - procmail_cache_attack/xp.c - procmail_cache_attack/compile.sh - procmail_cache_attack/attack.c - address@hidden:~$ cd procmail_cache_attack - address@hidden:~/procmail_cache_attack$ ./compile.sh - address@hidden:~/procmail_cache_attack$ ./attack - memory mappings set up - child is dead, codegen should be complete - executing code as root! :) - address@hidden:~/procmail_cache_attack# id - uid=0(root) gid=0(root) groups=0(root),[...] - -Note: While the exploit depends on the precise version of procmail, -the actual vulnerability is in QEMU, not in procmail. procmail merely -serves as a seldomly-executed setuid root binary into which code can -be injected. - - -== detailed issue description == -QEMU caches translated basic blocks. To look up a translated basic -block, the function tb_find() is used, which uses tb_htable_lookup() -in its slowpath, which in turn compares translated basic blocks -(TranslationBlock) to the lookup information (struct tb_desc) using -tb_cmp(). 
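Abstractly, the lookup just described behaves like a dictionary whose key covers a bounded set of physical pages. The sketch below is a deliberately simplified Python model written for this summary (the `TBCache` class and its key layout are invented; QEMU's real structures differ): the key records only the first and last physical page of a block, so a three-page block collides with a legitimate two-page block that shares the same outer pages.

```python
# Deliberately simplified model of a translated-block cache (invented for
# illustration -- not QEMU's actual data structures).  The key records the
# virtual start address plus only the FIRST and LAST physical page of the
# block, mirroring a "block spans at most two pages" assumption.

class TBCache:
    def __init__(self):
        self._cache = {}

    def lookup_or_translate(self, virt_start, phys_pages):
        key = (virt_start, phys_pages[0], phys_pages[-1])
        if key not in self._cache:
            # "Translation" is just concatenating page contents here.
            self._cache[key] = "|".join(phys_pages)
        return self._cache[key]

cache = TBCache()

# Attacker process: maps (A, EVIL, B) at the victim binary's load address
# and stretches one block across all three pages with an over-long insn.
attacker_tb = cache.lookup_or_translate(0x400000, ["A", "EVIL", "B"])

# Victim setuid process: its legitimate block spans only (A, B) -- but the
# middle page never entered the key, so the lookup hits the stale entry.
victim_tb = cache.lookup_or_translate(0x400000, ["A", "B"])

assert victim_tb == "A|EVIL|B"  # victim gets the attacker-influenced translation
```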
- -tb_cmp() attempts to ensure (among other things) that both the virtual -start address of the basic block and the physical addresses that the -basic block covers match. When checking the physical addresses, it -assumes that a basic block can span at most two pages. - -gen_intermediate_code() attempts to enforce this by stopping the -translation of a basic block if nearly one page of instructions has -been translated already: - - /* if too long translation, stop generation too */ - if (tcg_op_buf_full() || - (pc_ptr - pc_start) >= (TARGET_PAGE_SIZE - 32) || - num_insns >= max_insns) { - gen_jmp_im(pc_ptr - dc->cs_base); - gen_eob(dc); - break; - } - -However, while real X86 processors have a maximum instruction length -of 15 bytes, QEMU's instruction decoder for X86 does not place any -limit on the instruction length or the number of instruction prefixes. -Therefore, it is possible to create an arbitrarily long instruction -by e.g. prepending an arbitrary number of LOCK prefixes to a normal -instruction. This permits creating a basic block that spans three -pages by simply appending an approximately page-sized instruction to -the end of a normal basic block that starts close to the end of a -page. - -Such an overlong basic block causes the basic block caching to fail as -follows: If code is generated and cached for a basic block that spans -the physical pages (A,E,B), this basic block will be returned by -lookups in a process in which the physical pages (A,B,C) are mapped -in the same virtual address range (assuming that all other lookup -parameters match). - -This behavior can be abused by an attacker e.g. as follows: If a -non-relocatable world-readable setuid executable legitimately contains -the pages (A,B,C), an attacker can map (A,E,B) into his own process, -at the normal load address of A, where E is an attacker-controlled -page. 
If a legitimate basic block spans the pages A and B, an attacker -can write arbitrary non-branch instructions at the start of E, then -append an overlong instruction -that ends behind the start of C, yielding a modified basic block that -spans all three pages. If the attacker then executes the modified -basic block in his process, the modified basic block is cached. -Next, the attacker can execute the setuid binary, which will reuse the -cached modified basic block, executing attacker-controlled -instructions in the context of the privileged process. - -I am sending this to qemu-devel because a QEMU security contact -told me that QEMU does not consider privilege escalation inside a -TCG VM to be a security concern. -procmail_cache_attack.tar -Description: -Unix tar archive - -On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote: -> -This is an issue in QEMU's system emulation for X86 in TCG mode. -> -The issue permits an attacker who can execute code in guest ring 3 -> -with normal user privileges to inject code into other processes that -> -are running in guest ring 3, in particular root-owned processes. -> -I am sending this to qemu-devel because a QEMU security contact -> -told me that QEMU does not consider privilege escalation inside a -> -TCG VM to be a security concern. -Correct; it's just a bug. Don't trust TCG QEMU as a security boundary. - -We should really fix the crossing-a-page-boundary code for x86. -I believe we do get it correct for ARM Thumb instructions. - -thanks --- PMM - -On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote: -> -On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote: -> -> This is an issue in QEMU's system emulation for X86 in TCG mode. -> -> The issue permits an attacker who can execute code in guest ring 3 -> -> with normal user privileges to inject code into other processes that -> -> are running in guest ring 3, in particular root-owned processes. 
-> -> -> I am sending this to qemu-devel because a QEMU security contact -> -> told me that QEMU does not consider privilege escalation inside a -> -> TCG VM to be a security concern. -> -> -Correct; it's just a bug. Don't trust TCG QEMU as a security boundary. -> -> -We should really fix the crossing-a-page-boundary code for x86. -> -I believe we do get it correct for ARM Thumb instructions. -How about doing the instruction size check as follows? - -diff --git a/target/i386/translate.c b/target/i386/translate.c -index 72c1b03a2a..94cf3da719 100644 ---- a/target/i386/translate.c -+++ b/target/i386/translate.c -@@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State -*env, DisasContext *s, - default: - goto unknown_op; - } -+ if (s->pc - pc_start > 15) { -+ s->pc = pc_start; -+ goto illegal_op; -+ } - return s->pc; - illegal_op: - gen_illegal_opcode(s); - -Thanks, --- -Pranith - -On 22 March 2017 at 14:55, Pranith Kumar <address@hidden> wrote: -> -On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote: -> -> On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote: -> ->> This is an issue in QEMU's system emulation for X86 in TCG mode. -> ->> The issue permits an attacker who can execute code in guest ring 3 -> ->> with normal user privileges to inject code into other processes that -> ->> are running in guest ring 3, in particular root-owned processes. -> -> -> ->> I am sending this to qemu-devel because a QEMU security contact -> ->> told me that QEMU does not consider privilege escalation inside a -> ->> TCG VM to be a security concern. -> -> -> -> Correct; it's just a bug. Don't trust TCG QEMU as a security boundary. -> -> -> -> We should really fix the crossing-a-page-boundary code for x86. -> -> I believe we do get it correct for ARM Thumb instructions. -> -> -How about doing the instruction size check as follows? 
-> -> -diff --git a/target/i386/translate.c b/target/i386/translate.c -> -index 72c1b03a2a..94cf3da719 100644 -> ---- a/target/i386/translate.c -> -+++ b/target/i386/translate.c -> -@@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State -> -*env, DisasContext *s, -> -default: -> -goto unknown_op; -> -} -> -+ if (s->pc - pc_start > 15) { -> -+ s->pc = pc_start; -> -+ goto illegal_op; -> -+ } -> -return s->pc; -> -illegal_op: -> -gen_illegal_opcode(s); -This doesn't look right because it means we'll check -only after we've emitted all the code to do the -instruction operation, so the effect will be -"execute instruction, then take illegal-opcode -exception". - -We should check what the x86 architecture spec actually -says and implement that. - -thanks --- PMM - -On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell -<address@hidden> wrote: -> -> -> -> How about doing the instruction size check as follows? -> -> -> -> diff --git a/target/i386/translate.c b/target/i386/translate.c -> -> index 72c1b03a2a..94cf3da719 100644 -> -> --- a/target/i386/translate.c -> -> +++ b/target/i386/translate.c -> -> @@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State -> -> *env, DisasContext *s, -> -> default: -> -> goto unknown_op; -> -> } -> -> + if (s->pc - pc_start > 15) { -> -> + s->pc = pc_start; -> -> + goto illegal_op; -> -> + } -> -> return s->pc; -> -> illegal_op: -> -> gen_illegal_opcode(s); -> -> -This doesn't look right because it means we'll check -> -only after we've emitted all the code to do the -> -instruction operation, so the effect will be -> -"execute instruction, then take illegal-opcode -> -exception". -> -The pc is restored to original address (s->pc = pc_start), so the -exception will overwrite the generated illegal instruction and will be -executed first. - -But yes, it's better to follow the architecture manual. 
- -Thanks, --- -Pranith - -On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote: -> -On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell -> -<address@hidden> wrote: -> -> This doesn't look right because it means we'll check -> -> only after we've emitted all the code to do the -> -> instruction operation, so the effect will be -> -> "execute instruction, then take illegal-opcode -> -> exception". -> -The pc is restored to original address (s->pc = pc_start), so the -> -exception will overwrite the generated illegal instruction and will be -> -executed first. -s->pc is the guest PC -- moving that backwards will -not do anything about the generated TCG IR that's -already been written. You'd need to rewind the -write pointer in the IR stream, which there is -no support for doing AFAIK. - -thanks --- PMM - -On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell -<address@hidden> wrote: -> -On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote: -> -> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell -> -> <address@hidden> wrote: -> ->> This doesn't look right because it means we'll check -> ->> only after we've emitted all the code to do the -> ->> instruction operation, so the effect will be -> ->> "execute instruction, then take illegal-opcode -> ->> exception". -> -> -> The pc is restored to original address (s->pc = pc_start), so the -> -> exception will overwrite the generated illegal instruction and will be -> -> executed first. -> -> -s->pc is the guest PC -- moving that backwards will -> -not do anything about the generated TCG IR that's -> -already been written. You'd need to rewind the -> -write pointer in the IR stream, which there is -> -no support for doing AFAIK. -Ah, OK. Thanks for the explanation. May be we should check the size of -the instruction while decoding the prefixes and error out once we -exceed the limit. We would not generate any IR code. 
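The idea floated here, bounding the prefix count while still collecting prefixes, so that nothing is emitted for an over-long instruction, can be sketched as a standalone Python toy (invented names, not QEMU's decoder; the prefix set is the common x86 legacy prefixes):

```python
# Standalone sketch (not QEMU code): reject an over-long instruction while
# scanning prefixes, before any IR for it would be generated.  x86 caps an
# instruction at 15 bytes, so more than 14 bytes of prefixes can never be
# part of a legal instruction.

PREFIXES = {0xF0, 0xF2, 0xF3,                        # LOCK, REPNE, REP
            0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65,      # segment overrides
            0x66, 0x67}                              # operand/address size

class IllegalOpcode(Exception):
    pass

def decode_prefixes(code, pc):
    start = pc
    while pc < len(code) and code[pc] in PREFIXES:
        pc += 1
        if pc - start > 14:
            raise IllegalOpcode("instruction would exceed 15 bytes")
    return pc  # pc now points at the opcode byte

# A page worth of LOCK prefixes in front of NOP -- the trick used in the
# exploit -- is rejected while still scanning prefixes.
try:
    decode_prefixes(bytes([0xF0] * 4096) + b"\x90", 0)
    rejected = False
except IllegalOpcode:
    rejected = True
assert rejected
```

The limit enforced this way is conservative, as Richard notes below: checking only the prefixes bounds the total at roughly 14 prefix + opcode + modrm/sib/displacement + immediate bytes rather than a strict 15, but that is already enough to keep a block from spanning three pages.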
- --- -Pranith - -On 03/23/2017 02:29 AM, Pranith Kumar wrote: -On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell -<address@hidden> wrote: -On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote: -On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell -<address@hidden> wrote: -This doesn't look right because it means we'll check -only after we've emitted all the code to do the -instruction operation, so the effect will be -"execute instruction, then take illegal-opcode -exception". -The pc is restored to original address (s->pc = pc_start), so the -exception will overwrite the generated illegal instruction and will be -executed first. -s->pc is the guest PC -- moving that backwards will -not do anything about the generated TCG IR that's -already been written. You'd need to rewind the -write pointer in the IR stream, which there is -no support for doing AFAIK. -Ah, OK. Thanks for the explanation. May be we should check the size of -the instruction while decoding the prefixes and error out once we -exceed the limit. We would not generate any IR code. -Yes. -It would not enforce a true limit of 15 bytes, since you can't know that until -you've done the rest of the decode. But you'd be able to say that no more than -14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 bytes is used. -Which does fix the bug. - - -r~ - -On 22/03/2017 21:01, Richard Henderson wrote: -> -> -> -> Ah, OK. Thanks for the explanation. May be we should check the size of -> -> the instruction while decoding the prefixes and error out once we -> -> exceed the limit. We would not generate any IR code. -> -> -Yes. -> -> -It would not enforce a true limit of 15 bytes, since you can't know that -> -until you've done the rest of the decode. But you'd be able to say that -> -no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 -> -bytes is used. -> -> -Which does fix the bug. -Yeah, that would work for 2.9 if somebody wants to put together a patch. 
- Ensuring that all instruction fetching happens before translation side -effects is a little harder, but perhaps it's also the opportunity to get -rid of s->rip_offset which is a little ugly. - -Paolo - -On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote: -> -> -> -On 22/03/2017 21:01, Richard Henderson wrote: -> ->> -> ->> Ah, OK. Thanks for the explanation. May be we should check the size of -> ->> the instruction while decoding the prefixes and error out once we -> ->> exceed the limit. We would not generate any IR code. -> -> -> -> Yes. -> -> -> -> It would not enforce a true limit of 15 bytes, since you can't know that -> -> until you've done the rest of the decode. But you'd be able to say that -> -> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 -> -> bytes is used. -> -> -> -> Which does fix the bug. -> -> -Yeah, that would work for 2.9 if somebody wants to put together a patch. -> -Ensuring that all instruction fetching happens before translation side -> -effects is a little harder, but perhaps it's also the opportunity to get -> -rid of s->rip_offset which is a little ugly. -How about the following? - -diff --git a/target/i386/translate.c b/target/i386/translate.c -index 72c1b03a2a..67c58b8900 100644 ---- a/target/i386/translate.c -+++ b/target/i386/translate.c -@@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State -*env, DisasContext *s, - s->vex_l = 0; - s->vex_v = 0; - next_byte: -+ /* The prefixes can atmost be 14 bytes since x86 has an upper -+ limit of 15 bytes for the instruction */ -+ if (s->pc - pc_start > 14) { -+ goto illegal_op; -+ } - b = cpu_ldub_code(env, s->pc); - s->pc++; - /* Collect prefixes. */ - --- -Pranith - -On 23/03/2017 17:50, Pranith Kumar wrote: -> -On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote: -> -> -> -> -> -> On 22/03/2017 21:01, Richard Henderson wrote: -> ->>> -> ->>> Ah, OK. Thanks for the explanation. 
May be we should check the size of -> ->>> the instruction while decoding the prefixes and error out once we -> ->>> exceed the limit. We would not generate any IR code. -> ->> -> ->> Yes. -> ->> -> ->> It would not enforce a true limit of 15 bytes, since you can't know that -> ->> until you've done the rest of the decode. But you'd be able to say that -> ->> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 -> ->> bytes is used. -> ->> -> ->> Which does fix the bug. -> -> -> -> Yeah, that would work for 2.9 if somebody wants to put together a patch. -> -> Ensuring that all instruction fetching happens before translation side -> -> effects is a little harder, but perhaps it's also the opportunity to get -> -> rid of s->rip_offset which is a little ugly. -> -> -How about the following? -> -> -diff --git a/target/i386/translate.c b/target/i386/translate.c -> -index 72c1b03a2a..67c58b8900 100644 -> ---- a/target/i386/translate.c -> -+++ b/target/i386/translate.c -> -@@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State -> -*env, DisasContext *s, -> -s->vex_l = 0; -> -s->vex_v = 0; -> -next_byte: -> -+ /* The prefixes can atmost be 14 bytes since x86 has an upper -> -+ limit of 15 bytes for the instruction */ -> -+ if (s->pc - pc_start > 14) { -> -+ goto illegal_op; -> -+ } -> -b = cpu_ldub_code(env, s->pc); -> -s->pc++; -> -/* Collect prefixes. */ -Please make the comment more verbose, based on Richard's remark. We -should apply it to 2.9. - -Also, QEMU usually formats comments with stars on every line. - -Paolo - -On Thu, Mar 23, 2017 at 1:37 PM, Paolo Bonzini <address@hidden> wrote: -> -> -> -On 23/03/2017 17:50, Pranith Kumar wrote: -> -> On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote: -> ->> -> ->> -> ->> On 22/03/2017 21:01, Richard Henderson wrote: -> ->>>> -> ->>>> Ah, OK. Thanks for the explanation. 
May be we should check the size of -> ->>>> the instruction while decoding the prefixes and error out once we -> ->>>> exceed the limit. We would not generate any IR code. -> ->>> -> ->>> Yes. -> ->>> -> ->>> It would not enforce a true limit of 15 bytes, since you can't know that -> ->>> until you've done the rest of the decode. But you'd be able to say that -> ->>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 -> ->>> bytes is used. -> ->>> -> ->>> Which does fix the bug. -> ->> -> ->> Yeah, that would work for 2.9 if somebody wants to put together a patch. -> ->> Ensuring that all instruction fetching happens before translation side -> ->> effects is a little harder, but perhaps it's also the opportunity to get -> ->> rid of s->rip_offset which is a little ugly. -> -> -> -> How about the following? -> -> -> -> diff --git a/target/i386/translate.c b/target/i386/translate.c -> -> index 72c1b03a2a..67c58b8900 100644 -> -> --- a/target/i386/translate.c -> -> +++ b/target/i386/translate.c -> -> @@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State -> -> *env, DisasContext *s, -> -> s->vex_l = 0; -> -> s->vex_v = 0; -> -> next_byte: -> -> + /* The prefixes can atmost be 14 bytes since x86 has an upper -> -> + limit of 15 bytes for the instruction */ -> -> + if (s->pc - pc_start > 14) { -> -> + goto illegal_op; -> -> + } -> -> b = cpu_ldub_code(env, s->pc); -> -> s->pc++; -> -> /* Collect prefixes. */ -> -> -Please make the comment more verbose, based on Richard's remark. We -> -should apply it to 2.9. -> -> -Also, QEMU usually formats comments with stars on every line. -OK. I'll send a proper patch with updated comment. 
- -Thanks, --- -Pranith - diff --git a/results/classifier/02/other/16056596 b/results/classifier/02/other/16056596 deleted file mode 100644 index 4891231ac..000000000 --- a/results/classifier/02/other/16056596 +++ /dev/null @@ -1,99 +0,0 @@ -other: 0.980 -semantic: 0.979 -instruction: 0.975 -boot: 0.971 -mistranslation: 0.961 - -[BUG][powerpc] KVM Guest Boot Failure and Hang at "Booting Linux via __start()" - -Bug Description: -Encountering a boot failure when launching a KVM guest with -'qemu-system-ppc64'. The guest hangs at boot, and the QEMU monitor -crashes. -Reproduction Steps: -# qemu-system-ppc64 --version -QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f) -Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers -# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine -pseries,accel=kvm \ --m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \ - -device virtio-scsi-pci,id=scsi \ --drive -file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2 -\ --device scsi-hd,drive=drive0,bus=scsi.0 \ - -netdev bridge,id=net0,br=virbr0 \ - -device virtio-net-pci,netdev=net0 \ - -serial pty \ - -device virtio-balloon-pci \ - -cpu host -QEMU 9.2.50 monitor - type 'help' for more information -char device redirected to /dev/pts/2 (label serial0) -(qemu) -(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but -unavailable: IRQ_XIVE capability must be present for KVM -Falling back to kernel-irqchip=off -** Qemu Hang - -(In another ssh session) -# screen /dev/pts/2 -Preparing to boot Linux version 6.10.4-200.fc40.ppc64le -(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801 -(Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11 -15:20:17 UTC 2024 -Detected machine type: 0000000000000101 -command line: -BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le -root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M -Max number of cores 
passed to firmware: 2048 (NR_CPUS = 2048) -Calling ibm,client-architecture-support... done -memory layout at init: - memory_limit : 0000000000000000 (16 MB aligned) - alloc_bottom : 0000000008200000 - alloc_top : 0000000030000000 - alloc_top_hi : 0000000800000000 - rmo_top : 0000000030000000 - ram_top : 0000000800000000 -instantiating rtas at 0x000000002fff0000... done -prom_hold_cpus: skipped -copying OF device tree... -Building dt strings... -Building dt structure... -Device tree strings 0x0000000008210000 -> 0x0000000008210bd0 -Device tree struct 0x0000000008220000 -> 0x0000000008230000 -Quiescing Open Firmware ... -Booting Linux via __start() @ 0x0000000000440000 ... -** Guest Console Hang - - -Git Bisect: -Performing git bisect points to the following patch: -# git bisect bad -e8291ec16da80566c121c68d9112be458954d90b is the first bad commit -commit e8291ec16da80566c121c68d9112be458954d90b (HEAD) -Author: Nicholas Piggin <npiggin@gmail.com> -Date: Thu Dec 19 13:40:31 2024 +1000 - - target/ppc: fix timebase register reset state -(H)DEC and PURR get reset before icount does, which causes them to -be -skewed and not match the init state. This can cause replay to not -match the recorded trace exactly. For DEC and HDEC this is usually -not -noticable since they tend to get programmed before affecting the - target machine. PURR has been observed to cause replay bugs when - running Linux. - - Fix this by resetting using a time of 0. - - Message-ID: <20241219034035.1826173-2-npiggin@gmail.com> - Signed-off-by: Nicholas Piggin <npiggin@gmail.com> - - hw/ppc/ppc.c | 11 ++++++++--- - 1 file changed, 8 insertions(+), 3 deletions(-) - - -Reverting the patch helps boot the guest. 
-Thanks, -Misbah Anjum N - diff --git a/results/classifier/02/other/16201167 b/results/classifier/02/other/16201167 deleted file mode 100644 index 03d454cf4..000000000 --- a/results/classifier/02/other/16201167 +++ /dev/null @@ -1,101 +0,0 @@ -other: 0.954 -mistranslation: 0.947 -semantic: 0.933 -instruction: 0.922 -boot: 0.845 - -[BUG] Qemu abort with error "kvm_mem_ioeventfd_add: error adding ioeventfd: File exists (17)" - -Hi list, - -When I did some tests in my virtual domain with live-attached virtio deivces, I -got a coredump file of Qemu. - -The error print from qemu is "kvm_mem_ioeventfd_add: error adding ioeventfd: -File exists (17)". -And the call trace in the coredump file displays as below: -#0 0x0000ffff89acecc8 in ?? () from /usr/lib64/libc.so.6 -#1 0x0000ffff89a8acbc in raise () from /usr/lib64/libc.so.6 -#2 0x0000ffff89a78d2c in abort () from /usr/lib64/libc.so.6 -#3 0x0000aaaabd7ccf1c in kvm_mem_ioeventfd_add (listener=<optimized out>, -section=<optimized out>, match_data=<optimized out>, data=<optimized out>, -e=<optimized out>) at ../accel/kvm/kvm-all.c:1607 -#4 0x0000aaaabd6e0304 in address_space_add_del_ioeventfds (fds_old_nb=164, -fds_old=0xffff5c80a1d0, fds_new_nb=160, fds_new=0xffff5c565080, -as=0xaaaabdfa8810 <address_space_memory>) - at ../softmmu/memory.c:795 -#5 address_space_update_ioeventfds (as=0xaaaabdfa8810 <address_space_memory>) -at ../softmmu/memory.c:856 -#6 0x0000aaaabd6e24d8 in memory_region_commit () at ../softmmu/memory.c:1113 -#7 0x0000aaaabd6e25c4 in memory_region_transaction_commit () at -../softmmu/memory.c:1144 -#8 0x0000aaaabd394eb4 in pci_bridge_update_mappings -(br=br@entry=0xaaaae755f7c0) at ../hw/pci/pci_bridge.c:248 -#9 0x0000aaaabd394f4c in pci_bridge_write_config (d=0xaaaae755f7c0, -address=44, val=<optimized out>, len=4) at ../hw/pci/pci_bridge.c:272 -#10 0x0000aaaabd39a928 in rp_write_config (d=0xaaaae755f7c0, address=44, -val=128, len=4) at ../hw/pci-bridge/pcie_root_port.c:39 -#11 0x0000aaaabd6df328 in 
memory_region_write_accessor (mr=0xaaaae63898d0, -addr=65580, value=<optimized out>, size=4, shift=<optimized out>, -mask=<optimized out>, attrs=...) at ../softmmu/memory.c:494 -#12 0x0000aaaabd6dcb6c in access_with_adjusted_size (addr=addr@entry=65580, -value=value@entry=0xffff817adc78, size=size@entry=4, access_size_min=<optimized -out>, access_size_max=<optimized out>, - access_fn=access_fn@entry=0xaaaabd6df284 <memory_region_write_accessor>, -mr=mr@entry=0xaaaae63898d0, attrs=attrs@entry=...) at ../softmmu/memory.c:556 -#13 0x0000aaaabd6e0dc8 in memory_region_dispatch_write -(mr=mr@entry=0xaaaae63898d0, addr=65580, data=<optimized out>, op=MO_32, -attrs=attrs@entry=...) at ../softmmu/memory.c:1534 -#14 0x0000aaaabd6d0574 in flatview_write_continue (fv=fv@entry=0xffff5c02da00, -addr=addr@entry=275146407980, attrs=attrs@entry=..., -ptr=ptr@entry=0xffff8aa8c028, len=len@entry=4, - addr1=<optimized out>, l=<optimized out>, mr=mr@entry=0xaaaae63898d0) at -/usr/src/debug/qemu-6.2.0-226.aarch64/include/qemu/host-utils.h:165 -#15 0x0000aaaabd6d4584 in flatview_write (len=4, buf=0xffff8aa8c028, attrs=..., -addr=275146407980, fv=0xffff5c02da00) at ../softmmu/physmem.c:3375 -#16 address_space_write (as=<optimized out>, addr=275146407980, attrs=..., -buf=buf@entry=0xffff8aa8c028, len=4) at ../softmmu/physmem.c:3467 -#17 0x0000aaaabd6d462c in address_space_rw (as=<optimized out>, addr=<optimized -out>, attrs=..., attrs@entry=..., buf=buf@entry=0xffff8aa8c028, len=<optimized -out>, is_write=<optimized out>) - at ../softmmu/physmem.c:3477 -#18 0x0000aaaabd7cf6e8 in kvm_cpu_exec (cpu=cpu@entry=0xaaaae625dfd0) at -../accel/kvm/kvm-all.c:2970 -#19 0x0000aaaabd7d09bc in kvm_vcpu_thread_fn (arg=arg@entry=0xaaaae625dfd0) at -../accel/kvm/kvm-accel-ops.c:49 -#20 0x0000aaaabd94ccd8 in qemu_thread_start (args=<optimized out>) at -../util/qemu-thread-posix.c:559 - - -By printing more info in the coredump file, I found that the addr of -fds_old[146] and fds_new[146] are same, but 
fds_old[146] belonged to a -live-attached virtio-scsi device while fds_new[146] was owned by another -live-attached virtio-net. -The reason why addr conflicted was then been found from vm's console log. Just -before qemu aborted, the guest kernel crashed and kdump.service booted the -dump-capture kernel where re-alloced address for the devices. -Because those virtio devices were live-attached after vm creating, different -addr may been assigned to them in the dump-capture kernel: - -the initial kernel booting log: -[ 1.663297] pci 0000:00:02.1: BAR 14: assigned [mem 0x11900000-0x11afffff] -[ 1.664560] pci 0000:00:02.1: BAR 15: assigned [mem -0x8001800000-0x80019fffff 64bit pref] - -the dump-capture kernel booting log: -[ 1.845211] pci 0000:00:02.0: BAR 14: assigned [mem 0x11900000-0x11bfffff] -[ 1.846542] pci 0000:00:02.0: BAR 15: assigned [mem -0x8001800000-0x8001afffff 64bit pref] - - -I think directly aborting the qemu process may not be the best choice in this -case cuz it will interrupt the work of kdump.service so that failed to generate -memory dump of the crashed guest kernel. -Perhaps, IMO, the error could be simply ignored in this case and just let kdump -to reboot the system after memory-dump finishing, but I failed to find a -suitable judgment in the codes. - -Any solution for this problem? Hope I can get some helps here. - -Hao - diff --git a/results/classifier/02/other/16228234 b/results/classifier/02/other/16228234 deleted file mode 100644 index b530c329a..000000000 --- a/results/classifier/02/other/16228234 +++ /dev/null @@ -1,1845 +0,0 @@ -other: 0.535 -mistranslation: 0.518 -instruction: 0.442 -semantic: 0.411 -boot: 0.402 - -[Qemu-devel] [Bug?] 
BQL about live migration - -Hello Juan & Dave, - -We hit a bug in our test: -Network error occurs when migrating a guest, libvirt then rollback the -migration, causes qemu coredump -qemu log: -2017-03-01T12:54:33.904949+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: - {"timestamp": {"seconds": 1488344073, "microseconds": 904914}, "event": "STOP"} -2017-03-01T12:54:37.522500+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: - qmp_cmd_name: migrate_cancel -2017-03-01T12:54:37.522607+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: - {"timestamp": {"seconds": 1488344077, "microseconds": 522556}, "event": -"MIGRATION", "data": {"status": "cancelling"}} -2017-03-01T12:54:37.524671+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: - qmp_cmd_name: cont -2017-03-01T12:54:37.524733+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: - virtio-balloon device status is 7 that means DRIVER OK -2017-03-01T12:54:37.525434+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: - virtio-net device status is 7 that means DRIVER OK -2017-03-01T12:54:37.525484+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: - virtio-blk device status is 7 that means DRIVER OK -2017-03-01T12:54:37.525562+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: - virtio-serial device status is 7 that means DRIVER OK -2017-03-01T12:54:37.527653+08:00|info|qemu[17672]|[17672]|vm_start[981]|: -vm_state-notify:3ms -2017-03-01T12:54:37.528523+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: - {"timestamp": {"seconds": 1488344077, "microseconds": 527699}, "event": -"RESUME"} -2017-03-01T12:54:37.530680+08:00|info|qemu[17672]|[33614]|migration_bitmap_sync[720]|: - this iteration cycle takes 3s, new dirtied data:0MB -2017-03-01T12:54:37.530909+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: - {"timestamp": {"seconds": 1488344077, "microseconds": 530733}, "event": -"MIGRATION_PASS", "data": {"pass": 3}} 
-2017-03-01T04:54:37.530997Z qemu-kvm: socket_writev_buffer: Got err=32 for -(131583/18446744073709551615) -qemu-kvm: /home/abuild/rpmbuild/BUILD/qemu-kvm-2.6.0/hw/net/virtio_net.c:1519: -virtio_net_save: Assertion `!n->vhost_started' failed. -2017-03-01 12:54:43.028: shutting down - -> -From qemu log, qemu received and processed migrate_cancel/cont qmp commands -after guest been stopped and entered the last round of migration. Then -migration thread try to save device state when guest is running(started by -cont command), causes assert and coredump. -This is because in last iter, we call cpu_synchronize_all_states() to -synchronize vcpu states, this call will release qemu_global_mutex and wait -for do_kvm_cpu_synchronize_state() to be executed on target vcpu: -(gdb) bt -#0 0x00007f763d1046d5 in pthread_cond_wait@@GLIBC_2.3.2 () from -/lib64/libpthread.so.0 -#1 0x00007f7643e51d7f in qemu_cond_wait (cond=0x7f764445eca0 <qemu_work_cond>, -mutex=0x7f764445eba0 <qemu_global_mutex>) at util/qemu-thread-posix.c:132 -#2 0x00007f7643a2e154 in run_on_cpu (cpu=0x7f7644e06d80, func=0x7f7643a46413 -<do_kvm_cpu_synchronize_state>, data=0x7f7644e06d80) at -/mnt/public/yanghy/qemu-kvm/cpus.c:995 -#3 0x00007f7643a46487 in kvm_cpu_synchronize_state (cpu=0x7f7644e06d80) at -/mnt/public/yanghy/qemu-kvm/kvm-all.c:1805 -#4 0x00007f7643a2c700 in cpu_synchronize_state (cpu=0x7f7644e06d80) at -/mnt/public/yanghy/qemu-kvm/include/sysemu/kvm.h:457 -#5 0x00007f7643a2db0c in cpu_synchronize_all_states () at -/mnt/public/yanghy/qemu-kvm/cpus.c:766 -#6 0x00007f7643a67b5b in qemu_savevm_state_complete_precopy (f=0x7f76462f2d30, -iterable_only=false) at /mnt/public/yanghy/qemu-kvm/migration/savevm.c:1051 -#7 0x00007f7643d121e9 in migration_completion (s=0x7f76443e78c0 -<current_migration.37571>, current_active_state=4, -old_vm_running=0x7f74343fda00, start_time=0x7f74343fda08) at -migration/migration.c:1753 -#8 0x00007f7643d126c5 in migration_thread (opaque=0x7f76443e78c0 
-<current_migration.37571>) at migration/migration.c:1922
-#9 0x00007f763d100dc5 in start_thread () from /lib64/libpthread.so.0
-#10 0x00007f763ce2e71d in clone () from /lib64/libc.so.6
-(gdb) p iothread_locked
-$1 = true
-
-and then, qemu main thread been executed, it won't block because migration
-thread released the qemu_global_mutex:
-(gdb) thr 1
-[Switching to thread 1 (Thread 0x7fe298e08bc0 (LWP 30767))]
-#0 os_host_main_loop_wait (timeout=931565) at main-loop.c:270
-270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout %d\n",
-timeout);
-(gdb) p iothread_locked
-$2 = true
-(gdb) l 268
-263
-264 ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
-timeout);
-265
-266
-267 if (timeout) {
-268 qemu_mutex_lock_iothread();
-269 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
-270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout %d\n",
-timeout);
-271 }
-272 }
-(gdb)
-
-So, although we've hold iothread_lock in stop & copy phase of migration, we
-can't guarantee the iothread been locked all through the stop & copy phase,
-any thoughts on how to solve this problem?
- - -Thanks, --Gonglei - -On Fri, 03/03 09:29, Gonglei (Arei) wrote: -> -Hello Juan & Dave, -> -> -We hit a bug in our test: -> -Network error occurs when migrating a guest, libvirt then rollback the -> -migration, causes qemu coredump -> -qemu log: -> -2017-03-01T12:54:33.904949+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344073, "microseconds": 904914}, "event": -> -"STOP"} -> -2017-03-01T12:54:37.522500+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -qmp_cmd_name: migrate_cancel -> -2017-03-01T12:54:37.522607+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344077, "microseconds": 522556}, "event": -> -"MIGRATION", "data": {"status": "cancelling"}} -> -2017-03-01T12:54:37.524671+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -qmp_cmd_name: cont -> -2017-03-01T12:54:37.524733+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-balloon device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525434+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-net device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525484+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-blk device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525562+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-serial device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.527653+08:00|info|qemu[17672]|[17672]|vm_start[981]|: -> -vm_state-notify:3ms -> -2017-03-01T12:54:37.528523+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344077, "microseconds": 527699}, "event": -> -"RESUME"} -> -2017-03-01T12:54:37.530680+08:00|info|qemu[17672]|[33614]|migration_bitmap_sync[720]|: -> -this iteration cycle takes 3s, new dirtied data:0MB -> -2017-03-01T12:54:37.530909+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> 
-{"timestamp": {"seconds": 1488344077, "microseconds": 530733}, "event": -> -"MIGRATION_PASS", "data": {"pass": 3}} -> -2017-03-01T04:54:37.530997Z qemu-kvm: socket_writev_buffer: Got err=32 for -> -(131583/18446744073709551615) -> -qemu-kvm: -> -/home/abuild/rpmbuild/BUILD/qemu-kvm-2.6.0/hw/net/virtio_net.c:1519: -> -virtio_net_save: Assertion `!n->vhost_started' failed. -> -2017-03-01 12:54:43.028: shutting down -> -> -From qemu log, qemu received and processed migrate_cancel/cont qmp commands -> -after guest been stopped and entered the last round of migration. Then -> -migration thread try to save device state when guest is running(started by -> -cont command), causes assert and coredump. -> -This is because in last iter, we call cpu_synchronize_all_states() to -> -synchronize vcpu states, this call will release qemu_global_mutex and wait -> -for do_kvm_cpu_synchronize_state() to be executed on target vcpu: -> -(gdb) bt -> -#0 0x00007f763d1046d5 in pthread_cond_wait@@GLIBC_2.3.2 () from -> -/lib64/libpthread.so.0 -> -#1 0x00007f7643e51d7f in qemu_cond_wait (cond=0x7f764445eca0 -> -<qemu_work_cond>, mutex=0x7f764445eba0 <qemu_global_mutex>) at -> -util/qemu-thread-posix.c:132 -> -#2 0x00007f7643a2e154 in run_on_cpu (cpu=0x7f7644e06d80, func=0x7f7643a46413 -> -<do_kvm_cpu_synchronize_state>, data=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/cpus.c:995 -> -#3 0x00007f7643a46487 in kvm_cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/kvm-all.c:1805 -> -#4 0x00007f7643a2c700 in cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/include/sysemu/kvm.h:457 -> -#5 0x00007f7643a2db0c in cpu_synchronize_all_states () at -> -/mnt/public/yanghy/qemu-kvm/cpus.c:766 -> -#6 0x00007f7643a67b5b in qemu_savevm_state_complete_precopy -> -(f=0x7f76462f2d30, iterable_only=false) at -> -/mnt/public/yanghy/qemu-kvm/migration/savevm.c:1051 -> -#7 0x00007f7643d121e9 in migration_completion (s=0x7f76443e78c0 -> 
-<current_migration.37571>, current_active_state=4,
-> -old_vm_running=0x7f74343fda00, start_time=0x7f74343fda08) at
-> -migration/migration.c:1753
-> -#8 0x00007f7643d126c5 in migration_thread (opaque=0x7f76443e78c0
-> -<current_migration.37571>) at migration/migration.c:1922
-> -#9 0x00007f763d100dc5 in start_thread () from /lib64/libpthread.so.0
-> -#10 0x00007f763ce2e71d in clone () from /lib64/libc.so.6
-> -(gdb) p iothread_locked
-> -$1 = true
-> -
-> -and then, qemu main thread been executed, it won't block because migration
-> -thread released the qemu_global_mutex:
-> -(gdb) thr 1
-> -[Switching to thread 1 (Thread 0x7fe298e08bc0 (LWP 30767))]
-> -#0 os_host_main_loop_wait (timeout=931565) at main-loop.c:270
-> -270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -%d\n", timeout);
-> -(gdb) p iothread_locked
-> -$2 = true
-> -(gdb) l 268
-> -263
-> -264 ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
-> -timeout);
-> -265
-> -266
-> -267 if (timeout) {
-> -268 qemu_mutex_lock_iothread();
-> -269 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
-> -270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -%d\n", timeout);
-> -271 }
-> -272 }
-> -(gdb)
-> -
-> -So, although we've hold iothread_lock in stop & copy phase of migration, we
-> -can't guarantee the iothread been locked all through the stop & copy phase,
-> -any thoughts on how to solve this problem?
-Could you post a backtrace of the assertion?
- -Fam - -On 2017/3/3 18:42, Fam Zheng wrote: -> -On Fri, 03/03 09:29, Gonglei (Arei) wrote: -> -> Hello Juan & Dave, -> -> -> -> We hit a bug in our test: -> -> Network error occurs when migrating a guest, libvirt then rollback the -> -> migration, causes qemu coredump -> -> qemu log: -> -> 2017-03-01T12:54:33.904949+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> -> {"timestamp": {"seconds": 1488344073, "microseconds": 904914}, "event": -> -> "STOP"} -> -> 2017-03-01T12:54:37.522500+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -> qmp_cmd_name: migrate_cancel -> -> 2017-03-01T12:54:37.522607+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -> {"timestamp": {"seconds": 1488344077, "microseconds": 522556}, "event": -> -> "MIGRATION", "data": {"status": "cancelling"}} -> -> 2017-03-01T12:54:37.524671+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -> qmp_cmd_name: cont -> -> 2017-03-01T12:54:37.524733+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -> virtio-balloon device status is 7 that means DRIVER OK -> -> 2017-03-01T12:54:37.525434+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -> virtio-net device status is 7 that means DRIVER OK -> -> 2017-03-01T12:54:37.525484+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -> virtio-blk device status is 7 that means DRIVER OK -> -> 2017-03-01T12:54:37.525562+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -> virtio-serial device status is 7 that means DRIVER OK -> -> 2017-03-01T12:54:37.527653+08:00|info|qemu[17672]|[17672]|vm_start[981]|: -> -> vm_state-notify:3ms -> -> 2017-03-01T12:54:37.528523+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -> {"timestamp": {"seconds": 1488344077, "microseconds": 527699}, "event": -> -> "RESUME"} -> -> 2017-03-01T12:54:37.530680+08:00|info|qemu[17672]|[33614]|migration_bitmap_sync[720]|: -> -> this iteration cycle takes 3s, new dirtied data:0MB -> -> 
2017-03-01T12:54:37.530909+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> -> {"timestamp": {"seconds": 1488344077, "microseconds": 530733}, "event": -> -> "MIGRATION_PASS", "data": {"pass": 3}} -> -> 2017-03-01T04:54:37.530997Z qemu-kvm: socket_writev_buffer: Got err=32 for -> -> (131583/18446744073709551615) -> -> qemu-kvm: -> -> /home/abuild/rpmbuild/BUILD/qemu-kvm-2.6.0/hw/net/virtio_net.c:1519: -> -> virtio_net_save: Assertion `!n->vhost_started' failed. -> -> 2017-03-01 12:54:43.028: shutting down -> -> -> -> From qemu log, qemu received and processed migrate_cancel/cont qmp commands -> -> after guest been stopped and entered the last round of migration. Then -> -> migration thread try to save device state when guest is running(started by -> -> cont command), causes assert and coredump. -> -> This is because in last iter, we call cpu_synchronize_all_states() to -> -> synchronize vcpu states, this call will release qemu_global_mutex and wait -> -> for do_kvm_cpu_synchronize_state() to be executed on target vcpu: -> -> (gdb) bt -> -> #0 0x00007f763d1046d5 in pthread_cond_wait@@GLIBC_2.3.2 () from -> -> /lib64/libpthread.so.0 -> -> #1 0x00007f7643e51d7f in qemu_cond_wait (cond=0x7f764445eca0 -> -> <qemu_work_cond>, mutex=0x7f764445eba0 <qemu_global_mutex>) at -> -> util/qemu-thread-posix.c:132 -> -> #2 0x00007f7643a2e154 in run_on_cpu (cpu=0x7f7644e06d80, -> -> func=0x7f7643a46413 <do_kvm_cpu_synchronize_state>, data=0x7f7644e06d80) at -> -> /mnt/public/yanghy/qemu-kvm/cpus.c:995 -> -> #3 0x00007f7643a46487 in kvm_cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -> /mnt/public/yanghy/qemu-kvm/kvm-all.c:1805 -> -> #4 0x00007f7643a2c700 in cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -> /mnt/public/yanghy/qemu-kvm/include/sysemu/kvm.h:457 -> -> #5 0x00007f7643a2db0c in cpu_synchronize_all_states () at -> -> /mnt/public/yanghy/qemu-kvm/cpus.c:766 -> -> #6 0x00007f7643a67b5b in qemu_savevm_state_complete_precopy -> -> (f=0x7f76462f2d30, 
iterable_only=false) at
-> -> /mnt/public/yanghy/qemu-kvm/migration/savevm.c:1051
-> -> #7 0x00007f7643d121e9 in migration_completion (s=0x7f76443e78c0
-> -> <current_migration.37571>, current_active_state=4,
-> -> old_vm_running=0x7f74343fda00, start_time=0x7f74343fda08) at
-> -> migration/migration.c:1753
-> -> #8 0x00007f7643d126c5 in migration_thread (opaque=0x7f76443e78c0
-> -> <current_migration.37571>) at migration/migration.c:1922
-> -> #9 0x00007f763d100dc5 in start_thread () from /lib64/libpthread.so.0
-> -> #10 0x00007f763ce2e71d in clone () from /lib64/libc.so.6
-> -> (gdb) p iothread_locked
-> -> $1 = true
-> ->
-> -> and then, qemu main thread been executed, it won't block because migration
-> -> thread released the qemu_global_mutex:
-> -> (gdb) thr 1
-> -> [Switching to thread 1 (Thread 0x7fe298e08bc0 (LWP 30767))]
-> -> #0 os_host_main_loop_wait (timeout=931565) at main-loop.c:270
-> -> 270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -> %d\n", timeout);
-> -> (gdb) p iothread_locked
-> -> $2 = true
-> -> (gdb) l 268
-> -> 263
-> -> 264 ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
-> -> timeout);
-> -> 265
-> -> 266
-> -> 267 if (timeout) {
-> -> 268 qemu_mutex_lock_iothread();
-> -> 269 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
-> -> 270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -> %d\n", timeout);
-> -> 271 }
-> -> 272 }
-> -> (gdb)
-> ->
-> -> So, although we've hold iothread_lock in stop & copy phase of migration, we
-> -> can't guarantee the iothread been locked all through the stop & copy phase,
-> -> any thoughts on how to solve this problem?
-> ->
-Could you post a backtrace of the assertion?
-#0 0x00007f97b1fbe5d7 in raise () from /usr/lib64/libc.so.6 -#1 0x00007f97b1fbfcc8 in abort () from /usr/lib64/libc.so.6 -#2 0x00007f97b1fb7546 in __assert_fail_base () from /usr/lib64/libc.so.6 -#3 0x00007f97b1fb75f2 in __assert_fail () from /usr/lib64/libc.so.6 -#4 0x000000000049fd19 in virtio_net_save (f=0x7f97a8ca44d0, -opaque=0x7f97a86e9018) at /usr/src/debug/qemu-kvm-2.6.0/hw/ -#5 0x000000000047e380 in vmstate_save_old_style (address@hidden, -address@hidden, se=0x7f9 -#6 0x000000000047fb93 in vmstate_save (address@hidden, address@hidden, -address@hidden -#7 0x0000000000481ad2 in qemu_savevm_state_complete_precopy (f=0x7f97a8ca44d0, -address@hidden) -#8 0x00000000006c6b60 in migration_completion (address@hidden -<current_migration.38312>, current_active_state=curre - address@hidden) at migration/migration.c:1761 -#9 0x00000000006c71db in migration_thread (address@hidden -<current_migration.38312>) at migration/migrati - -> -> -Fam -> --- -Thanks, -Yang - -* Gonglei (Arei) (address@hidden) wrote: -> -Hello Juan & Dave, -cc'ing in pbonzini since it's magic involving cpu_synrhonize_all_states() - -> -We hit a bug in our test: -> -Network error occurs when migrating a guest, libvirt then rollback the -> -migration, causes qemu coredump -> -qemu log: -> -2017-03-01T12:54:33.904949+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344073, "microseconds": 904914}, "event": -> -"STOP"} -> -2017-03-01T12:54:37.522500+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -qmp_cmd_name: migrate_cancel -> -2017-03-01T12:54:37.522607+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344077, "microseconds": 522556}, "event": -> -"MIGRATION", "data": {"status": "cancelling"}} -> -2017-03-01T12:54:37.524671+08:00|info|qemu[17672]|[17672]|handle_qmp_command[3930]|: -> -qmp_cmd_name: cont -> -2017-03-01T12:54:37.524733+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> 
-virtio-balloon device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525434+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-net device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525484+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-blk device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.525562+08:00|info|qemu[17672]|[17672]|virtio_set_status[725]|: -> -virtio-serial device status is 7 that means DRIVER OK -> -2017-03-01T12:54:37.527653+08:00|info|qemu[17672]|[17672]|vm_start[981]|: -> -vm_state-notify:3ms -> -2017-03-01T12:54:37.528523+08:00|info|qemu[17672]|[17672]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344077, "microseconds": 527699}, "event": -> -"RESUME"} -> -2017-03-01T12:54:37.530680+08:00|info|qemu[17672]|[33614]|migration_bitmap_sync[720]|: -> -this iteration cycle takes 3s, new dirtied data:0MB -> -2017-03-01T12:54:37.530909+08:00|info|qemu[17672]|[33614]|monitor_qapi_event_emit[479]|: -> -{"timestamp": {"seconds": 1488344077, "microseconds": 530733}, "event": -> -"MIGRATION_PASS", "data": {"pass": 3}} -> -2017-03-01T04:54:37.530997Z qemu-kvm: socket_writev_buffer: Got err=32 for -> -(131583/18446744073709551615) -> -qemu-kvm: -> -/home/abuild/rpmbuild/BUILD/qemu-kvm-2.6.0/hw/net/virtio_net.c:1519: -> -virtio_net_save: Assertion `!n->vhost_started' failed. -> -2017-03-01 12:54:43.028: shutting down -> -> -From qemu log, qemu received and processed migrate_cancel/cont qmp commands -> -after guest been stopped and entered the last round of migration. Then -> -migration thread try to save device state when guest is running(started by -> -cont command), causes assert and coredump. 
-> -This is because in last iter, we call cpu_synchronize_all_states() to -> -synchronize vcpu states, this call will release qemu_global_mutex and wait -> -for do_kvm_cpu_synchronize_state() to be executed on target vcpu: -> -(gdb) bt -> -#0 0x00007f763d1046d5 in pthread_cond_wait@@GLIBC_2.3.2 () from -> -/lib64/libpthread.so.0 -> -#1 0x00007f7643e51d7f in qemu_cond_wait (cond=0x7f764445eca0 -> -<qemu_work_cond>, mutex=0x7f764445eba0 <qemu_global_mutex>) at -> -util/qemu-thread-posix.c:132 -> -#2 0x00007f7643a2e154 in run_on_cpu (cpu=0x7f7644e06d80, func=0x7f7643a46413 -> -<do_kvm_cpu_synchronize_state>, data=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/cpus.c:995 -> -#3 0x00007f7643a46487 in kvm_cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/kvm-all.c:1805 -> -#4 0x00007f7643a2c700 in cpu_synchronize_state (cpu=0x7f7644e06d80) at -> -/mnt/public/yanghy/qemu-kvm/include/sysemu/kvm.h:457 -> -#5 0x00007f7643a2db0c in cpu_synchronize_all_states () at -> -/mnt/public/yanghy/qemu-kvm/cpus.c:766 -> -#6 0x00007f7643a67b5b in qemu_savevm_state_complete_precopy -> -(f=0x7f76462f2d30, iterable_only=false) at -> -/mnt/public/yanghy/qemu-kvm/migration/savevm.c:1051 -> -#7 0x00007f7643d121e9 in migration_completion (s=0x7f76443e78c0 -> -<current_migration.37571>, current_active_state=4, -> -old_vm_running=0x7f74343fda00, start_time=0x7f74343fda08) at -> -migration/migration.c:1753 -> -#8 0x00007f7643d126c5 in migration_thread (opaque=0x7f76443e78c0 -> -<current_migration.37571>) at migration/migration.c:1922 -> -#9 0x00007f763d100dc5 in start_thread () from /lib64/libpthread.so.0 -> -#10 0x00007f763ce2e71d in clone () from /lib64/libc.so.6 -> -(gdb) p iothread_locked -> -$1 = true -> -> -and then, qemu main thread been executed, it won't block because migration -> -thread released the qemu_global_mutex: -> -(gdb) thr 1 -> -[Switching to thread 1 (Thread 0x7fe298e08bc0 (LWP 30767))] -> -#0 os_host_main_loop_wait (timeout=931565) at 
main-loop.c:270
-> -270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -%d\n", timeout);
-> -(gdb) p iothread_locked
-> -$2 = true
-> -(gdb) l 268
-> -263
-> -264 ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len,
-> -timeout);
-> -265
-> -266
-> -267 if (timeout) {
-> -268 qemu_mutex_lock_iothread();
-> -269 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
-> -270 QEMU_LOG(LOG_INFO,"***** after qemu_pool_ns: timeout
-> -%d\n", timeout);
-> -271 }
-> -272 }
-> -(gdb)
-> -
-> -So, although we've hold iothread_lock in stop & copy phase of migration, we
-> -can't guarantee the iothread been locked all through the stop & copy phase,
-> -any thoughts on how to solve this problem?
-Ouch that's pretty nasty; I remember Paolo explaining to me a while ago that
-there were times when run_on_cpu would have to drop the BQL and I worried about
-it,
-but this is the 1st time I've seen an error due to it.
-
-Do you know what the migration state was at that point? Was it
-MIGRATION_STATUS_CANCELLING?
-I'm thinking perhaps we should stop 'cont' from continuing while migration is in
-MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - so
-that
-perhaps libvirt could avoid sending the 'cont' until then?
-
-Dave
-
-
-> -
-> -Thanks,
-> --Gonglei
-> ---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-On 03/03/2017 13:00, Dr. David Alan Gilbert wrote:
-> -Ouch that's pretty nasty; I remember Paolo explaining to me a while ago that
-> -there were times when run_on_cpu would have to drop the BQL and I worried
-> -about it,
-> -but this is the 1st time I've seen an error due to it.
-> -
-> -Do you know what the migration state was at that point? Was it
-> -MIGRATION_STATUS_CANCELLING?
-> -I'm thinking perhaps we should stop 'cont' from continuing while migration is
-> -in
-> -MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - so
-> -that
-> -perhaps libvirt could avoid sending the 'cont' until then?
-No, there's no event, though I thought libvirt would poll until -"query-migrate" returns the cancelled state. Of course that is a small -consolation, because a segfault is unacceptable. - -One possibility is to suspend the monitor in qmp_migrate_cancel and -resume it (with add_migration_state_change_notifier) when we hit the -CANCELLED state. I'm not sure what the latency would be between the end -of migrate_fd_cancel and finally reaching CANCELLED. - -Paolo - -* Paolo Bonzini (address@hidden) wrote: -> -> -> -On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> -> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago that -> -> their were times when run_on_cpu would have to drop the BQL and I worried -> -> about it, -> -> but this is the 1st time I've seen an error due to it. -> -> -> -> Do you know what the migration state was at that point? Was it -> -> MIGRATION_STATUS_CANCELLING? -> -> I'm thinking perhaps we should stop 'cont' from continuing while migration -> -> is in -> -> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - -> -> so that -> -> perhaps libvirt could avoid sending the 'cont' until then? -> -> -No, there's no event, though I thought libvirt would poll until -> -"query-migrate" returns the cancelled state. Of course that is a small -> -consolation, because a segfault is unacceptable. -I think you might get an event if you set the new migrate capability called -'events' on! - -void migrate_set_state(int *state, int old_state, int new_state) -{ - if (atomic_cmpxchg(state, old_state, new_state) == old_state) { - trace_migrate_set_state(new_state); - migrate_generate_event(new_state); - } -} - -static void migrate_generate_event(int new_state) -{ - if (migrate_use_events()) { - qapi_event_send_migration(new_state, &error_abort); - } -} - -That event feature went in sometime after 2.3.0. 
- -> -One possibility is to suspend the monitor in qmp_migrate_cancel and -> -resume it (with add_migration_state_change_notifier) when we hit the -> -CANCELLED state. I'm not sure what the latency would be between the end -> -of migrate_fd_cancel and finally reaching CANCELLED. -I don't like suspending monitors; it can potentially take quite a significant -time to do a cancel. -How about making 'cont' fail if we're in CANCELLING? - -I'd really love to see the 'run_on_cpu' being more careful about the BQL; -we really need all of the rest of the devices to stay quiesced at times. - -Dave - -> -Paolo --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -> -* Paolo Bonzini (address@hidden) wrote: -> -> -> -> -> -> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> ->> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago that -> ->> their were times when run_on_cpu would have to drop the BQL and I worried -> ->> about it, -> ->> but this is the 1st time I've seen an error due to it. -> ->> -> ->> Do you know what the migration state was at that point? Was it -> ->> MIGRATION_STATUS_CANCELLING? -> ->> I'm thinking perhaps we should stop 'cont' from continuing while migration -> ->> is in -> ->> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - -> ->> so that -> ->> perhaps libvirt could avoid sending the 'cont' until then? -> -> -> -> No, there's no event, though I thought libvirt would poll until -> -> "query-migrate" returns the cancelled state. Of course that is a small -> -> consolation, because a segfault is unacceptable. -> -> -I think you might get an event if you set the new migrate capability called -> -'events' on! 
-> -> -void migrate_set_state(int *state, int old_state, int new_state) -> -{ -> -if (atomic_cmpxchg(state, old_state, new_state) == old_state) { -> -trace_migrate_set_state(new_state); -> -migrate_generate_event(new_state); -> -} -> -} -> -> -static void migrate_generate_event(int new_state) -> -{ -> -if (migrate_use_events()) { -> -qapi_event_send_migration(new_state, &error_abort); -> -} -> -} -> -> -That event feature went in sometime after 2.3.0. -> -> -> One possibility is to suspend the monitor in qmp_migrate_cancel and -> -> resume it (with add_migration_state_change_notifier) when we hit the -> -> CANCELLED state. I'm not sure what the latency would be between the end -> -> of migrate_fd_cancel and finally reaching CANCELLED. -> -> -I don't like suspending monitors; it can potentially take quite a significant -> -time to do a cancel. -> -How about making 'cont' fail if we're in CANCELLING? -Actually I thought that would be the case already (in fact CANCELLING is -internal only; the outside world sees it as "active" in query-migrate). - -Lei, what is the runstate? (That is, why did cont succeed at all)? - -Paolo - -> -I'd really love to see the 'run_on_cpu' being more careful about the BQL; -> -we really need all of the rest of the devices to stay quiesced at times. -That's not really possible, because of how condition variables work. :( - -* Paolo Bonzini (address@hidden) wrote: -> -> -> -On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -> -> * Paolo Bonzini (address@hidden) wrote: -> ->> -> ->> -> ->> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> ->>> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago -> ->>> that -> ->>> their were times when run_on_cpu would have to drop the BQL and I worried -> ->>> about it, -> ->>> but this is the 1st time I've seen an error due to it. -> ->>> -> ->>> Do you know what the migration state was at that point? Was it -> ->>> MIGRATION_STATUS_CANCELLING? 
-> ->>> I'm thinking perhaps we should stop 'cont' from continuing while -> ->>> migration is in -> ->>> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - -> ->>> so that -> ->>> perhaps libvirt could avoid sending the 'cont' until then? -> ->> -> ->> No, there's no event, though I thought libvirt would poll until -> ->> "query-migrate" returns the cancelled state. Of course that is a small -> ->> consolation, because a segfault is unacceptable. -> -> -> -> I think you might get an event if you set the new migrate capability called -> -> 'events' on! -> -> -> -> void migrate_set_state(int *state, int old_state, int new_state) -> -> { -> -> if (atomic_cmpxchg(state, old_state, new_state) == old_state) { -> -> trace_migrate_set_state(new_state); -> -> migrate_generate_event(new_state); -> -> } -> -> } -> -> -> -> static void migrate_generate_event(int new_state) -> -> { -> -> if (migrate_use_events()) { -> -> qapi_event_send_migration(new_state, &error_abort); -> -> } -> -> } -> -> -> -> That event feature went in sometime after 2.3.0. -> -> -> ->> One possibility is to suspend the monitor in qmp_migrate_cancel and -> ->> resume it (with add_migration_state_change_notifier) when we hit the -> ->> CANCELLED state. I'm not sure what the latency would be between the end -> ->> of migrate_fd_cancel and finally reaching CANCELLED. -> -> -> -> I don't like suspending monitors; it can potentially take quite a -> -> significant -> -> time to do a cancel. -> -> How about making 'cont' fail if we're in CANCELLING? -> -> -Actually I thought that would be the case already (in fact CANCELLING is -> -internal only; the outside world sees it as "active" in query-migrate). -> -> -Lei, what is the runstate? (That is, why did cont succeed at all)? -I suspect it's RUN_STATE_FINISH_MIGRATE - we set that before we do the device -save, and that's what we get at the end of a migrate and it's legal to restart -from there. 
- -> -Paolo -> -> -> I'd really love to see the 'run_on_cpu' being more careful about the BQL; -> -> we really need all of the rest of the devices to stay quiesced at times. -> -> -That's not really possible, because of how condition variables work. :( -*Really* we need to find a solution to that - there's probably lots of -other things that can spring up in that small window other than the -'cont'. - -Dave - --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On 03/03/2017 14:26, Dr. David Alan Gilbert wrote: -> -* Paolo Bonzini (address@hidden) wrote: -> -> -> -> -> -> On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -> ->> * Paolo Bonzini (address@hidden) wrote: -> ->>> -> ->>> -> ->>> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> ->>>> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago -> ->>>> that -> ->>>> their were times when run_on_cpu would have to drop the BQL and I worried -> ->>>> about it, -> ->>>> but this is the 1st time I've seen an error due to it. -> ->>>> -> ->>>> Do you know what the migration state was at that point? Was it -> ->>>> MIGRATION_STATUS_CANCELLING? -> ->>>> I'm thinking perhaps we should stop 'cont' from continuing while -> ->>>> migration is in -> ->>>> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED - -> ->>>> so that -> ->>>> perhaps libvirt could avoid sending the 'cont' until then? -> ->>> -> ->>> No, there's no event, though I thought libvirt would poll until -> ->>> "query-migrate" returns the cancelled state. Of course that is a small -> ->>> consolation, because a segfault is unacceptable. -> ->> -> ->> I think you might get an event if you set the new migrate capability called -> ->> 'events' on! 
-> ->> -> ->> void migrate_set_state(int *state, int old_state, int new_state) -> ->> { -> ->> if (atomic_cmpxchg(state, old_state, new_state) == old_state) { -> ->> trace_migrate_set_state(new_state); -> ->> migrate_generate_event(new_state); -> ->> } -> ->> } -> ->> -> ->> static void migrate_generate_event(int new_state) -> ->> { -> ->> if (migrate_use_events()) { -> ->> qapi_event_send_migration(new_state, &error_abort); -> ->> } -> ->> } -> ->> -> ->> That event feature went in sometime after 2.3.0. -> ->> -> ->>> One possibility is to suspend the monitor in qmp_migrate_cancel and -> ->>> resume it (with add_migration_state_change_notifier) when we hit the -> ->>> CANCELLED state. I'm not sure what the latency would be between the end -> ->>> of migrate_fd_cancel and finally reaching CANCELLED. -> ->> -> ->> I don't like suspending monitors; it can potentially take quite a -> ->> significant -> ->> time to do a cancel. -> ->> How about making 'cont' fail if we're in CANCELLING? -> -> -> -> Actually I thought that would be the case already (in fact CANCELLING is -> -> internal only; the outside world sees it as "active" in query-migrate). -> -> -> -> Lei, what is the runstate? (That is, why did cont succeed at all)? -> -> -I suspect it's RUN_STATE_FINISH_MIGRATE - we set that before we do the device -> -save, and that's what we get at the end of a migrate and it's legal to restart -> -from there. -Yeah, but I think we get there at the end of a failed migrate only. So -perhaps we can introduce a new state RUN_STATE_FAILED_MIGRATE and forbid -"cont" from finish-migrate (only allow it from failed-migrate)? - -Paolo - -> -> Paolo -> -> -> ->> I'd really love to see the 'run_on_cpu' being more careful about the BQL; -> ->> we really need all of the rest of the devices to stay quiesced at times. -> -> -> -> That's not really possible, because of how condition variables work. 
:( -> -> -*Really* we need to find a solution to that - there's probably lots of -> -other things that can spring up in that small window other than the -> -'cont'. -> -> -Dave -> -> --- -> -Dr. David Alan Gilbert / address@hidden / Manchester, UK -> - -Hi Paolo, - -On Fri, Mar 3, 2017 at 9:33 PM, Paolo Bonzini <address@hidden> wrote: - -> -> -> -On 03/03/2017 14:26, Dr. David Alan Gilbert wrote: -> -> * Paolo Bonzini (address@hidden) wrote: -> ->> -> ->> -> ->> On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -> ->>> * Paolo Bonzini (address@hidden) wrote: -> ->>>> -> ->>>> -> ->>>> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> ->>>>> Ouch that's pretty nasty; I remember Paolo explaining to me a while -> -ago that -> ->>>>> their were times when run_on_cpu would have to drop the BQL and I -> -worried about it, -> ->>>>> but this is the 1st time I've seen an error due to it. -> ->>>>> -> ->>>>> Do you know what the migration state was at that point? Was it -> -MIGRATION_STATUS_CANCELLING? -> ->>>>> I'm thinking perhaps we should stop 'cont' from continuing while -> -migration is in -> ->>>>> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit -> -CANCELLED - so that -> ->>>>> perhaps libvirt could avoid sending the 'cont' until then? -> ->>>> -> ->>>> No, there's no event, though I thought libvirt would poll until -> ->>>> "query-migrate" returns the cancelled state. Of course that is a -> -small -> ->>>> consolation, because a segfault is unacceptable. -> ->>> -> ->>> I think you might get an event if you set the new migrate capability -> -called -> ->>> 'events' on! 
-> ->>> -
-> ->>> void migrate_set_state(int *state, int old_state, int new_state) -
-> ->>> { -
-> ->>> if (atomic_cmpxchg(state, old_state, new_state) == old_state) { -
-> ->>> trace_migrate_set_state(new_state); -
-> ->>> migrate_generate_event(new_state); -
-> ->>> } -
-> ->>> } -
-> ->>> -
-> ->>> static void migrate_generate_event(int new_state) -
-> ->>> { -
-> ->>> if (migrate_use_events()) { -
-> ->>> qapi_event_send_migration(new_state, &error_abort); -
-> ->>> } -
-> ->>> } -
-> ->>> -
-> ->>> That event feature went in sometime after 2.3.0. -
-> ->>> -
-> ->>>> One possibility is to suspend the monitor in qmp_migrate_cancel and -
-> ->>>> resume it (with add_migration_state_change_notifier) when we hit the -
-> ->>>> CANCELLED state. I'm not sure what the latency would be between the -
-end -
-> ->>>> of migrate_fd_cancel and finally reaching CANCELLED. -
-> ->>> -
-> ->>> I don't like suspending monitors; it can potentially take quite a -
-significant -
-> ->>> time to do a cancel. -
-> ->>> How about making 'cont' fail if we're in CANCELLING? -
-> ->> -
-> ->> Actually I thought that would be the case already (in fact CANCELLING is -
-> ->> internal only; the outside world sees it as "active" in query-migrate). -
-> ->> -
-> ->> Lei, what is the runstate? (That is, why did cont succeed at all)? -
-> -> -
-> -> I suspect it's RUN_STATE_FINISH_MIGRATE - we set that before we do the -
-device -
-> -> save, and that's what we get at the end of a migrate and it's legal to -
-restart -
-> -> from there. -
-> -
-Yeah, but I think we get there at the end of a failed migrate only. So -
-perhaps we can introduce a new state RUN_STATE_FAILED_MIGRATE -
I think we do not need to introduce a new state here. If we hit 'cont' while -
the run state is RUN_STATE_FINISH_MIGRATE, we can assume that -
migration failed, because 'RUN_STATE_FINISH_MIGRATE' only exists on the -
source side: it means we are finishing migration, and a 'cont' in the meantime -
indicates that we are rolling back; otherwise the source side would have been -
destroyed. 
- - -
-> -and forbid -
-> -"cont" from finish-migrate (only allow it from failed-migrate)? -
-> -The problem with forbidding 'cont' here is that it will result in a failed -
migration and the source -
-side will remain paused. We actually expect a usable guest when we roll back. -
-Is there a way to kill the migration thread while we are in the main thread? If -
-there is, we -
-could do the following to solve this problem: -
-1. 'cont' received during runstate RUN_STATE_FINISH_MIGRATE -
-2. kill the migration thread -
-3. vm_start() -
- -
-But this only solves the 'cont' problem. As Dave said before, other things could -
-happen during the small window while we are finishing migration; that's -
-what I was worried about... -
- -
- -
-> -Paolo -
-> -
-> ->> Paolo -
-> ->> -
-> ->>> I'd really love to see the 'run_on_cpu' being more careful about the -
-> -BQL; -
-> ->>> we really need all of the rest of the devices to stay quiesced at -
-> -times. -
-> ->> -
-> ->> That's not really possible, because of how condition variables work. :( -
-> -> -
-> -> *Really* we need to find a solution to that - there's probably lots of -
-> -> other things that can spring up in that small window other than the -
-> -> 'cont'. -
-> -> -
-> -> Dave -
-> -> -
-> -> -- -
-> -> Dr. David Alan Gilbert / address@hidden / Manchester, UK -
-> -> -
-> -> -
- -
* Paolo Bonzini (address@hidden) wrote: -
-> -
-> -
-On 03/03/2017 14:26, Dr. David Alan Gilbert wrote: -
-> -> * Paolo Bonzini (address@hidden) wrote: -
-> ->> -
-> ->> -
-> ->> On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -
-> ->>> * Paolo Bonzini (address@hidden) wrote: -
-> ->>>> -
-> ->>>> -
-> ->>>> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -
-> ->>>>> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago -
-> ->>>>> that -
-> ->>>>> their were times when run_on_cpu would have to drop the BQL and I -
-> ->>>>> worried about it, -
-> ->>>>> but this is the 1st time I've seen an error due to it. -
-> ->>>>> -
-> ->>>>> Do you know what the migration state was at that point? Was it -
-> ->>>>> MIGRATION_STATUS_CANCELLING? 
-> ->>>>> I'm thinking perhaps we should stop 'cont' from continuing while -> ->>>>> migration is in -> ->>>>> MIGRATION_STATUS_CANCELLING. Do we send an event when we hit CANCELLED -> ->>>>> - so that -> ->>>>> perhaps libvirt could avoid sending the 'cont' until then? -> ->>>> -> ->>>> No, there's no event, though I thought libvirt would poll until -> ->>>> "query-migrate" returns the cancelled state. Of course that is a small -> ->>>> consolation, because a segfault is unacceptable. -> ->>> -> ->>> I think you might get an event if you set the new migrate capability -> ->>> called -> ->>> 'events' on! -> ->>> -> ->>> void migrate_set_state(int *state, int old_state, int new_state) -> ->>> { -> ->>> if (atomic_cmpxchg(state, old_state, new_state) == old_state) { -> ->>> trace_migrate_set_state(new_state); -> ->>> migrate_generate_event(new_state); -> ->>> } -> ->>> } -> ->>> -> ->>> static void migrate_generate_event(int new_state) -> ->>> { -> ->>> if (migrate_use_events()) { -> ->>> qapi_event_send_migration(new_state, &error_abort); -> ->>> } -> ->>> } -> ->>> -> ->>> That event feature went in sometime after 2.3.0. -> ->>> -> ->>>> One possibility is to suspend the monitor in qmp_migrate_cancel and -> ->>>> resume it (with add_migration_state_change_notifier) when we hit the -> ->>>> CANCELLED state. I'm not sure what the latency would be between the end -> ->>>> of migrate_fd_cancel and finally reaching CANCELLED. -> ->>> -> ->>> I don't like suspending monitors; it can potentially take quite a -> ->>> significant -> ->>> time to do a cancel. -> ->>> How about making 'cont' fail if we're in CANCELLING? -> ->> -> ->> Actually I thought that would be the case already (in fact CANCELLING is -> ->> internal only; the outside world sees it as "active" in query-migrate). -> ->> -> ->> Lei, what is the runstate? (That is, why did cont succeed at all)? 
-> -> -> -> I suspect it's RUN_STATE_FINISH_MIGRATE - we set that before we do the -> -> device -> -> save, and that's what we get at the end of a migrate and it's legal to -> -> restart -> -> from there. -> -> -Yeah, but I think we get there at the end of a failed migrate only. So -> -perhaps we can introduce a new state RUN_STATE_FAILED_MIGRATE and forbid -> -"cont" from finish-migrate (only allow it from failed-migrate)? -OK, I was wrong in my previous statement; we actually go -FINISH_MIGRATE->POSTMIGRATE -so no new state is needed; you shouldn't be restarting the cpu in -FINISH_MIGRATE. - -My preference is to get libvirt to wait for the transition to POSTMIGRATE before -it issues the 'cont'. I'd rather not block the monitor with 'cont' but I'm -not sure how we'd cleanly make cont fail without breaking existing libvirts -that usually don't hit this race. (cc'ing in Jiri). - -Dave - -> -Paolo -> -> ->> Paolo -> ->> -> ->>> I'd really love to see the 'run_on_cpu' being more careful about the BQL; -> ->>> we really need all of the rest of the devices to stay quiesced at times. -> ->> -> ->> That's not really possible, because of how condition variables work. :( -> -> -> -> *Really* we need to find a solution to that - there's probably lots of -> -> other things that can spring up in that small window other than the -> -> 'cont'. -> -> -> -> Dave -> -> -> -> -- -> -> Dr. David Alan Gilbert / address@hidden / Manchester, UK -> -> --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -Hi Dave, - -On Fri, Mar 3, 2017 at 9:26 PM, Dr. David Alan Gilbert <address@hidden> -wrote: - -> -* Paolo Bonzini (address@hidden) wrote: -> -> -> -> -> -> On 03/03/2017 14:11, Dr. David Alan Gilbert wrote: -> -> > * Paolo Bonzini (address@hidden) wrote: -> -> >> -> -> >> -> -> >> On 03/03/2017 13:00, Dr. David Alan Gilbert wrote: -> -... -> -> > That event feature went in sometime after 2.3.0. 
-> -> > -> -> >> One possibility is to suspend the monitor in qmp_migrate_cancel and -> -> >> resume it (with add_migration_state_change_notifier) when we hit the -> -> >> CANCELLED state. I'm not sure what the latency would be between the -> -end -> -> >> of migrate_fd_cancel and finally reaching CANCELLED. -> -> > -> -> > I don't like suspending monitors; it can potentially take quite a -> -significant -> -> > time to do a cancel. -> -> > How about making 'cont' fail if we're in CANCELLING? -> -> -> -> Actually I thought that would be the case already (in fact CANCELLING is -> -> internal only; the outside world sees it as "active" in query-migrate). -> -> -> -> Lei, what is the runstate? (That is, why did cont succeed at all)? -> -> -I suspect it's RUN_STATE_FINISH_MIGRATE - we set that before we do the -> -device -> -It is RUN_STATE_FINISH_MIGRATE. - - -> -save, and that's what we get at the end of a migrate and it's legal to -> -restart -> -from there. -> -> -> Paolo -> -> -> -> > I'd really love to see the 'run_on_cpu' being more careful about the -> -BQL; -> -> > we really need all of the rest of the devices to stay quiesced at -> -times. -> -> -> -> That's not really possible, because of how condition variables work. :( -> -> -*Really* we need to find a solution to that - there's probably lots of -> -other things that can spring up in that small window other than the -> -'cont'. -> -This is what I was worry about. Not only sync_cpu_state() will call -run_on_cpu() -but also vm_stop_force_state() will, both of them did hit the small windows -in our -test. - - -> -> -Dave -> -> --- -> -Dr. 
David Alan Gilbert / address@hidden / Manchester, UK -
-> -
-> -
- -

diff --git a/results/classifier/02/other/17743720 b/results/classifier/02/other/17743720
deleted file mode 100644
index 84c4786cc..000000000
--- a/results/classifier/02/other/17743720
+++ /dev/null
@@ -1,772 +0,0 @@
-other: 0.984
-instruction: 0.966
-semantic: 0.962
-mistranslation: 0.959
-boot: 0.945
-
-[Qemu-devel] [BUG] living migrate vm pause forever
-
-Sometimes a live migration leaves the VM paused forever and the migration job
-stalls, but only with very small probability; I can't reproduce it.
-QEMU waits on a semaphore for libvirt to send migrate-continue, while libvirt
-waits on a semaphore for QEMU to report that the VM has paused.
-
-follow stack:
-qemu:
-Thread 6 (Thread 0x7f50445f3700 (LWP 18120)):
-#0 0x00007f504b84d670 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
-#1 0x00005574eda1e164 in qemu_sem_wait (sem=sem@entry=0x5574ef6930e0) at
-qemu-2.12/util/qemu-thread-posix.c:322
-#2 0x00005574ed8dd72e in migration_maybe_pause (s=0x5574ef692f50,
-current_active_state=0x7f50445f2ae4, new_state=10)
- at qemu-2.12/migration/migration.c:2106
-#3 0x00005574ed8df51a in migration_completion (s=0x5574ef692f50) at
-qemu-2.12/migration/migration.c:2137
-#4 migration_iteration_run (s=0x5574ef692f50) at
-qemu-2.12/migration/migration.c:2311
-#5 migration_thread (opaque=0x5574ef692f50)
-at qemu-2.12/migration/migration.c:2415
-#6 0x00007f504b847184 in start_thread () from
-/lib/x86_64-linux-gnu/libpthread.so.0
-#7 0x00007f504b574bed in clone () from /lib/x86_64-linux-gnu/libc.so.6
-
-libvirt:
-Thread 95 (Thread 0x7fdb82ffd700 (LWP 28775)):
-#0 0x00007fdd177dc404 in pthread_cond_wait@@GLIBC_2.3.2 () from
-/lib/x86_64-linux-gnu/libpthread.so.0
-#1 0x00007fdd198c3b07 in virCondWait (c=0x7fdbc4003000, m=0x7fdbc4002f30) at
-../../../src/util/virthread.c:252
-#2 0x00007fdd198f36d2 in virDomainObjWait (vm=0x7fdbc4002f20) at
-../../../src/conf/domain_conf.c:3303
-#3 0x00007fdd09ffaa44 in qemuMigrationRun (driver=0x7fdd000037b0,
-vm=0x7fdbc4002f20, persist_xml=0x0,
- 
cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss -</hostname>\n -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -flags=777, - resource=0, spec=0x7fdb82ffc670, dconn=0x0, graphicsuri=0x0, -nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, -migParams=0x7fdb82ffc900) - at ../../../src/qemu/qemu_migration.c:3937 -#4 0x00007fdd09ffb26a in doNativeMigrate (driver=0x7fdd000037b0, -vm=0x7fdbc4002f20, persist_xml=0x0, uri=0x7fdb780073a0 -"tcp://172.16.202.17:49152", - cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss</hostname>\n - <hos---Type <return> to continue, or q <return> to quit--- -tuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -flags=777, - resource=0, dconn=0x0, graphicsuri=0x0, nmigrate_disks=0, -migrate_disks=0x0, compression=0x7fdb78007990, migParams=0x7fdb82ffc900) - at ../../../src/qemu/qemu_migration.c:4118 -#5 0x00007fdd09ffd808 in qemuMigrationPerformPhase (driver=0x7fdd000037b0, -conn=0x7fdb500205d0, vm=0x7fdbc4002f20, persist_xml=0x0, - uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0, -nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, -migParams=0x7fdb82ffc900, - cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss</hostname>\n - <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -flags=777, - resource=0) at ../../../src/qemu/qemu_migration.c:5030 -#6 0x00007fdd09ffdbb5 in qemuMigrationPerform (driver=0x7fdd000037b0, -conn=0x7fdb500205d0, vm=0x7fdbc4002f20, 
xmlin=0x0, persist_xml=0x0, -dconnuri=0x0, - uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0, -listenAddress=0x0, nmigrate_disks=0, migrate_disks=0x0, nbdPort=0, -compression=0x7fdb78007990, - migParams=0x7fdb82ffc900, - cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss</hostname>\n - <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -flags=777, - dname=0x0, resource=0, v3proto=true) at -../../../src/qemu/qemu_migration.c:5124 -#7 0x00007fdd0a054725 in qemuDomainMigratePerform3 (dom=0x7fdb78007b00, -xmlin=0x0, - cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss</hostname>\n - <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -dconnuri=0x0, - uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, dname=0x0, -resource=0) at ../../../src/qemu/qemu_driver.c:12996 -#8 0x00007fdd199ad0f0 in virDomainMigratePerform3 (domain=0x7fdb78007b00, -xmlin=0x0, - cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss</hostname>\n - <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -dconnuri=0x0, - uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, dname=0x0, -bandwidth=0) at ../../../src/libvirt-domain.c:4698 -#9 0x000055d13923a939 in remoteDispatchDomainMigratePerform3 -(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -rerr=0x7fdb82ffcbc0, - args=0x7fdb7800b220, ret=0x7fdb78021e90) at ../../../daemon/remote.c:4528 -#10 0x000055d13921a043 in remoteDispatchDomainMigratePerform3Helper 
-(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -
rerr=0x7fdb82ffcbc0, -
- args=0x7fdb7800b220, ret=0x7fdb78021e90) at -
../../../daemon/remote_dispatch.h:7944 -
-#11 0x00007fdd19a260b4 in virNetServerProgramDispatchCall (prog=0x55d13af98b50, -
server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620) -
- at ../../../src/rpc/virnetserverprogram.c:436 -
-#12 0x00007fdd19a25c17 in virNetServerProgramDispatch (prog=0x55d13af98b50, -
server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620) -
- at ../../../src/rpc/virnetserverprogram.c:307 -
-#13 0x000055d13925933b in virNetServerProcessMsg (srv=0x55d13af90e60, -
client=0x55d13b0156f0, prog=0x55d13af98b50, msg=0x55d13afbf620) -
- at ../../../src/rpc/virnetserver.c:148 -
-------------------------------------------------------------------------------------------------------------------------------------- -
-This e-mail and its attachments contain confidential information from New H3C, -
which is -
-intended only for the person or entity whose address is listed above. Any use -
of the -
-information contained herein in any way (including, but not limited to, total -
or partial -
-disclosure, reproduction, or dissemination) by persons other than the intended -
recipient(s) is prohibited. If you receive this e-mail in error, please notify -
the sender -
-by phone or email immediately and delete it! -
- -
* Yuchen (address@hidden) wrote: -
-> -
-Sometimes, living migrate vm pause forever, migrate job stop, but very small -
-> -
-probability, I can't reproduce. -
-> -
-qemu wait semaphore from libvirt send migrate continue, however libvirt wait -
-> -
-semaphore from qemu send vm pause. -
Hi, -
- I've copied in Jiri Denemark from libvirt. -
Can you confirm exactly which qemu and libvirt versions you're using -
please. 
- -> -follow stack: -> -qemu: -> -Thread 6 (Thread 0x7f50445f3700 (LWP 18120)): -> -#0 0x00007f504b84d670 in sem_wait () from -> -/lib/x86_64-linux-gnu/libpthread.so.0 -> -#1 0x00005574eda1e164 in qemu_sem_wait (sem=sem@entry=0x5574ef6930e0) at -> -qemu-2.12/util/qemu-thread-posix.c:322 -> -#2 0x00005574ed8dd72e in migration_maybe_pause (s=0x5574ef692f50, -> -current_active_state=0x7f50445f2ae4, new_state=10) -> -at qemu-2.12/migration/migration.c:2106 -> -#3 0x00005574ed8df51a in migration_completion (s=0x5574ef692f50) at -> -qemu-2.12/migration/migration.c:2137 -> -#4 migration_iteration_run (s=0x5574ef692f50) at -> -qemu-2.12/migration/migration.c:2311 -> -#5 migration_thread (opaque=0x5574ef692f50) -> -atqemu-2.12/migration/migration.c:2415 -> -#6 0x00007f504b847184 in start_thread () from -> -/lib/x86_64-linux-gnu/libpthread.so.0 -> -#7 0x00007f504b574bed in clone () from /lib/x86_64-linux-gnu/libc.so.6 -In migration_maybe_pause we have: - - migrate_set_state(&s->state, *current_active_state, - MIGRATION_STATUS_PRE_SWITCHOVER); - qemu_sem_wait(&s->pause_sem); - migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER, - new_state); - -the line numbers don't match my 2.12.0 checkout; so I guess that it's -that qemu_sem_wait it's stuck at. - -QEMU must have sent the switch to PRE_SWITCHOVER and that should have -sent an event to libvirt, and libvirt should notice that - I'm -not sure how to tell whether libvirt has seen that event yet or not? 
- -Dave - -> -libvirt: -> -Thread 95 (Thread 0x7fdb82ffd700 (LWP 28775)): -> -#0 0x00007fdd177dc404 in pthread_cond_wait@@GLIBC_2.3.2 () from -> -/lib/x86_64-linux-gnu/libpthread.so.0 -> -#1 0x00007fdd198c3b07 in virCondWait (c=0x7fdbc4003000, m=0x7fdbc4002f30) at -> -../../../src/util/virthread.c:252 -> -#2 0x00007fdd198f36d2 in virDomainObjWait (vm=0x7fdbc4002f20) at -> -../../../src/conf/domain_conf.c:3303 -> -#3 0x00007fdd09ffaa44 in qemuMigrationRun (driver=0x7fdd000037b0, -> -vm=0x7fdbc4002f20, persist_xml=0x0, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss -> -</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -flags=777, -> -resource=0, spec=0x7fdb82ffc670, dconn=0x0, graphicsuri=0x0, -> -nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, -> -migParams=0x7fdb82ffc900) -> -at ../../../src/qemu/qemu_migration.c:3937 -> -#4 0x00007fdd09ffb26a in doNativeMigrate (driver=0x7fdd000037b0, -> -vm=0x7fdbc4002f20, persist_xml=0x0, uri=0x7fdb780073a0 -> -"tcp://172.16.202.17:49152", -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -> -<hostname>mss</hostname>\n <hos---Type <return> to continue, or q <return> -> -to quit--- -> -tuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -flags=777, -> -resource=0, dconn=0x0, graphicsuri=0x0, nmigrate_disks=0, -> -migrate_disks=0x0, compression=0x7fdb78007990, migParams=0x7fdb82ffc900) -> -at ../../../src/qemu/qemu_migration.c:4118 -> -#5 0x00007fdd09ffd808 in qemuMigrationPerformPhase (driver=0x7fdd000037b0, -> -conn=0x7fdb500205d0, vm=0x7fdbc4002f20, persist_xml=0x0, -> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", 
graphicsuri=0x0, -> -nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, -> -migParams=0x7fdb82ffc900, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -> -<hostname>mss</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -flags=777, -> -resource=0) at ../../../src/qemu/qemu_migration.c:5030 -> -#6 0x00007fdd09ffdbb5 in qemuMigrationPerform (driver=0x7fdd000037b0, -> -conn=0x7fdb500205d0, vm=0x7fdbc4002f20, xmlin=0x0, persist_xml=0x0, -> -dconnuri=0x0, -> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0, -> -listenAddress=0x0, nmigrate_disks=0, migrate_disks=0x0, nbdPort=0, -> -compression=0x7fdb78007990, -> -migParams=0x7fdb82ffc900, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -> -<hostname>mss</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -flags=777, -> -dname=0x0, resource=0, v3proto=true) at -> -../../../src/qemu/qemu_migration.c:5124 -> -#7 0x00007fdd0a054725 in qemuDomainMigratePerform3 (dom=0x7fdb78007b00, -> -xmlin=0x0, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -> -<hostname>mss</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -dconnuri=0x0, -> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, dname=0x0, -> -resource=0) at ../../../src/qemu/qemu_driver.c:12996 -> -#8 0x00007fdd199ad0f0 in virDomainMigratePerform3 (domain=0x7fdb78007b00, -> -xmlin=0x0, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n 
<name>mss-pl_652</name>\n -
-> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -
-> -<hostname>mss</hostname>\n -
-> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -
-> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -
-> -dconnuri=0x0, -
-> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, dname=0x0, -
-> -bandwidth=0) at ../../../src/libvirt-domain.c:4698 -
-> -#9 0x000055d13923a939 in remoteDispatchDomainMigratePerform3 -
-> -(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -
-> -rerr=0x7fdb82ffcbc0, -
-> -args=0x7fdb7800b220, ret=0x7fdb78021e90) at ../../../daemon/remote.c:4528 -
-> -#10 0x000055d13921a043 in remoteDispatchDomainMigratePerform3Helper -
-> -(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -
-> -rerr=0x7fdb82ffcbc0, -
-> -args=0x7fdb7800b220, ret=0x7fdb78021e90) at -
-> -../../../daemon/remote_dispatch.h:7944 -
-> -#11 0x00007fdd19a260b4 in virNetServerProgramDispatchCall -
-> -(prog=0x55d13af98b50, server=0x55d13af90e60, client=0x55d13b0156f0, -
-> -msg=0x55d13afbf620) -
-> -at ../../../src/rpc/virnetserverprogram.c:436 -
-> -#12 0x00007fdd19a25c17 in virNetServerProgramDispatch (prog=0x55d13af98b50, -
-> -server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620) -
-> -at ../../../src/rpc/virnetserverprogram.c:307 -
-> -#13 0x000055d13925933b in virNetServerProcessMsg (srv=0x55d13af90e60, -
-> -client=0x55d13b0156f0, prog=0x55d13af98b50, msg=0x55d13afbf620) -
-> -at ../../../src/rpc/virnetserver.c:148 -
-> -------------------------------------------------------------------------------------------------------------------------------------- -
-> -This e-mail and its attachments contain confidential information from New -
-> -H3C, which is -
-> -intended only for the 
person or entity whose address is listed above. Any use
->
-of the
->
-information contained herein in any way (including, but not limited to, total
->
-or partial
->
-disclosure, reproduction, or dissemination) by persons other than the intended
->
-recipient(s) is prohibited. If you receive this e-mail in error, please
->
-notify the sender
->
-by phone or email immediately and delete it!
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-In migration_maybe_pause we have:
-
-    migrate_set_state(&s->state, *current_active_state,
-                      MIGRATION_STATUS_PRE_SWITCHOVER);
-    qemu_sem_wait(&s->pause_sem);
-    migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER,
-                      new_state);
-
-the line numbers don't match my 2.12.0 checkout; so I guess that it's that
-qemu_sem_wait it's stuck at.
-
-QEMU must have sent the switch to PRE_SWITCHOVER and that should have sent an
-event to libvirt, and libvirt should notice that - I'm not sure how to tell
-whether libvirt has seen that event yet or not?
-
-
-Thank you for your attention.
-Yes, you are right, QEMU waits on the semaphore at this place.
-I use qemu-2.12.1, libvirt-4.0.0.
-Because I added some debug code, the line numbers don't match upstream qemu.
-
------Original Message-----
-From: Dr. David Alan Gilbert [
-mailto:address@hidden
-]
-Sent: 2019-08-21 19:13
-To: yuchen (Cloud) <address@hidden>; address@hidden
-Cc: address@hidden
-Subject: Re: [Qemu-devel] [BUG] living migrate vm pause forever
-
-* Yuchen (address@hidden) wrote:
->
-Sometimes, a live migration pauses the vm forever and the migrate job stops,
->
-but with very small probability; I can't reproduce it.
->
-qemu waits on a semaphore for libvirt to send migrate-continue, while libvirt
->
-waits on a semaphore for qemu to send the vm pause event.
-Hi,
-  I've copied in Jiri Denemark from libvirt.
-Can you confirm exactly which qemu and libvirt versions you're using please.
- -> -follow stack: -> -qemu: -> -Thread 6 (Thread 0x7f50445f3700 (LWP 18120)): -> -#0 0x00007f504b84d670 in sem_wait () from -> -/lib/x86_64-linux-gnu/libpthread.so.0 -> -#1 0x00005574eda1e164 in qemu_sem_wait (sem=sem@entry=0x5574ef6930e0) -> -at qemu-2.12/util/qemu-thread-posix.c:322 -> -#2 0x00005574ed8dd72e in migration_maybe_pause (s=0x5574ef692f50, -> -current_active_state=0x7f50445f2ae4, new_state=10) -> -at qemu-2.12/migration/migration.c:2106 -> -#3 0x00005574ed8df51a in migration_completion (s=0x5574ef692f50) at -> -qemu-2.12/migration/migration.c:2137 -> -#4 migration_iteration_run (s=0x5574ef692f50) at -> -qemu-2.12/migration/migration.c:2311 -> -#5 migration_thread (opaque=0x5574ef692f50) -> -atqemu-2.12/migration/migration.c:2415 -> -#6 0x00007f504b847184 in start_thread () from -> -/lib/x86_64-linux-gnu/libpthread.so.0 -> -#7 0x00007f504b574bed in clone () from -> -/lib/x86_64-linux-gnu/libc.so.6 -In migration_maybe_pause we have: - - migrate_set_state(&s->state, *current_active_state, - MIGRATION_STATUS_PRE_SWITCHOVER); - qemu_sem_wait(&s->pause_sem); - migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER, - new_state); - -the line numbers don't match my 2.12.0 checkout; so I guess that it's that -qemu_sem_wait it's stuck at. - -QEMU must have sent the switch to PRE_SWITCHOVER and that should have sent an -event to libvirt, and libvirt should notice that - I'm not sure how to tell -whether libvirt has seen that event yet or not? 
-
-Dave
-
->
-libvirt:
->
-Thread 95 (Thread 0x7fdb82ffd700 (LWP 28775)):
->
-#0 0x00007fdd177dc404 in pthread_cond_wait@@GLIBC_2.3.2 () from
->
-/lib/x86_64-linux-gnu/libpthread.so.0
->
-#1 0x00007fdd198c3b07 in virCondWait (c=0x7fdbc4003000,
->
-m=0x7fdbc4002f30) at ../../../src/util/virthread.c:252
->
-#2 0x00007fdd198f36d2 in virDomainObjWait (vm=0x7fdbc4002f20) at
->
-../../../src/conf/domain_conf.c:3303
->
-#3 0x00007fdd09ffaa44 in qemuMigrationRun (driver=0x7fdd000037b0,
->
-vm=0x7fdbc4002f20, persist_xml=0x0,
->
-cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n
->
-<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss
->
-</hostname>\n
->
-<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"...,
->
-cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8,
->
-flags=777,
->
-resource=0, spec=0x7fdb82ffc670, dconn=0x0, graphicsuri=0x0,
->
-nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990,
->
-migParams=0x7fdb82ffc900)
->
-at ../../../src/qemu/qemu_migration.c:3937
->
-#4 0x00007fdd09ffb26a in doNativeMigrate (driver=0x7fdd000037b0,
->
-vm=0x7fdbc4002f20, persist_xml=0x0, uri=0x7fdb780073a0
->
-"tcp://172.16.202.17:49152",
->
-cookiein=0x7fdb780084e0 "<qemu-migration>\n
->
-<name>mss-pl_652</name>\n
->
-<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n
->
-<hostname>mss</hostname>\n
->
-<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"...,
->
-cookieinlen=207, cookieout=0x7fdb82ffcad0,
->
-cookieoutlen=0x7fdb82ffcac8, flags=777,
->
-resource=0, dconn=0x0, graphicsuri=0x0, nmigrate_disks=0,
->
-migrate_disks=0x0, compression=0x7fdb78007990, migParams=0x7fdb82ffc900)
->
-at ../../../src/qemu/qemu_migration.c:4118
->
-#5 0x00007fdd09ffd808 in qemuMigrationPerformPhase (driver=0x7fdd000037b0,
->
-conn=0x7fdb500205d0, vm=0x7fdbc4002f20, persist_xml=0x0,
->
-uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0,
->
-nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990,
->
-migParams=0x7fdb82ffc900,
->
-cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n
->
-<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n
->
-<hostname>mss</hostname>\n
->
-<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"...,
->
-cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8,
->
-flags=777,
->
-resource=0) at ../../../src/qemu/qemu_migration.c:5030
->
-#6 0x00007fdd09ffdbb5 in qemuMigrationPerform (driver=0x7fdd000037b0,
->
-conn=0x7fdb500205d0, vm=0x7fdbc4002f20, xmlin=0x0, persist_xml=0x0,
->
-dconnuri=0x0,
->
-uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0,
->
-listenAddress=0x0, nmigrate_disks=0, migrate_disks=0x0, nbdPort=0,
->
-compression=0x7fdb78007990,
->
-migParams=0x7fdb82ffc900,
->
-cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n
->
-<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n
->
-<hostname>mss</hostname>\n
->
-<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"...,
->
-cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8,
->
-flags=777,
->
-dname=0x0, resource=0, v3proto=true) at
->
-../../../src/qemu/qemu_migration.c:5124
->
-#7 0x00007fdd0a054725 in qemuDomainMigratePerform3 (dom=0x7fdb78007b00,
->
-xmlin=0x0,
->
-cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n
->
-<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n
-> -<hostname>mss</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -dconnuri=0x0, -> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, -> -dname=0x0, resource=0) at ../../../src/qemu/qemu_driver.c:12996 -> -#8 0x00007fdd199ad0f0 in virDomainMigratePerform3 (domain=0x7fdb78007b00, -> -xmlin=0x0, -> -cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n -> -<uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n -> -<hostname>mss</hostname>\n -> -<hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., -> -cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, -> -dconnuri=0x0, -> -uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, -> -dname=0x0, bandwidth=0) at ../../../src/libvirt-domain.c:4698 -> -#9 0x000055d13923a939 in remoteDispatchDomainMigratePerform3 -> -(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -> -rerr=0x7fdb82ffcbc0, -> -args=0x7fdb7800b220, ret=0x7fdb78021e90) at -> -../../../daemon/remote.c:4528 -> -#10 0x000055d13921a043 in remoteDispatchDomainMigratePerform3Helper -> -(server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, -> -rerr=0x7fdb82ffcbc0, -> -args=0x7fdb7800b220, ret=0x7fdb78021e90) at -> -../../../daemon/remote_dispatch.h:7944 -> -#11 0x00007fdd19a260b4 in virNetServerProgramDispatchCall -> -(prog=0x55d13af98b50, server=0x55d13af90e60, client=0x55d13b0156f0, -> -msg=0x55d13afbf620) -> -at ../../../src/rpc/virnetserverprogram.c:436 -> -#12 0x00007fdd19a25c17 in virNetServerProgramDispatch (prog=0x55d13af98b50, -> -server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620) -> -at ../../../src/rpc/virnetserverprogram.c:307 -> -#13 0x000055d13925933b in virNetServerProcessMsg (srv=0x55d13af90e60, -> -client=0x55d13b0156f0, prog=0x55d13af98b50, msg=0x55d13afbf620) -> -at ../../../src/rpc/virnetserver.c:148 -> 
-----------------------------------------------------------------------
->
----------------------------------------------------------------
->
-This e-mail and its attachments contain confidential information from
->
-New H3C, which is intended only for the person or entity whose address
->
-is listed above. Any use of the information contained herein in any
->
-way (including, but not limited to, total or partial disclosure,
->
-reproduction, or dissemination) by persons other than the intended
->
-recipient(s) is prohibited. If you receive this e-mail in error,
->
-please notify the sender by phone or email immediately and delete it!
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
diff --git a/results/classifier/02/other/21221931 b/results/classifier/02/other/21221931
deleted file mode 100644
index 930026f48..000000000
--- a/results/classifier/02/other/21221931
+++ /dev/null
@@ -1,329 +0,0 @@
-other: 0.979
-instruction: 0.974
-semantic: 0.967
-boot: 0.947
-mistranslation: 0.933
-
-[BUG] qemu git error with virgl
-
-Hello,
-
-i can't start any system if i use virgl. I get the following error:
-qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion
-`con->gl' failed.
-./and.sh: line 27: 3337167 Aborted                qemu-x86_64 -m 4096
--smp cores=4,sockets=1 -cpu host -machine pc-q35-4.0,accel=kvm -device
-virtio-vga,virgl=on,xres=1280,yres=800 -display sdl,gl=on -device
-intel-hda,id=sound0,msi=on -device
-hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 -device qemu-xhci,id=xhci
--device usb-tablet,bus=xhci.0 -net
-nic,macaddr=52:54:00:12:34:62,model=e1000 -net
-tap,ifname=$INTERFACE,script=no,downscript=no -drive
-file=/media/daten2/image/lineageos.qcow2,if=virtio,index=1,media=disk,cache=none,aio=threads
-Set 'tap3' nonpersistent
-
-i have bisected the issue:
-
-towo:Defiant> git bisect good
-b4e1a342112e50e05b609e857f38c1f2b7aafdc4 is the first bad commit
-commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4
-Author: Paolo Bonzini <pbonzini@redhat.com>
-Date:  Tue Oct 27 08:44:23 2020 -0400
-
-   vl: remove separate preconfig main_loop
-   Move post-preconfig initialization to the x-exit-preconfig. If
-preconfig
-   is not requested, just exit preconfig mode immediately with the QMP
-   command.
-
-   As a result, the preconfig loop will run with accel_setup_post
-   and os_setup_post restrictions (xen_restrict, chroot, etc.)
-   already done.
-
-   Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-   Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- include/sysemu/runstate.h | 1 -
- monitor/qmp-cmds.c       | 9 -----
- softmmu/vl.c             | 95
-++++++++++++++++++++---------------------------
- 3 files changed, 41 insertions(+), 64 deletions(-)
-
-Regards,
-
-Torsten Wohlfarth
-
-Cc'ing Gerd + patch author/reviewer.
-
-On 1/2/21 2:11 PM, Torsten Wohlfarth wrote:
->
-Hello,
->
->
-i can't start any system if i use virgl. I get the following error:
->
->
-qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion
->
-`con->gl' failed.
-> -./and.sh: line 27: 3337167 Aborted                qemu-x86_64 -m 4096 -> --smp cores=4,sockets=1 -cpu host -machine pc-q35-4.0,accel=kvm -device -> -virtio-vga,virgl=on,xres=1280,yres=800 -display sdl,gl=on -device -> -intel-hda,id=sound0,msi=on -device -> -hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 -device qemu-xhci,id=xhci -> --device usb-tablet,bus=xhci.0 -net -> -nic,macaddr=52:54:00:12:34:62,model=e1000 -net -> -tap,ifname=$INTERFACE,script=no,downscript=no -drive -> -file=/media/daten2/image/lineageos.qcow2,if=virtio,index=1,media=disk,cache=none,aio=threads -> -> -Set 'tap3' nonpersistent -> -> -i have bicected the issue: -> -> -towo:Defiant> git bisect good -> -b4e1a342112e50e05b609e857f38c1f2b7aafdc4 is the first bad commit -> -commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 -> -Author: Paolo Bonzini <pbonzini@redhat.com> -> -Date:  Tue Oct 27 08:44:23 2020 -0400 -> -> -   vl: remove separate preconfig main_loop -> -> -   Move post-preconfig initialization to the x-exit-preconfig. If -> -preconfig -> -   is not requested, just exit preconfig mode immediately with the QMP -> -   command. -> -> -   As a result, the preconfig loop will run with accel_setup_post -> -   and os_setup_post restrictions (xen_restrict, chroot, etc.) -> -   already done. -> -> -   Reviewed-by: Igor Mammedov <imammedo@redhat.com> -> -   Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> -> -> - include/sysemu/runstate.h | 1 - -> - monitor/qmp-cmds.c       | 9 ----- -> - softmmu/vl.c             | 95 -> -++++++++++++++++++++--------------------------- -> - 3 files changed, 41 insertions(+), 64 deletions(-) -> -> -Regards, -> -> -Torsten Wohlfarth -> -> -> - -On Sun, 3 Jan 2021 18:28:11 +0100 -Philippe Mathieu-Daudé <philmd@redhat.com> wrote: - -> -Cc'ing Gerd + patch author/reviewer. -> -> -On 1/2/21 2:11 PM, Torsten Wohlfarth wrote: -> -> Hello, -> -> -> -> i can't start any system if i use virgl. 
I get the following error: -> -> -> -> qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion -> -> `con->gl' failed. -Does following fix issue: - [PULL 12/55] vl: initialize displays _after_ exiting preconfiguration - -> -> ./and.sh: line 27: 3337167 Aborted                qemu-x86_64 -m 4096 -> -> -smp cores=4,sockets=1 -cpu host -machine pc-q35-4.0,accel=kvm -device -> -> virtio-vga,virgl=on,xres=1280,yres=800 -display sdl,gl=on -device -> -> intel-hda,id=sound0,msi=on -device -> -> hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 -device qemu-xhci,id=xhci -> -> -device usb-tablet,bus=xhci.0 -net -> -> nic,macaddr=52:54:00:12:34:62,model=e1000 -net -> -> tap,ifname=$INTERFACE,script=no,downscript=no -drive -> -> file=/media/daten2/image/lineageos.qcow2,if=virtio,index=1,media=disk,cache=none,aio=threads -> -> -> -> Set 'tap3' nonpersistent -> -> -> -> i have bicected the issue: -> -> -> -> towo:Defiant> git bisect good -> -> b4e1a342112e50e05b609e857f38c1f2b7aafdc4 is the first bad commit -> -> commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 -> -> Author: Paolo Bonzini <pbonzini@redhat.com> -> -> Date:  Tue Oct 27 08:44:23 2020 -0400 -> -> -> ->    vl: remove separate preconfig main_loop -> -> -> ->    Move post-preconfig initialization to the x-exit-preconfig. If -> -> preconfig -> ->    is not requested, just exit preconfig mode immediately with the QMP -> ->    command. -> -> -> ->    As a result, the preconfig loop will run with accel_setup_post -> ->    and os_setup_post restrictions (xen_restrict, chroot, etc.) -> ->    already done. 
-> -> -> ->    Reviewed-by: Igor Mammedov <imammedo@redhat.com> -> ->    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> -> -> -> ->  include/sysemu/runstate.h | 1 - -> ->  monitor/qmp-cmds.c       | 9 ----- -> ->  softmmu/vl.c             | 95 -> -> ++++++++++++++++++++--------------------------- -> ->  3 files changed, 41 insertions(+), 64 deletions(-) -> -> -> -> Regards, -> -> -> -> Torsten Wohlfarth -> -> -> -> -> -> -> -> - -Hi Igor, - -yes, that fixes my issue. - -Regards, Torsten - -Am 04.01.21 um 19:50 schrieb Igor Mammedov: -On Sun, 3 Jan 2021 18:28:11 +0100 -Philippe Mathieu-Daudé <philmd@redhat.com> wrote: -Cc'ing Gerd + patch author/reviewer. - -On 1/2/21 2:11 PM, Torsten Wohlfarth wrote: -Hello, - -i can't start any system if i use virgl. I get the following error: - -qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion -`con->gl' failed. -Does following fix issue: - [PULL 12/55] vl: initialize displays _after_ exiting preconfiguration -./and.sh: line 27: 3337167 Aborted                qemu-x86_64 -m 4096 --smp cores=4,sockets=1 -cpu host -machine pc-q35-4.0,accel=kvm -device -virtio-vga,virgl=on,xres=1280,yres=800 -display sdl,gl=on -device -intel-hda,id=sound0,msi=on -device -hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 -device qemu-xhci,id=xhci --device usb-tablet,bus=xhci.0 -net -nic,macaddr=52:54:00:12:34:62,model=e1000 -net -tap,ifname=$INTERFACE,script=no,downscript=no -drive -file=/media/daten2/image/lineageos.qcow2,if=virtio,index=1,media=disk,cache=none,aio=threads - -Set 'tap3' nonpersistent - -i have bicected the issue: -towo:Defiant> git bisect good -b4e1a342112e50e05b609e857f38c1f2b7aafdc4 is the first bad commit -commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 -Author: Paolo Bonzini <pbonzini@redhat.com> -Date:  Tue Oct 27 08:44:23 2020 -0400 - -    vl: remove separate preconfig main_loop - -    Move post-preconfig initialization to the x-exit-preconfig. 
If -preconfig -    is not requested, just exit preconfig mode immediately with the QMP -    command. - -    As a result, the preconfig loop will run with accel_setup_post -    and os_setup_post restrictions (xen_restrict, chroot, etc.) -    already done. - -    Reviewed-by: Igor Mammedov <imammedo@redhat.com> -    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> - -  include/sysemu/runstate.h | 1 - -  monitor/qmp-cmds.c       | 9 ----- -  softmmu/vl.c             | 95 -++++++++++++++++++++--------------------------- -  3 files changed, 41 insertions(+), 64 deletions(-) - -Regards, - -Torsten Wohlfarth - diff --git a/results/classifier/02/other/21247035 b/results/classifier/02/other/21247035 deleted file mode 100644 index 14cc03c6d..000000000 --- a/results/classifier/02/other/21247035 +++ /dev/null @@ -1,1322 +0,0 @@ -other: 0.640 -mistranslation: 0.584 -instruction: 0.508 -semantic: 0.374 -boot: 0.345 - -[Qemu-devel] [BUG] I/O thread segfault for QEMU on s390x - -Hi, -I have been noticing some segfaults for QEMU on s390x, and I have been -hitting this issue quite reliably (at least once in 10 runs of a test -case). The qemu version is 2.11.50, and I have systemd created coredumps -when this happens. 
- -Here is a back trace of the segfaulting thread: - - -#0 0x000003ffafed202c in swapcontext () from /lib64/libc.so.6 -#1 0x000002aa355c02ee in qemu_coroutine_new () at -util/coroutine-ucontext.c:164 -#2 0x000002aa355bec34 in qemu_coroutine_create -(address@hidden <blk_aio_read_entry>, -address@hidden) at util/qemu-coroutine.c:76 -#3 0x000002aa35510262 in blk_aio_prwv (blk=0x2aa65fbefa0, -offset=<optimized out>, bytes=<optimized out>, qiov=0x3ffa002a9c0, -address@hidden <blk_aio_read_entry>, flags=0, -cb=0x2aa35340a50 <virtio_blk_rw_complete>, opaque=0x3ffa002a960) at -block/block-backend.c:1299 -#4 0x000002aa35510376 in blk_aio_preadv (blk=<optimized out>, -offset=<optimized out>, qiov=<optimized out>, flags=<optimized out>, -cb=<optimized out>, opaque=0x3ffa002a960) at block/block-backend.c:1392 -#5 0x000002aa3534114e in submit_requests (niov=<optimized out>, -num_reqs=<optimized out>, start=<optimized out>, mrb=<optimized out>, -blk=<optimized out>) at -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:372 -#6 virtio_blk_submit_multireq (blk=<optimized out>, -address@hidden) at -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:402 -#7 0x000002aa353422e0 in virtio_blk_handle_vq (s=0x2aa6611e7d8, -vq=0x3ffb0f5f010) at /usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:620 -#8 0x000002aa3536655a in virtio_queue_notify_aio_vq -(address@hidden) at -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1515 -#9 0x000002aa35366cd6 in virtio_queue_notify_aio_vq (vq=0x3ffb0f5f010) -at /usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1511 -#10 virtio_queue_host_notifier_aio_poll (opaque=0x3ffb0f5f078) at -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:2409 -#11 0x000002aa355a8ba4 in run_poll_handlers_once -(address@hidden) at util/aio-posix.c:497 -#12 0x000002aa355a9b74 in run_poll_handlers (max_ns=<optimized out>, -ctx=0x2aa65f99310) at util/aio-posix.c:534 -#13 try_poll_mode (blocking=true, ctx=0x2aa65f99310) at util/aio-posix.c:562 -#14 aio_poll (ctx=0x2aa65f99310, 
address@hidden) at
-util/aio-posix.c:602
-#15 0x000002aa353d2d0a in iothread_run (opaque=0x2aa65f990f0) at
-iothread.c:60
-#16 0x000003ffb0f07e82 in start_thread () from /lib64/libpthread.so.0
-#17 0x000003ffaff91596 in thread_start () from /lib64/libc.so.6
-
-
-I don't have much knowledge about i/o threads and the block layer code
-in QEMU, so I would like to report to the community about this issue.
-I believe this is very similar to the bug that I reported upstream a couple
-of days ago
-(
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04452.html
).
-
-Any help would be greatly appreciated.
-
-Thanks
-Farhan

-On Thu, Mar 1, 2018 at 10:33 PM, Farhan Ali <address@hidden> wrote:
->
-Hi,
->
->
-I have been noticing some segfaults for QEMU on s390x, and I have been
->
-hitting this issue quite reliably (at least once in 10 runs of a test
-case).
->
-The qemu version is 2.11.50, and I have systemd created coredumps
->
-when this happens.
-Can you describe the test case or suggest how to reproduce it for us?
-
-Fam
-
-On 03/02/2018 01:13 AM, Fam Zheng wrote:
-On Thu, Mar 1, 2018 at 10:33 PM, Farhan Ali <address@hidden> wrote:
-Hi,
-
-I have been noticing some segfaults for QEMU on s390x, and I have been
-hitting this issue quite reliably (at least once in 10 runs of a test case).
-The qemu version is 2.11.50, and I have systemd created coredumps
-when this happens.
-Can you describe the test case or suggest how to reproduce it for us?
-
-Fam
-The test case is with a single guest, running a memory intensive
-workload. The guest has 8 vcpus and 4G of memory.
-Here is the qemu command line, if that helps: - -/usr/bin/qemu-kvm -name guest=sles,debug-threads=on \ --S -object -secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-sles/master-key.aes -\ --machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off \ --m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 \ --object iothread,id=iothread1 -object iothread,id=iothread2 -uuid -b83a596b-3a1a-4ac9-9f3e-d9a4032ee52c \ --display none -no-user-config -nodefaults -chardev -socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-sles/monitor.sock,server,nowait --mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc --no-shutdown \ --boot strict=on -drive -file=/dev/mapper/360050763998b0883980000002400002b,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native --device -virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 --drive -file=/dev/mapper/360050763998b0883980000002800002f,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native --device -virtio-blk-ccw,iothread=iothread2,scsi=off,devno=fe.0.0002,drive=drive-virtio-disk1,id=virtio-disk1 --netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device -virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:38:a6:36:e8:1f,devno=fe.0.0000 --chardev pty,id=charconsole0 -device -sclpconsole,chardev=charconsole0,id=console0 -device -virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -msg timestamp=on -Please let me know if I need to provide any other information. - -Thanks -Farhan - -On Thu, Mar 01, 2018 at 09:33:35AM -0500, Farhan Ali wrote: -> -Hi, -> -> -I have been noticing some segfaults for QEMU on s390x, and I have been -> -hitting this issue quite reliably (at least once in 10 runs of a test case). -> -The qemu version is 2.11.50, and I have systemd created coredumps -> -when this happens. -> -> -Here is a back trace of the segfaulting thread: -The backtrace looks normal. 
- -Please post the QEMU command-line and the details of the segfault (which -memory access faulted?). - -> -#0 0x000003ffafed202c in swapcontext () from /lib64/libc.so.6 -> -#1 0x000002aa355c02ee in qemu_coroutine_new () at -> -util/coroutine-ucontext.c:164 -> -#2 0x000002aa355bec34 in qemu_coroutine_create -> -(address@hidden <blk_aio_read_entry>, -> -address@hidden) at util/qemu-coroutine.c:76 -> -#3 0x000002aa35510262 in blk_aio_prwv (blk=0x2aa65fbefa0, offset=<optimized -> -out>, bytes=<optimized out>, qiov=0x3ffa002a9c0, -> -address@hidden <blk_aio_read_entry>, flags=0, -> -cb=0x2aa35340a50 <virtio_blk_rw_complete>, opaque=0x3ffa002a960) at -> -block/block-backend.c:1299 -> -#4 0x000002aa35510376 in blk_aio_preadv (blk=<optimized out>, -> -offset=<optimized out>, qiov=<optimized out>, flags=<optimized out>, -> -cb=<optimized out>, opaque=0x3ffa002a960) at block/block-backend.c:1392 -> -#5 0x000002aa3534114e in submit_requests (niov=<optimized out>, -> -num_reqs=<optimized out>, start=<optimized out>, mrb=<optimized out>, -> -blk=<optimized out>) at -> -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:372 -> -#6 virtio_blk_submit_multireq (blk=<optimized out>, -> -address@hidden) at -> -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:402 -> -#7 0x000002aa353422e0 in virtio_blk_handle_vq (s=0x2aa6611e7d8, -> -vq=0x3ffb0f5f010) at /usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:620 -> -#8 0x000002aa3536655a in virtio_queue_notify_aio_vq -> -(address@hidden) at -> -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1515 -> -#9 0x000002aa35366cd6 in virtio_queue_notify_aio_vq (vq=0x3ffb0f5f010) at -> -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1511 -> -#10 virtio_queue_host_notifier_aio_poll (opaque=0x3ffb0f5f078) at -> -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:2409 -> -#11 0x000002aa355a8ba4 in run_poll_handlers_once -> -(address@hidden) at util/aio-posix.c:497 -> -#12 0x000002aa355a9b74 in run_poll_handlers (max_ns=<optimized out>, -> 
-ctx=0x2aa65f99310) at util/aio-posix.c:534 -> -#13 try_poll_mode (blocking=true, ctx=0x2aa65f99310) at util/aio-posix.c:562 -> -#14 aio_poll (ctx=0x2aa65f99310, address@hidden) at -> -util/aio-posix.c:602 -> -#15 0x000002aa353d2d0a in iothread_run (opaque=0x2aa65f990f0) at -> -iothread.c:60 -> -#16 0x000003ffb0f07e82 in start_thread () from /lib64/libpthread.so.0 -> -#17 0x000003ffaff91596 in thread_start () from /lib64/libc.so.6 -> -> -> -I don't have much knowledge about i/o threads and the block layer code in -> -QEMU, so I would like to report to the community about this issue. -> -I believe this very similar to the bug that I reported upstream couple of -> -days ago -> -( -https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04452.html -). -> -> -Any help would be greatly appreciated. -> -> -Thanks -> -Farhan -> -signature.asc -Description: -PGP signature - -On 03/02/2018 04:23 AM, Stefan Hajnoczi wrote: -On Thu, Mar 01, 2018 at 09:33:35AM -0500, Farhan Ali wrote: -Hi, - -I have been noticing some segfaults for QEMU on s390x, and I have been -hitting this issue quite reliably (at least once in 10 runs of a test case). -The qemu version is 2.11.50, and I have systemd created coredumps -when this happens. - -Here is a back trace of the segfaulting thread: -The backtrace looks normal. - -Please post the QEMU command-line and the details of the segfault (which -memory access faulted?). 
-I was able to create another crash today and here is the qemu command line
-
-/usr/bin/qemu-kvm -name guest=sles,debug-threads=on \
--S -object
-secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-sles/master-key.aes
-\
--machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off \
--m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 \
--object iothread,id=iothread1 -object iothread,id=iothread2 -uuid
-b83a596b-3a1a-4ac9-9f3e-d9a4032ee52c \
--display none -no-user-config -nodefaults -chardev
-socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-sles/monitor.sock,server,nowait
--mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
--no-shutdown \
--boot strict=on -drive
-file=/dev/mapper/360050763998b0883980000002400002b,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native
--device
-virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
--drive
-file=/dev/mapper/360050763998b0883980000002800002f,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native
--device
-virtio-blk-ccw,iothread=iothread2,scsi=off,devno=fe.0.0002,drive=drive-virtio-disk1,id=virtio-disk1
--netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device
-virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:38:a6:36:e8:1f,devno=fe.0.0000
--chardev pty,id=charconsole0 -device
-sclpconsole,chardev=charconsole0,id=console0 -device
-virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -msg timestamp=on
-
-This is the latest back trace on the segfaulting thread, and it seems to
-segfault in swapcontext.
-
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0 0x000003ff8595202c in swapcontext () from /lib64/libc.so.6 - - -This is the remaining back trace: - -#0 0x000003ff8595202c in swapcontext () from /lib64/libc.so.6 -#1 0x000002aa33b45566 in qemu_coroutine_new () at -util/coroutine-ucontext.c:164 -#2 0x000002aa33b43eac in qemu_coroutine_create -(address@hidden <blk_aio_write_entry>, -address@hidden) at util/qemu-coroutine.c:76 -#3 0x000002aa33a954da in blk_aio_prwv (blk=0x2aa4f0efda0, -offset=<optimized out>, bytes=<optimized out>, qiov=0x3ff74019080, -address@hidden <blk_aio_write_entry>, flags=0, -cb=0x2aa338c62e8 <virtio_blk_rw_complete>, opaque=0x3ff74019020) at -block/block-backend.c:1299 -#4 0x000002aa33a9563e in blk_aio_pwritev (blk=<optimized out>, -offset=<optimized out>, qiov=<optimized out>, flags=<optimized out>, -cb=<optimized out>, opaque=0x3ff74019020) at block/block-backend.c:1400 -#5 0x000002aa338c6a38 in submit_requests (niov=<optimized out>, -num_reqs=1, start=<optimized out>, mrb=0x3ff831fe6e0, blk=<optimized -out>) at /usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:369 -#6 virtio_blk_submit_multireq (blk=<optimized out>, -address@hidden) at -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:426 -#7 0x000002aa338c7b78 in virtio_blk_handle_vq (s=0x2aa4f2507c8, -vq=0x3ff869df010) at /usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:620 -#8 0x000002aa338ebdf2 in virtio_queue_notify_aio_vq (vq=0x3ff869df010) -at /usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1515 -#9 0x000002aa33b2df46 in aio_dispatch_handlers -(address@hidden) at util/aio-posix.c:406 -#10 0x000002aa33b2eb50 in aio_poll (ctx=0x2aa4f0ca050, -address@hidden) at util/aio-posix.c:692 -#11 0x000002aa33957f6a in iothread_run (opaque=0x2aa4f0c9630) at -iothread.c:60 -#12 0x000003ff86987e82 in start_thread () from /lib64/libpthread.so.0 -#13 0x000003ff85a11596 in thread_start () from /lib64/libc.so.6 -Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
- -On Fri, Mar 02, 2018 at 10:30:57AM -0500, Farhan Ali wrote: -> -> -> -On 03/02/2018 04:23 AM, Stefan Hajnoczi wrote: -> -> On Thu, Mar 01, 2018 at 09:33:35AM -0500, Farhan Ali wrote: -> -> > Hi, -> -> > -> -> > I have been noticing some segfaults for QEMU on s390x, and I have been -> -> > hitting this issue quite reliably (at least once in 10 runs of a test -> -> > case). -> -> > The qemu version is 2.11.50, and I have systemd created coredumps -> -> > when this happens. -> -> > -> -> > Here is a back trace of the segfaulting thread: -> -> The backtrace looks normal. -> -> -> -> Please post the QEMU command-line and the details of the segfault (which -> -> memory access faulted?). -> -> -> -> -> -I was able to create another crash today and here is the qemu comand line -> -> -/usr/bin/qemu-kvm -name guest=sles,debug-threads=on \ -> --S -object -> -secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-sles/master-key.aes -> -\ -> --machine s390-ccw-virtio-2.12,accel=kvm,usb=off,dump-guest-core=off \ -> --m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 \ -> --object iothread,id=iothread1 -object iothread,id=iothread2 -uuid -> -b83a596b-3a1a-4ac9-9f3e-d9a4032ee52c \ -> --display none -no-user-config -nodefaults -chardev -> -socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-sles/monitor.sock,server,nowait -> -> --mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -> -\ -> --boot strict=on -drive -> -file=/dev/mapper/360050763998b0883980000002400002b,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -> --device -> -virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -> --drive -> -file=/dev/mapper/360050763998b0883980000002800002f,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native -> --device -> -virtio-blk-ccw,iothread=iothread2,scsi=off,devno=fe.0.0002,drive=drive-virtio-disk1,id=virtio-disk1 -> --netdev 
tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device -> -virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:38:a6:36:e8:1f,devno=fe.0.0000 -> --chardev pty,id=charconsole0 -device -> -sclpconsole,chardev=charconsole0,id=console0 -device -> -virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -msg timestamp=on -> -> -> -This the latest back trace on the segfaulting thread, and it seems to -> -segfault in swapcontext. -> -> -Program terminated with signal SIGSEGV, Segmentation fault. -> -#0 0x000003ff8595202c in swapcontext () from /lib64/libc.so.6 -Please include the following gdb output: - - (gdb) disas swapcontext - (gdb) i r - -That way it's possible to see which instruction faulted and which -registers were being accessed. - -> -This is the remaining back trace: -> -> -#0 0x000003ff8595202c in swapcontext () from /lib64/libc.so.6 -> -#1 0x000002aa33b45566 in qemu_coroutine_new () at -> -util/coroutine-ucontext.c:164 -> -#2 0x000002aa33b43eac in qemu_coroutine_create -> -(address@hidden <blk_aio_write_entry>, -> -address@hidden) at util/qemu-coroutine.c:76 -> -#3 0x000002aa33a954da in blk_aio_prwv (blk=0x2aa4f0efda0, offset=<optimized -> -out>, bytes=<optimized out>, qiov=0x3ff74019080, -> -address@hidden <blk_aio_write_entry>, flags=0, -> -cb=0x2aa338c62e8 <virtio_blk_rw_complete>, opaque=0x3ff74019020) at -> -block/block-backend.c:1299 -> -#4 0x000002aa33a9563e in blk_aio_pwritev (blk=<optimized out>, -> -offset=<optimized out>, qiov=<optimized out>, flags=<optimized out>, -> -cb=<optimized out>, opaque=0x3ff74019020) at block/block-backend.c:1400 -> -#5 0x000002aa338c6a38 in submit_requests (niov=<optimized out>, num_reqs=1, -> -start=<optimized out>, mrb=0x3ff831fe6e0, blk=<optimized out>) at -> -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:369 -> -#6 virtio_blk_submit_multireq (blk=<optimized out>, -> -address@hidden) at -> -/usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:426 -> -#7 0x000002aa338c7b78 in virtio_blk_handle_vq (s=0x2aa4f2507c8, -> -vq=0x3ff869df010) 
at /usr/src/debug/qemu-2.11.50/hw/block/virtio-blk.c:620 -> -#8 0x000002aa338ebdf2 in virtio_queue_notify_aio_vq (vq=0x3ff869df010) at -> -/usr/src/debug/qemu-2.11.50/hw/virtio/virtio.c:1515 -> -#9 0x000002aa33b2df46 in aio_dispatch_handlers -> -(address@hidden) at util/aio-posix.c:406 -> -#10 0x000002aa33b2eb50 in aio_poll (ctx=0x2aa4f0ca050, -> -address@hidden) at util/aio-posix.c:692 -> -#11 0x000002aa33957f6a in iothread_run (opaque=0x2aa4f0c9630) at -> -iothread.c:60 -> -#12 0x000003ff86987e82 in start_thread () from /lib64/libpthread.so.0 -> -#13 0x000003ff85a11596 in thread_start () from /lib64/libc.so.6 -> -Backtrace stopped: previous frame identical to this frame (corrupt stack?) -> -signature.asc -Description: -PGP signature - -On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote: -Please include the following gdb output: - - (gdb) disas swapcontext - (gdb) i r - -That way it's possible to see which instruction faulted and which -registers were being accessed. -here is the disas out for swapcontext, this is on a coredump with -debugging symbols enabled for qemu. So the addresses from the previous -dump is a little different. 
-(gdb) disas swapcontext -Dump of assembler code for function swapcontext: - 0x000003ff90751fb8 <+0>: lgr %r1,%r2 - 0x000003ff90751fbc <+4>: lgr %r0,%r3 - 0x000003ff90751fc0 <+8>: stfpc 248(%r1) - 0x000003ff90751fc4 <+12>: std %f0,256(%r1) - 0x000003ff90751fc8 <+16>: std %f1,264(%r1) - 0x000003ff90751fcc <+20>: std %f2,272(%r1) - 0x000003ff90751fd0 <+24>: std %f3,280(%r1) - 0x000003ff90751fd4 <+28>: std %f4,288(%r1) - 0x000003ff90751fd8 <+32>: std %f5,296(%r1) - 0x000003ff90751fdc <+36>: std %f6,304(%r1) - 0x000003ff90751fe0 <+40>: std %f7,312(%r1) - 0x000003ff90751fe4 <+44>: std %f8,320(%r1) - 0x000003ff90751fe8 <+48>: std %f9,328(%r1) - 0x000003ff90751fec <+52>: std %f10,336(%r1) - 0x000003ff90751ff0 <+56>: std %f11,344(%r1) - 0x000003ff90751ff4 <+60>: std %f12,352(%r1) - 0x000003ff90751ff8 <+64>: std %f13,360(%r1) - 0x000003ff90751ffc <+68>: std %f14,368(%r1) - 0x000003ff90752000 <+72>: std %f15,376(%r1) - 0x000003ff90752004 <+76>: slgr %r2,%r2 - 0x000003ff90752008 <+80>: stam %a0,%a15,184(%r1) - 0x000003ff9075200c <+84>: stmg %r0,%r15,56(%r1) - 0x000003ff90752012 <+90>: la %r2,2 - 0x000003ff90752016 <+94>: lgr %r5,%r0 - 0x000003ff9075201a <+98>: la %r3,384(%r5) - 0x000003ff9075201e <+102>: la %r4,384(%r1) - 0x000003ff90752022 <+106>: lghi %r5,8 - 0x000003ff90752026 <+110>: svc 175 - 0x000003ff90752028 <+112>: lgr %r5,%r0 -=> 0x000003ff9075202c <+116>: lfpc 248(%r5) - 0x000003ff90752030 <+120>: ld %f0,256(%r5) - 0x000003ff90752034 <+124>: ld %f1,264(%r5) - 0x000003ff90752038 <+128>: ld %f2,272(%r5) - 0x000003ff9075203c <+132>: ld %f3,280(%r5) - 0x000003ff90752040 <+136>: ld %f4,288(%r5) - 0x000003ff90752044 <+140>: ld %f5,296(%r5) - 0x000003ff90752048 <+144>: ld %f6,304(%r5) - 0x000003ff9075204c <+148>: ld %f7,312(%r5) - 0x000003ff90752050 <+152>: ld %f8,320(%r5) - 0x000003ff90752054 <+156>: ld %f9,328(%r5) - 0x000003ff90752058 <+160>: ld %f10,336(%r5) - 0x000003ff9075205c <+164>: ld %f11,344(%r5) - 0x000003ff90752060 <+168>: ld %f12,352(%r5) - 
0x000003ff90752064 <+172>: ld %f13,360(%r5) - 0x000003ff90752068 <+176>: ld %f14,368(%r5) - 0x000003ff9075206c <+180>: ld %f15,376(%r5) - 0x000003ff90752070 <+184>: lam %a2,%a15,192(%r5) - 0x000003ff90752074 <+188>: lmg %r0,%r15,56(%r5) - 0x000003ff9075207a <+194>: br %r14 -End of assembler dump. - -(gdb) i r -r0 0x0 0 -r1 0x3ff8fe7de40 4396165881408 -r2 0x0 0 -r3 0x3ff8fe7e1c0 4396165882304 -r4 0x3ff8fe7dfc0 4396165881792 -r5 0x0 0 -r6 0xffffffff88004880 18446744071696304256 -r7 0x3ff880009e0 4396033247712 -r8 0x27ff89000 10736930816 -r9 0x3ff88001460 4396033250400 -r10 0x1000 4096 -r11 0x1261be0 19274720 -r12 0x3ff88001e00 4396033252864 -r13 0x14d0bc0 21826496 -r14 0x1312ac8 19999432 -r15 0x3ff8fe7dc80 4396165880960 -pc 0x3ff9075202c 0x3ff9075202c <swapcontext+116> -cc 0x2 2 - -On 03/05/2018 07:45 PM, Farhan Ali wrote: -> -> -> -On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote: -> -> Please include the following gdb output: -> -> -> ->   (gdb) disas swapcontext -> ->   (gdb) i r -> -> -> -> That way it's possible to see which instruction faulted and which -> -> registers were being accessed. -> -> -> -here is the disas out for swapcontext, this is on a coredump with debugging -> -symbols enabled for qemu. So the addresses from the previous dump is a little -> -different. 
-> -> -> -(gdb) disas swapcontext -> -Dump of assembler code for function swapcontext: -> -  0x000003ff90751fb8 <+0>:   lgr   %r1,%r2 -> -  0x000003ff90751fbc <+4>:   lgr   %r0,%r3 -> -  0x000003ff90751fc0 <+8>:   stfpc   248(%r1) -> -  0x000003ff90751fc4 <+12>:   std   %f0,256(%r1) -> -  0x000003ff90751fc8 <+16>:   std   %f1,264(%r1) -> -  0x000003ff90751fcc <+20>:   std   %f2,272(%r1) -> -  0x000003ff90751fd0 <+24>:   std   %f3,280(%r1) -> -  0x000003ff90751fd4 <+28>:   std   %f4,288(%r1) -> -  0x000003ff90751fd8 <+32>:   std   %f5,296(%r1) -> -  0x000003ff90751fdc <+36>:   std   %f6,304(%r1) -> -  0x000003ff90751fe0 <+40>:   std   %f7,312(%r1) -> -  0x000003ff90751fe4 <+44>:   std   %f8,320(%r1) -> -  0x000003ff90751fe8 <+48>:   std   %f9,328(%r1) -> -  0x000003ff90751fec <+52>:   std   %f10,336(%r1) -> -  0x000003ff90751ff0 <+56>:   std   %f11,344(%r1) -> -  0x000003ff90751ff4 <+60>:   std   %f12,352(%r1) -> -  0x000003ff90751ff8 <+64>:   std   %f13,360(%r1) -> -  0x000003ff90751ffc <+68>:   std   %f14,368(%r1) -> -  0x000003ff90752000 <+72>:   std   %f15,376(%r1) -> -  0x000003ff90752004 <+76>:   slgr   %r2,%r2 -> -  0x000003ff90752008 <+80>:   stam   %a0,%a15,184(%r1) -> -  0x000003ff9075200c <+84>:   stmg   %r0,%r15,56(%r1) -> -  0x000003ff90752012 <+90>:   la   %r2,2 -> -  0x000003ff90752016 <+94>:   lgr   %r5,%r0 -> -  0x000003ff9075201a <+98>:   la   %r3,384(%r5) -> -  0x000003ff9075201e <+102>:   la   %r4,384(%r1) -> -  0x000003ff90752022 <+106>:   lghi   %r5,8 -> -  0x000003ff90752026 <+110>:   svc   175 -sys_rt_sigprocmask. r0 should not be changed by the system call. - -> -  0x000003ff90752028 <+112>:   lgr   %r5,%r0 -> -=> 0x000003ff9075202c <+116>:   lfpc   248(%r5) -so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the -2nd parameter to this -function). Now this is odd. 
- -> -  0x000003ff90752030 <+120>:   ld   %f0,256(%r5) -> -  0x000003ff90752034 <+124>:   ld   %f1,264(%r5) -> -  0x000003ff90752038 <+128>:   ld   %f2,272(%r5) -> -  0x000003ff9075203c <+132>:   ld   %f3,280(%r5) -> -  0x000003ff90752040 <+136>:   ld   %f4,288(%r5) -> -  0x000003ff90752044 <+140>:   ld   %f5,296(%r5) -> -  0x000003ff90752048 <+144>:   ld   %f6,304(%r5) -> -  0x000003ff9075204c <+148>:   ld   %f7,312(%r5) -> -  0x000003ff90752050 <+152>:   ld   %f8,320(%r5) -> -  0x000003ff90752054 <+156>:   ld   %f9,328(%r5) -> -  0x000003ff90752058 <+160>:   ld   %f10,336(%r5) -> -  0x000003ff9075205c <+164>:   ld   %f11,344(%r5) -> -  0x000003ff90752060 <+168>:   ld   %f12,352(%r5) -> -  0x000003ff90752064 <+172>:   ld   %f13,360(%r5) -> -  0x000003ff90752068 <+176>:   ld   %f14,368(%r5) -> -  0x000003ff9075206c <+180>:   ld   %f15,376(%r5) -> -  0x000003ff90752070 <+184>:   lam   %a2,%a15,192(%r5) -> -  0x000003ff90752074 <+188>:   lmg   %r0,%r15,56(%r5) -> -  0x000003ff9075207a <+194>:   br   %r14 -> -End of assembler dump. 
-> -> -(gdb) i r -> -r0            0x0   0 -> -r1            0x3ff8fe7de40   4396165881408 -> -r2            0x0   0 -> -r3            0x3ff8fe7e1c0   4396165882304 -> -r4            0x3ff8fe7dfc0   4396165881792 -> -r5            0x0   0 -> -r6            0xffffffff88004880   18446744071696304256 -> -r7            0x3ff880009e0   4396033247712 -> -r8            0x27ff89000   10736930816 -> -r9            0x3ff88001460   4396033250400 -> -r10           0x1000   4096 -> -r11           0x1261be0   19274720 -> -r12           0x3ff88001e00   4396033252864 -> -r13           0x14d0bc0   21826496 -> -r14           0x1312ac8   19999432 -> -r15           0x3ff8fe7dc80   4396165880960 -> -pc            0x3ff9075202c   0x3ff9075202c <swapcontext+116> -> -cc            0x2   2 - -On 5 March 2018 at 18:54, Christian Borntraeger <address@hidden> wrote: -> -> -> -On 03/05/2018 07:45 PM, Farhan Ali wrote: -> -> 0x000003ff90752026 <+110>: svc 175 -> -> -sys_rt_sigprocmask. r0 should not be changed by the system call. -> -> -> 0x000003ff90752028 <+112>: lgr %r5,%r0 -> -> => 0x000003ff9075202c <+116>: lfpc 248(%r5) -> -> -so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the -> -2nd parameter to this -> -function). Now this is odd. -...particularly given that the only place we call swapcontext() -the second parameter is always the address of a local variable -and can't be 0... 
- -thanks --- PMM - -Do you happen to run with a recent host kernel that has - -commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 - s390: scrub registers on kernel entry and KVM exit - - - - - -Can you run with this on top -diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S -index 13a133a6015c..d6dc0e5e8f74 100644 ---- a/arch/s390/kernel/entry.S -+++ b/arch/s390/kernel/entry.S -@@ -426,13 +426,13 @@ ENTRY(system_call) - UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER - BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP - stmg %r0,%r7,__PT_R0(%r11) -- # clear user controlled register to prevent speculative use -- xgr %r0,%r0 - mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC - mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW - mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC - stg %r14,__PT_FLAGS(%r11) - .Lsysc_do_svc: -+ # clear user controlled register to prevent speculative use -+ xgr %r0,%r0 - # load address of system call table - lg %r10,__THREAD_sysc_table(%r13,%r12) - llgh %r8,__PT_INT_CODE+2(%r11) - - -To me it looks like that the critical section cleanup (interrupt during system -call entry) might -save the registers again into ptregs but we have already zeroed out r0. -This patch moves the clearing of r0 after sysc_do_svc, which should fix the -critical -section cleanup. - -Adding Martin and Heiko. Will spin a patch. - - -On 03/05/2018 07:54 PM, Christian Borntraeger wrote: -> -> -> -On 03/05/2018 07:45 PM, Farhan Ali wrote: -> -> -> -> -> -> On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote: -> ->> Please include the following gdb output: -> ->> -> ->>   (gdb) disas swapcontext -> ->>   (gdb) i r -> ->> -> ->> That way it's possible to see which instruction faulted and which -> ->> registers were being accessed. -> -> -> -> -> -> here is the disas out for swapcontext, this is on a coredump with debugging -> -> symbols enabled for qemu. So the addresses from the previous dump is a -> -> little different. 
-> -> -> -> -> -> (gdb) disas swapcontext -> -> Dump of assembler code for function swapcontext: -> ->   0x000003ff90751fb8 <+0>:   lgr   %r1,%r2 -> ->   0x000003ff90751fbc <+4>:   lgr   %r0,%r3 -> ->   0x000003ff90751fc0 <+8>:   stfpc   248(%r1) -> ->   0x000003ff90751fc4 <+12>:   std   %f0,256(%r1) -> ->   0x000003ff90751fc8 <+16>:   std   %f1,264(%r1) -> ->   0x000003ff90751fcc <+20>:   std   %f2,272(%r1) -> ->   0x000003ff90751fd0 <+24>:   std   %f3,280(%r1) -> ->   0x000003ff90751fd4 <+28>:   std   %f4,288(%r1) -> ->   0x000003ff90751fd8 <+32>:   std   %f5,296(%r1) -> ->   0x000003ff90751fdc <+36>:   std   %f6,304(%r1) -> ->   0x000003ff90751fe0 <+40>:   std   %f7,312(%r1) -> ->   0x000003ff90751fe4 <+44>:   std   %f8,320(%r1) -> ->   0x000003ff90751fe8 <+48>:   std   %f9,328(%r1) -> ->   0x000003ff90751fec <+52>:   std   %f10,336(%r1) -> ->   0x000003ff90751ff0 <+56>:   std   %f11,344(%r1) -> ->   0x000003ff90751ff4 <+60>:   std   %f12,352(%r1) -> ->   0x000003ff90751ff8 <+64>:   std   %f13,360(%r1) -> ->   0x000003ff90751ffc <+68>:   std   %f14,368(%r1) -> ->   0x000003ff90752000 <+72>:   std   %f15,376(%r1) -> ->   0x000003ff90752004 <+76>:   slgr   %r2,%r2 -> ->   0x000003ff90752008 <+80>:   stam   %a0,%a15,184(%r1) -> ->   0x000003ff9075200c <+84>:   stmg   %r0,%r15,56(%r1) -> ->   0x000003ff90752012 <+90>:   la   %r2,2 -> ->   0x000003ff90752016 <+94>:   lgr   %r5,%r0 -> ->   0x000003ff9075201a <+98>:   la   %r3,384(%r5) -> ->   0x000003ff9075201e <+102>:   la   %r4,384(%r1) -> ->   0x000003ff90752022 <+106>:   lghi   %r5,8 -> ->   0x000003ff90752026 <+110>:   svc   175 -> -> -sys_rt_sigprocmask. r0 should not be changed by the system call. -> -> ->   0x000003ff90752028 <+112>:   lgr   %r5,%r0 -> -> => 0x000003ff9075202c <+116>:   lfpc   248(%r5) -> -> -so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the -> -2nd parameter to this -> -function). Now this is odd. 
-> -> ->   0x000003ff90752030 <+120>:   ld   %f0,256(%r5) -> ->   0x000003ff90752034 <+124>:   ld   %f1,264(%r5) -> ->   0x000003ff90752038 <+128>:   ld   %f2,272(%r5) -> ->   0x000003ff9075203c <+132>:   ld   %f3,280(%r5) -> ->   0x000003ff90752040 <+136>:   ld   %f4,288(%r5) -> ->   0x000003ff90752044 <+140>:   ld   %f5,296(%r5) -> ->   0x000003ff90752048 <+144>:   ld   %f6,304(%r5) -> ->   0x000003ff9075204c <+148>:   ld   %f7,312(%r5) -> ->   0x000003ff90752050 <+152>:   ld   %f8,320(%r5) -> ->   0x000003ff90752054 <+156>:   ld   %f9,328(%r5) -> ->   0x000003ff90752058 <+160>:   ld   %f10,336(%r5) -> ->   0x000003ff9075205c <+164>:   ld   %f11,344(%r5) -> ->   0x000003ff90752060 <+168>:   ld   %f12,352(%r5) -> ->   0x000003ff90752064 <+172>:   ld   %f13,360(%r5) -> ->   0x000003ff90752068 <+176>:   ld   %f14,368(%r5) -> ->   0x000003ff9075206c <+180>:   ld   %f15,376(%r5) -> ->   0x000003ff90752070 <+184>:   lam   %a2,%a15,192(%r5) -> ->   0x000003ff90752074 <+188>:   lmg   %r0,%r15,56(%r5) -> ->   0x000003ff9075207a <+194>:   br   %r14 -> -> End of assembler dump. 
-> -> -> -> (gdb) i r -> -> r0            0x0   0 -> -> r1            0x3ff8fe7de40   4396165881408 -> -> r2            0x0   0 -> -> r3            0x3ff8fe7e1c0   4396165882304 -> -> r4            0x3ff8fe7dfc0   4396165881792 -> -> r5            0x0   0 -> -> r6            0xffffffff88004880   18446744071696304256 -> -> r7            0x3ff880009e0   4396033247712 -> -> r8            0x27ff89000   10736930816 -> -> r9            0x3ff88001460   4396033250400 -> -> r10           0x1000   4096 -> -> r11           0x1261be0   19274720 -> -> r12           0x3ff88001e00   4396033252864 -> -> r13           0x14d0bc0   21826496 -> -> r14           0x1312ac8   19999432 -> -> r15           0x3ff8fe7dc80   4396165880960 -> -> pc            0x3ff9075202c   0x3ff9075202c <swapcontext+116> -> -> cc            0x2   2 - -On 03/05/2018 02:08 PM, Christian Borntraeger wrote: -Do you happen to run with a recent host kernel that has - -commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 - s390: scrub registers on kernel entry and KVM exit -Yes. 
-Can you run with this on top -diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S -index 13a133a6015c..d6dc0e5e8f74 100644 ---- a/arch/s390/kernel/entry.S -+++ b/arch/s390/kernel/entry.S -@@ -426,13 +426,13 @@ ENTRY(system_call) - UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER - BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP - stmg %r0,%r7,__PT_R0(%r11) -- # clear user controlled register to prevent speculative use -- xgr %r0,%r0 - mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC - mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW - mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC - stg %r14,__PT_FLAGS(%r11) - .Lsysc_do_svc: -+ # clear user controlled register to prevent speculative use -+ xgr %r0,%r0 - # load address of system call table - lg %r10,__THREAD_sysc_table(%r13,%r12) - llgh %r8,__PT_INT_CODE+2(%r11) - - -To me it looks like that the critical section cleanup (interrupt during system -call entry) might -save the registers again into ptregs but we have already zeroed out r0. -This patch moves the clearing of r0 after sysc_do_svc, which should fix the -critical -section cleanup. -Okay I will run with this. -Adding Martin and Heiko. Will spin a patch. - - -On 03/05/2018 07:54 PM, Christian Borntraeger wrote: -On 03/05/2018 07:45 PM, Farhan Ali wrote: -On 03/05/2018 06:03 AM, Stefan Hajnoczi wrote: -Please include the following gdb output: - -   (gdb) disas swapcontext -   (gdb) i r - -That way it's possible to see which instruction faulted and which -registers were being accessed. -here is the disas out for swapcontext, this is on a coredump with debugging -symbols enabled for qemu. So the addresses from the previous dump is a little -different. 
- - -(gdb) disas swapcontext -Dump of assembler code for function swapcontext: -   0x000003ff90751fb8 <+0>:   lgr   %r1,%r2 -   0x000003ff90751fbc <+4>:   lgr   %r0,%r3 -   0x000003ff90751fc0 <+8>:   stfpc   248(%r1) -   0x000003ff90751fc4 <+12>:   std   %f0,256(%r1) -   0x000003ff90751fc8 <+16>:   std   %f1,264(%r1) -   0x000003ff90751fcc <+20>:   std   %f2,272(%r1) -   0x000003ff90751fd0 <+24>:   std   %f3,280(%r1) -   0x000003ff90751fd4 <+28>:   std   %f4,288(%r1) -   0x000003ff90751fd8 <+32>:   std   %f5,296(%r1) -   0x000003ff90751fdc <+36>:   std   %f6,304(%r1) -   0x000003ff90751fe0 <+40>:   std   %f7,312(%r1) -   0x000003ff90751fe4 <+44>:   std   %f8,320(%r1) -   0x000003ff90751fe8 <+48>:   std   %f9,328(%r1) -   0x000003ff90751fec <+52>:   std   %f10,336(%r1) -   0x000003ff90751ff0 <+56>:   std   %f11,344(%r1) -   0x000003ff90751ff4 <+60>:   std   %f12,352(%r1) -   0x000003ff90751ff8 <+64>:   std   %f13,360(%r1) -   0x000003ff90751ffc <+68>:   std   %f14,368(%r1) -   0x000003ff90752000 <+72>:   std   %f15,376(%r1) -   0x000003ff90752004 <+76>:   slgr   %r2,%r2 -   0x000003ff90752008 <+80>:   stam   %a0,%a15,184(%r1) -   0x000003ff9075200c <+84>:   stmg   %r0,%r15,56(%r1) -   0x000003ff90752012 <+90>:   la   %r2,2 -   0x000003ff90752016 <+94>:   lgr   %r5,%r0 -   0x000003ff9075201a <+98>:   la   %r3,384(%r5) -   0x000003ff9075201e <+102>:   la   %r4,384(%r1) -   0x000003ff90752022 <+106>:   lghi   %r5,8 -   0x000003ff90752026 <+110>:   svc   175 -sys_rt_sigprocmask. r0 should not be changed by the system call. -  0x000003ff90752028 <+112>:   lgr   %r5,%r0 -=> 0x000003ff9075202c <+116>:   lfpc   248(%r5) -so r5 is zero and it was loaded from r0. r0 was loaded from r3 (which is the -2nd parameter to this -function). Now this is odd. 
-  0x000003ff90752030 <+120>:   ld   %f0,256(%r5) -   0x000003ff90752034 <+124>:   ld   %f1,264(%r5) -   0x000003ff90752038 <+128>:   ld   %f2,272(%r5) -   0x000003ff9075203c <+132>:   ld   %f3,280(%r5) -   0x000003ff90752040 <+136>:   ld   %f4,288(%r5) -   0x000003ff90752044 <+140>:   ld   %f5,296(%r5) -   0x000003ff90752048 <+144>:   ld   %f6,304(%r5) -   0x000003ff9075204c <+148>:   ld   %f7,312(%r5) -   0x000003ff90752050 <+152>:   ld   %f8,320(%r5) -   0x000003ff90752054 <+156>:   ld   %f9,328(%r5) -   0x000003ff90752058 <+160>:   ld   %f10,336(%r5) -   0x000003ff9075205c <+164>:   ld   %f11,344(%r5) -   0x000003ff90752060 <+168>:   ld   %f12,352(%r5) -   0x000003ff90752064 <+172>:   ld   %f13,360(%r5) -   0x000003ff90752068 <+176>:   ld   %f14,368(%r5) -   0x000003ff9075206c <+180>:   ld   %f15,376(%r5) -   0x000003ff90752070 <+184>:   lam   %a2,%a15,192(%r5) -   0x000003ff90752074 <+188>:   lmg   %r0,%r15,56(%r5) -   0x000003ff9075207a <+194>:   br   %r14 -End of assembler dump. - -(gdb) i r -r0            0x0   0 -r1            0x3ff8fe7de40   4396165881408 -r2            0x0   0 -r3            0x3ff8fe7e1c0   4396165882304 -r4            0x3ff8fe7dfc0   4396165881792 -r5            0x0   0 -r6            0xffffffff88004880   18446744071696304256 -r7            0x3ff880009e0   4396033247712 -r8            0x27ff89000   10736930816 -r9            0x3ff88001460   4396033250400 -r10           0x1000   4096 -r11           0x1261be0   19274720 -r12           0x3ff88001e00   4396033252864 -r13           0x14d0bc0   21826496 -r14           0x1312ac8   19999432 -r15           0x3ff8fe7dc80   4396165880960 -pc            0x3ff9075202c   0x3ff9075202c <swapcontext+116> -cc            0x2   2 - -On Mon, 5 Mar 2018 20:08:45 +0100 -Christian Borntraeger <address@hidden> wrote: - -> -Do you happen to run with a recent host kernel that has -> -> -commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 -> -s390: scrub registers on kernel entry and KVM exit -> -> -Can you run with 
this on top
->
-diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
->
-index 13a133a6015c..d6dc0e5e8f74 100644
->
---- a/arch/s390/kernel/entry.S
->
-+++ b/arch/s390/kernel/entry.S
->
-@@ -426,13 +426,13 @@ ENTRY(system_call)
->
-UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
->
-BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
->
-stmg %r0,%r7,__PT_R0(%r11)
->
-- # clear user controlled register to prevent speculative use
->
-- xgr %r0,%r0
->
-mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
->
-mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
->
-mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
->
-stg %r14,__PT_FLAGS(%r11)
->
-.Lsysc_do_svc:
->
-+ # clear user controlled register to prevent speculative use
->
-+ xgr %r0,%r0
->
-# load address of system call table
->
-lg %r10,__THREAD_sysc_table(%r13,%r12)
->
-llgh %r8,__PT_INT_CODE+2(%r11)
->
->
->
-To me it looks like that the critical section cleanup (interrupt during
->
-system call entry) might
->
-save the registers again into ptregs but we have already zeroed out r0.
->
-This patch moves the clearing of r0 after sysc_do_svc, which should fix the
->
-critical
->
-section cleanup.
->
->
-Adding Martin and Heiko. Will spin a patch.
-Argh, yes. Thanks Christian, this is it. I have been searching for the bug
-for days now. The point is that if the system call handler is interrupted
-after the xgr but before .Lsysc_do_svc the code at .Lcleanup_system_call
-repeats the stmg for %r0-%r7 but now %r0 is already zero.
-
-Please commit a patch for this and I will queue it up immediately.
-
---
-blue skies,
- Martin.
-
-"Reality continues to ruin my life." - Calvin.
-
-On 03/06/2018 01:34 AM, Martin Schwidefsky wrote:
-On Mon, 5 Mar 2018 20:08:45 +0100
-Christian Borntraeger <address@hidden> wrote:
-Do you happen to run with a recent host kernel that has
-
-commit 7041d28115e91f2144f811ffe8a195c696b1e1d0
- s390: scrub registers on kernel entry and KVM exit
-
-Can you run with this on top
-diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
-index 13a133a6015c..d6dc0e5e8f74 100644
---- a/arch/s390/kernel/entry.S
-+++ b/arch/s390/kernel/entry.S
-@@ -426,13 +426,13 @@ ENTRY(system_call)
- UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
- BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
- stmg %r0,%r7,__PT_R0(%r11)
-- # clear user controlled register to prevent speculative use
-- xgr %r0,%r0
- mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
- mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
- mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
- stg %r14,__PT_FLAGS(%r11)
- .Lsysc_do_svc:
-+ # clear user controlled register to prevent speculative use
-+ xgr %r0,%r0
- # load address of system call table
- lg %r10,__THREAD_sysc_table(%r13,%r12)
- llgh %r8,__PT_INT_CODE+2(%r11)
-
-
-To me it looks like that the critical section cleanup (interrupt during system
-call entry) might
-save the registers again into ptregs but we have already zeroed out r0.
-This patch moves the clearing of r0 after sysc_do_svc, which should fix the
-critical
-section cleanup.
-
-Adding Martin and Heiko. Will spin a patch.
-Argh, yes. Thanks Christian, this is it. I have been searching for the bug
-for days now. The point is that if the system call handler is interrupted
-after the xgr but before .Lsysc_do_svc the code at .Lcleanup_system_call
-repeats the stmg for %r0-%r7 but now %r0 is already zero.
-
-Please commit a patch for this and I will queue it up immediately.
-This patch does fix the QEMU crash. I haven't seen the crash after
-running the test case for more than a day.
Thanks to everyone for taking
-a look at this problem :)
-Thanks
-Farhan
-
diff --git a/results/classifier/02/other/23300761 b/results/classifier/02/other/23300761
deleted file mode 100644
index 47edfea0d..000000000
--- a/results/classifier/02/other/23300761
+++ /dev/null
@@ -1,314 +0,0 @@
-other: 0.963
-semantic: 0.950
-boot: 0.929
-instruction: 0.929
-mistranslation: 0.770
-
-[Qemu-devel] [BUG] 216 Alerts reported by LGTM for QEMU (some might be release critical)
-
-Hi,
-LGTM reports 16 errors, 81 warnings and 119 recommendations:
-https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list
-.
-Some of them are already known (wrong format strings), others look like
-real errors:
-
-- several multiplication results which don't work as they should in
-contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only
-32 bit!), target/i386/translate.c and other files
-- potential buffer overflows in gdbstub.c and other files
-
-I am afraid that the overflows in the block code are release critical,
-maybe the ones in target/i386/translate.c and other errors, too.
-About half of the alerts are issues which can be fixed later.
-
-Regards
-
-Stefan
-
-On 13/07/19 19:46, Stefan Weil wrote:
->
->
-LGTM reports 16 errors, 81 warnings and 119 recommendations:
->
-https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list
-.
->
->
-Some of them are already known (wrong format strings), others look like
->
-real errors:
->
->
-- several multiplication results which don't work as they should in
->
-contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only
->
-32 bit!), target/i386/translate.c and other files
-m->nb_clusters here is limited by s->l2_slice_size (see for example
-handle_alloc) so I wouldn't be surprised if this is a false positive. I
-couldn't find this particular multiplication in Coverity, but it has
-about 250 issues marked as intentional or false positive so there's
-probably a lot of overlap with what LGTM found.
- -Paolo - -Am 13.07.2019 um 21:42 schrieb Paolo Bonzini: -> -On 13/07/19 19:46, Stefan Weil wrote: -> -> LGTM reports 16 errors, 81 warnings and 119 recommendations: -> -> -https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list -. -> -> -> -> Some of them are already known (wrong format strings), others look like -> -> real errors: -> -> -> -> - several multiplication results which don't work as they should in -> -> contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only -> -> 32 bit!), target/i386/translate.c and other files -> -m->nb_clusters here is limited by s->l2_slice_size (see for example -> -handle_alloc) so I wouldn't be surprised if this is a false positive. I -> -couldn't find this particular multiplication in Coverity, but it has -> -about 250 issues marked as intentional or false positive so there's -> -probably a lot of overlap with what LGTM found. -> -> -Paolo -> -From other projects I know that there is a certain overlap between the -results from Coverity Scan an LGTM, but it is good to have both -analyzers, and the results from LGTM are typically quite reliable. - -Even if we know that there is no multiplication overflow, the code could -be modified. Either the assigned value should use the same data type as -the factors (possible when there is never an overflow, avoids a size -extension), or the multiplication could use the larger data type by -adding a type cast to one of the factors (then an overflow cannot -happen, static code analysers and human reviewers have an easier job, -but the multiplication costs more time). - -Stefan - -Am 14.07.2019 um 15:28 hat Stefan Weil geschrieben: -> -Am 13.07.2019 um 21:42 schrieb Paolo Bonzini: -> -> On 13/07/19 19:46, Stefan Weil wrote: -> ->> LGTM reports 16 errors, 81 warnings and 119 recommendations: -> ->> -https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list -. 
-> ->> -> ->> Some of them are already known (wrong format strings), others look like -> ->> real errors: -> ->> -> ->> - several multiplication results which don't work as they should in -> ->> contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only -> ->> 32 bit!), target/i386/translate.c and other files -Request sizes are limited to 32 bit in the generic block layer before -they are even passed to the individual block drivers, so most if not all -of these are going to be false positives. - -> -> m->nb_clusters here is limited by s->l2_slice_size (see for example -> -> handle_alloc) so I wouldn't be surprised if this is a false positive. I -> -> couldn't find this particular multiplication in Coverity, but it has -> -> about 250 issues marked as intentional or false positive so there's -> -> probably a lot of overlap with what LGTM found. -> -> -> -> Paolo -> -> -From other projects I know that there is a certain overlap between the -> -results from Coverity Scan an LGTM, but it is good to have both -> -analyzers, and the results from LGTM are typically quite reliable. -> -> -Even if we know that there is no multiplication overflow, the code could -> -be modified. Either the assigned value should use the same data type as -> -the factors (possible when there is never an overflow, avoids a size -> -extension), or the multiplication could use the larger data type by -> -adding a type cast to one of the factors (then an overflow cannot -> -happen, static code analysers and human reviewers have an easier job, -> -but the multiplication costs more time). -But if you look at the code we're talking about, you see that it's -complaining about things where being more explicit would make things -less readable. - -For example, if complains about the multiplication in this line: - - s->file_size += n * s->header.cluster_size; - -We know that n * s->header.cluster_size fits in 32 bits, but -s->file_size is 64 bits (and has to be 64 bits). 
Do you really think we -should introduce another uint32_t variable to store the intermediate -result? And if we cast n to uint64_t, not only might the multiplication -cost more time, but also human readers would wonder why the result could -become larger than 32 bits. So a cast would be misleading. - - -It also complains about this line: - - ret = bdrv_truncate(bs->file, (3 + l1_clusters) * s->cluster_size, - PREALLOC_MODE_OFF, &local_err); - -Here, we don't even assign the result to a 64 bit variable, but just -pass it to a function which takes a 64 bit parameter. Again, I don't -think introducing additional variables for the intermediate result or -adding casts would be an improvement of the situation. - - -So I don't think this is a good enough tool to base our code on what it -does and doesn't understand. It would have too much of a negative impact -on our code. We'd rather need a way to mark false positives as such and -move on without changing the code in such cases. - -Kevin - -On Sat, 13 Jul 2019 at 18:46, Stefan Weil <address@hidden> wrote: -> -LGTM reports 16 errors, 81 warnings and 119 recommendations: -> -https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list -. -I had a look at some of these before, but mostly I came -to the conclusion that it wasn't worth trying to put the -effort into keeping up with the site because they didn't -seem to provide any useful way to mark things as false -positives. Coverity has its flaws but at least you can do -that kind of thing in its UI (it runs at about a 33% fp -rate, I think.) "Analyzer thinks this multiply can overflow -but in fact it's not possible" is quite a common false -positive cause... - -Anyway, if you want to fish out specific issues, analyse -whether they're false positive or real, and report them -to the mailing list as followups to the patches which -introduced the issue, that's probably the best way for -us to make use of this analyzer. (That is essentially -what I do for coverity.) 
- -thanks --- PMM - -Am 14.07.2019 um 19:30 schrieb Peter Maydell: -[...] -> -"Analyzer thinks this multiply can overflow -> -but in fact it's not possible" is quite a common false -> -positive cause... -The analysers don't complain because a multiply can overflow. - -They complain because the code indicates that a larger result is -expected, for example uint64_t = uint32_t * uint32_t. They would not -complain for the same multiplication if it were assigned to a uint32_t. - -So there is a simple solution to write the code in a way which avoids -false positives... - -Stefan - -Stefan Weil <address@hidden> writes: - -> -Am 14.07.2019 um 19:30 schrieb Peter Maydell: -> -[...] -> -> "Analyzer thinks this multiply can overflow -> -> but in fact it's not possible" is quite a common false -> -> positive cause... -> -> -> -The analysers don't complain because a multiply can overflow. -> -> -They complain because the code indicates that a larger result is -> -expected, for example uint64_t = uint32_t * uint32_t. They would not -> -complain for the same multiplication if it were assigned to a uint32_t. -I agree this is an anti-pattern. - -> -So there is a simple solution to write the code in a way which avoids -> -false positives... -You wrote elsewhere in this thread: - - Either the assigned value should use the same data type as the - factors (possible when there is never an overflow, avoids a size - extension), or the multiplication could use the larger data type by - adding a type cast to one of the factors (then an overflow cannot - happen, static code analysers and human reviewers have an easier - job, but the multiplication costs more time). - -Makes sense to me. - -On 7/14/19 5:30 PM, Peter Maydell wrote: -> -I had a look at some of these before, but mostly I came -> -to the conclusion that it wasn't worth trying to put the -> -effort into keeping up with the site because they didn't -> -seem to provide any useful way to mark things as false -> -positives. 
Coverity has its flaws but at least you can do -> -that kind of thing in its UI (it runs at about a 33% fp -> -rate, I think.) -Yes, LGTM wants you to modify the source code with - - /* lgtm [cpp/some-warning-code] */ - -and on the same line as the reported problem. Which is mildly annoying in that -you're definitely committing to LGTM in the long term. Also for any -non-trivial bit of code, it will almost certainly run over 80 columns. - - -r~ - diff --git a/results/classifier/02/other/23448582 b/results/classifier/02/other/23448582 deleted file mode 100644 index 6d6939b4f..000000000 --- a/results/classifier/02/other/23448582 +++ /dev/null @@ -1,266 +0,0 @@ -other: 0.990 -semantic: 0.987 -instruction: 0.983 -mistranslation: 0.982 -boot: 0.967 - -[BUG REPORT] cxl process in infinity loop - -Hi, all - -When I did the cxl memory hot-plug test on QEMU, I accidentally connected -two memdev to the same downstream port, the command is like below: - -> --object memory-backend-ram,size=262144k,share=on,id=vmem0 \ -> --object memory-backend-ram,size=262144k,share=on,id=vmem1 \ -> --device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -> --device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -> --device cxl-upstream,bus=root_port0,id=us0 \ -> --device cxl-downstream,port=0,bus=us0,id=swport00,chassis=0,slot=5 \ -> --device cxl-downstream,port=0,bus=us0,id=swport01,chassis=0,slot=7 \ -same downstream port but with a different slot! - -> --device cxl-type3,bus=swport00,volatile-memdev=vmem0,id=cxl-vmem0 \ -> --device cxl-type3,bus=swport01,volatile-memdev=vmem1,id=cxl-vmem1 \ -> --M -> -cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=64G,cxl-fmw.0.interleave-granularity=4k -> -\ -No error occurred when the VM started, but when I executed the 'cxl list' -command to view -the CXL objects info, the process cannot end properly. - -Then I used strace to trace the process, and I found that the process is in -an infinite loop: -# strace cxl list -......
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -write(3, "1\n\0", 3) = 3 -close(3) = 0 -access("/run/udev/queue", F_OK) = 0 - -[Environment]: -linux: V6.10-rc3 -QEMU: V9.0.0 -ndctl: v79 - -I know this is because of the wrong use of the QEMU command, but I think we -should -be aware of this error in one of the QEMU, OS or ndctl side at least. 
- -Thanks -Xingtao - -On Tue, 2 Jul 2024 00:30:06 +0000 -"Xingtao Yao (Fujitsu)" <yaoxt.fnst@fujitsu.com> wrote: - -> -Hi, all -> -> -When I did the cxl memory hot-plug test on QEMU, I accidentally connected -> -two memdev to the same downstream port, the command is like below: -> -> -> -object memory-backend-ram,size=262144k,share=on,id=vmem0 \ -> -> -object memory-backend-ram,size=262144k,share=on,id=vmem1 \ -> -> -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -> -> -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -> -> -device cxl-upstream,bus=root_port0,id=us0 \ -> -> -device cxl-downstream,port=0,bus=us0,id=swport00,chassis=0,slot=5 \ -> -> -device cxl-downstream,port=0,bus=us0,id=swport01,chassis=0,slot=7 \ -> -same downstream port but with a different slot! -> -> -> -device cxl-type3,bus=swport00,volatile-memdev=vmem0,id=cxl-vmem0 \ -> -> -device cxl-type3,bus=swport01,volatile-memdev=vmem1,id=cxl-vmem1 \ -> -> -M -> -> cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=64G,cxl-fmw.0.interleave-granularity=4k -> -> \ -> -> -No error occurred when the VM started, but when I executed the 'cxl list' -> -command to view -> -the CXL objects info, the process cannot end properly. -I'd be happy to look at a patch preventing this on the QEMU side if you send one, -but in general there are lots of ways to shoot yourself in the -foot with CXL and PCI device emulation in QEMU so I'm not going -to rush to solve this specific one. - -Likewise, some hardening in kernel / userspace probably makes sense but -this is a non-compliant switch so the priority of a fix is probably fairly low. - -Jonathan - -> -> -Then I used strace to trace the process, and I found that the process is in -> -an infinite loop: -> -# strace cxl list -> -......
-> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0 -> -openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3 -> -write(3, "1\n\0", 3) = 3 -> -close(3) = 0 -> -access("/run/udev/queue", F_OK) = 0 -> -> -[Environment]: -> -linux: V6.10-rc3 -> -QEMU: V9.0.0 -> -ndctl: v79 -> -> -I know this is because of the wrong use of the QEMU command, but I 
think we -> -should -be aware of this error in one of the QEMU, OS or ndctl side at least. -> -> -Thanks -> -Xingtao - diff --git a/results/classifier/02/other/25892827 b/results/classifier/02/other/25892827 deleted file mode 100644 index b33268d01..000000000 --- a/results/classifier/02/other/25892827 +++ /dev/null @@ -1,1078 +0,0 @@ -other: 0.892 -instruction: 0.842 -mistranslation: 0.842 -boot: 0.839 -semantic: 0.825 - -[Qemu-devel] [BUG/RFC] Two cpus are not brought up normally in SLES11 sp3 VM after reboot - -Hi, - -Recently we encountered a problem in our project: 2 CPUs in a VM are not brought -up normally after a reboot. - -Our host is using KVM kmod 3.6 and QEMU 2.1. -A SLES 11 sp3 VM is configured with 8 vcpus, -and the cpu model is configured with 'host-passthrough'. - -After the VM started up for the first time, everything seemed to be OK, -and then the VM panicked and rebooted. -After the reboot, only 6 cpus are brought up in the VM; cpu1 and cpu7 are not online. - -This is the only message we can get from the VM: -VM dmesg shows: -[ 0.069867] Booting Node 0, Processors #1 -[ 5.060042] CPU1: Stuck ?? -[ 5.060499] #2 -[ 5.088322] kvm-clock: cpu 2, msr 6:3fc90901, secondary cpu clock -[ 5.088335] KVM setup async PF for cpu 2 -[ 5.092967] NMI watchdog enabled, takes one hw-pmu counter. -[ 5.094405] #3 -[ 5.108324] kvm-clock: cpu 3, msr 6:3fcd0901, secondary cpu clock -[ 5.108333] KVM setup async PF for cpu 3 -[ 5.113553] NMI watchdog enabled, takes one hw-pmu counter. -[ 5.114970] #4 -[ 5.128325] kvm-clock: cpu 4, msr 6:3fd10901, secondary cpu clock -[ 5.128336] KVM setup async PF for cpu 4 -[ 5.134576] NMI watchdog enabled, takes one hw-pmu counter. -[ 5.135998] #5 -[ 5.152324] kvm-clock: cpu 5, msr 6:3fd50901, secondary cpu clock -[ 5.152334] KVM setup async PF for cpu 5 -[ 5.154764] NMI watchdog enabled, takes one hw-pmu counter.
-[ 5.156467] #6 -[ 5.172327] kvm-clock: cpu 6, msr 6:3fd90901, secondary cpu clock -[ 5.172341] KVM setup async PF for cpu 6 -[ 5.180738] NMI watchdog enabled, takes one hw-pmu counter. -[ 5.182173] #7 Ok. -[ 10.170815] CPU7: Stuck ?? -[ 10.171648] Brought up 6 CPUs -[ 10.172394] Total of 6 processors activated (28799.97 BogoMIPS). - -From host, we found that QEMU vcpu1 thread and vcpu7 thread were not consuming -any cpu (Should be in idle state), -All of VCPUs' stacks in host is like bellow: - -[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -[<ffffffff81468092>] system_call_fastpath+0x16/0x1b -[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -[<ffffffffffffffff>] 0xffffffffffffffff - -We looked into the kernel codes that could leading to the above 'Stuck' warning, -and found that the only possible is the emulation of 'cpuid' instruct in -kvm/qemu has something wrong. -But since we canât reproduce this problem, we are not quite sure. -Is there any possible that the cupid emulation in kvm/qemu has some bug ? - -Has anyone come across these problem before? Or any idea? 
- -Thanks, -zhanghailiang - -On 06/07/2015 09:54, zhanghailiang wrote: -> -> -From host, we found that QEMU vcpu1 thread and vcpu7 thread were not -> -consuming any cpu (Should be in idle state), -> -All of VCPUs' stacks in host is like bellow: -> -> -[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -> -[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -> -[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -> -[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -> -[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -> -[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -> -[<ffffffff81468092>] system_call_fastpath+0x16/0x1b -> -[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -> -[<ffffffffffffffff>] 0xffffffffffffffff -> -> -We looked into the kernel codes that could leading to the above 'Stuck' -> -warning, -> -and found that the only possible is the emulation of 'cpuid' instruct in -> -kvm/qemu has something wrong. -> -But since we canât reproduce this problem, we are not quite sure. -> -Is there any possible that the cupid emulation in kvm/qemu has some bug ? -Can you explain the relationship to the cpuid emulation? What do the -traces say about vcpus 1 and 7? 
- -Paolo - -On 2015/7/6 16:45, Paolo Bonzini wrote: -On 06/07/2015 09:54, zhanghailiang wrote: -From the host, we found that the QEMU vcpu1 thread and vcpu7 thread were not -consuming any cpu (they should be in the idle state), -and all of the VCPUs' stacks on the host look like below: - -[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -[<ffffffff81468092>] system_call_fastpath+0x16/0x1b -[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -[<ffffffffffffffff>] 0xffffffffffffffff - -We looked into the kernel code that could lead to the above 'Stuck' -warning, -and found that the only possibility is that the emulation of the 'cpuid' instruction in -kvm/qemu has something wrong. -But since we can't reproduce this problem, we are not quite sure. -Is it possible that the cpuid emulation in kvm/qemu has some bug? -Can you explain the relationship to the cpuid emulation? What do the -traces say about vcpus 1 and 7? -OK, we searched the VM's kernel code for the 'Stuck' message, and it is -located in -do_boot_cpu(). It is in the BSP context, and the call process is: -the BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() --> wakeup_secondary_via_INIT() to trigger the APs. -It will wait 5s for the APs to start up; if some AP does not start up normally, it will -print 'CPU%d Stuck' or 'CPU%d: Not responding'. - -If it prints 'Stuck', it means the AP has received the SIPI interrupt and -begun to execute the code -'ENTRY(trampoline_data)' (trampoline_64.S), but got stuck somewhere before -smp_callin() (smpboot.c). -The following is the startup process of the BSP and the APs.
-BSP: -start_kernel() - ->smp_init() - ->smp_boot_cpus() - ->do_boot_cpu() - ->start_ip = trampoline_address(); //set the address that the AP will go -to execute - ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU - ->for (timeout = 0; timeout < 50000; timeout++) - if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if the AP -started up or not - -APs: -ENTRY(trampoline_data) (trampoline_64.S) - ->ENTRY(secondary_startup_64) (head_64.S) - ->start_secondary() (smpboot.c) - ->cpu_init(); - ->smp_callin(); - ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if the AP comes -here, the BSP will not print the error message. - -From the above call process, we can be sure that the AP got stuck between -trampoline_data and the cpumask_set_cpu() in -smp_callin(). We looked through this code path carefully, and only found a -'hlt' instruction that could block the process. -It is located in trampoline_data(): - -ENTRY(trampoline_data) - ... - - call verify_cpu # Verify the cpu supports long mode - testl %eax, %eax # Check for return code - jnz no_longmode - - ... - -no_longmode: - hlt - jmp no_longmode - -In verify_cpu(), -we can only find the 'cpuid' sensitive instruction that could cause a VM exit from -non-root mode. -This is why we suspect that the cpuid emulation in KVM/QEMU is wrong, leading to -the failure in verify_cpu. - -From the message in the VM, we know something is wrong with vcpu1 and vcpu7. -[ 5.060042] CPU1: Stuck ?? -[ 10.170815] CPU7: Stuck ?? -[ 10.171648] Brought up 6 CPUs - -Besides, the following is the CPU info obtained from the host.
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command -instance-0000000 -* CPU #0: pc=0x00007f64160c683d thread_id=68570 - CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 - CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 - CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 - CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 - CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 - CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 - CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 - -Oh, I also forgot to mention in the above message that we have bound each vCPU -to a different physical CPU on the -host. - -Thanks, -zhanghailiang - -On 06/07/2015 11:59, zhanghailiang wrote: -> -> -> -Besides, the following is the CPU info obtained from the host. -80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh -> -qemu-monitor-command instance-0000000 -> -* CPU #0: pc=0x00007f64160c683d thread_id=68570 -> -CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 -> -CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 -> -CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 -> -CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 -> -CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 -> -CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 -> -CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 -> -> -Oh, I also forgot to mention in the above message that we have bound -> -each vCPU to a different physical CPU on the -> -host. -Can you capture a trace on the host (trace-cmd record -e kvm) and send -it privately? Please note which CPUs get stuck, since I guess it's not -always 1 and 7.
- -Paolo - -On Mon, 6 Jul 2015 17:59:10 +0800 -zhanghailiang <address@hidden> wrote: - -> -On 2015/7/6 16:45, Paolo Bonzini wrote: -> -> -> -> -> -> On 06/07/2015 09:54, zhanghailiang wrote: -> ->> -> ->> From host, we found that QEMU vcpu1 thread and vcpu7 thread were not -> ->> consuming any cpu (Should be in idle state), -> ->> All of VCPUs' stacks in host is like bellow: -> ->> -> ->> [<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -> ->> [<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -> ->> [<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -> ->> [<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -> ->> [<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -> ->> [<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -> ->> [<ffffffff81468092>] system_call_fastpath+0x16/0x1b -> ->> [<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -> ->> [<ffffffffffffffff>] 0xffffffffffffffff -> ->> -> ->> We looked into the kernel codes that could leading to the above 'Stuck' -> ->> warning, -in current upstream there isn't any printk(...Stuck...) left since that code -path -has been reworked. -I've often seen this on over-committed host during guest CPUs up/down torture -test. -Could you update guest kernel to upstream and see if issue reproduces? - -> ->> and found that the only possible is the emulation of 'cpuid' instruct in -> ->> kvm/qemu has something wrong. -> ->> But since we canât reproduce this problem, we are not quite sure. -> ->> Is there any possible that the cupid emulation in kvm/qemu has some bug ? -> -> -> -> Can you explain the relationship to the cpuid emulation? What do the -> -> traces say about vcpus 1 and 7? -> -> -OK, we searched the VM's kernel codes with the 'Stuck' message, and it is -> -located in -> -do_boot_cpu(). It's in BSP context, the call process is: -> -BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() -> --> wakeup_secondary_via_INIT() to trigger APs. 
-> -It will wait 5s for APs to startup, if some AP not startup normally, it will -> -print 'CPU%d Stuck' or 'CPU%d: Not responding'. -> -> -If it prints 'Stuck', it means the AP has received the SIPI interrupt and -> -begins to execute the code -> -'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places -> -before smp_callin()(smpboot.c). -> -The follow is the starup process of BSP and AP. -> -BSP: -> -start_kernel() -> -->smp_init() -> -->smp_boot_cpus() -> -->do_boot_cpu() -> -->start_ip = trampoline_address(); //set the address that AP will -> -go to execute -> -->wakeup_secondary_cpu_via_init(); // kick the secondary CPU -> -->for (timeout = 0; timeout < 50000; timeout++) -> -if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if -> -AP startup or not -> -> -APs: -> -ENTRY(trampoline_data) (trampoline_64.S) -> -->ENTRY(secondary_startup_64) (head_64.S) -> -->start_secondary() (smpboot.c) -> -->cpu_init(); -> -->smp_callin(); -> -->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP -> -comes here, the BSP will not prints the error message. -> -> -From above call process, we can be sure that, the AP has been stuck between -> -trampoline_data and the cpumask_set_cpu() in -> -smp_callin(), we look through these codes path carefully, and only found a -> -'hlt' instruct that could block the process. -> -It is located in trampoline_data(): -> -> -ENTRY(trampoline_data) -> -... -> -> -call verify_cpu # Verify the cpu supports long mode -> -testl %eax, %eax # Check for return code -> -jnz no_longmode -> -> -... -> -> -no_longmode: -> -hlt -> -jmp no_longmode -> -> -For the process verify_cpu(), -> -we can only find the 'cpuid' sensitive instruct that could lead VM exit from -> -No-root mode. -> -This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to -> -the fail in verify_cpu. -> -> -From the message in VM, we know vcpu1 and vcpu7 is something wrong. -> -[ 5.060042] CPU1: Stuck ?? -> -[ 10.170815] CPU7: Stuck ?? 
-> -[ 10.171648] Brought up 6 CPUs -> -> -Besides, the follow is the cpus message got from host. -> -80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh -> -qemu-monitor-command instance-0000000 -> -* CPU #0: pc=0x00007f64160c683d thread_id=68570 -> -CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 -> -CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 -> -CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 -> -CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 -> -CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 -> -CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 -> -CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 -> -> -Oh, i also forgot to mention in the above message that, we have bond each -> -vCPU to different physical CPU in -> -host. -> -> -Thanks, -> -zhanghailiang -> -> -> -> -> --- -> -To unsubscribe from this list: send the line "unsubscribe kvm" in -> -the body of a message to address@hidden -> -More majordomo info at -http://vger.kernel.org/majordomo-info.html - -On 2015/7/7 19:23, Igor Mammedov wrote: -On Mon, 6 Jul 2015 17:59:10 +0800 -zhanghailiang <address@hidden> wrote: -On 2015/7/6 16:45, Paolo Bonzini wrote: -On 06/07/2015 09:54, zhanghailiang wrote: -From host, we found that QEMU vcpu1 thread and vcpu7 thread were not -consuming any cpu (Should be in idle state), -All of VCPUs' stacks in host is like bellow: - -[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -[<ffffffff81468092>] system_call_fastpath+0x16/0x1b -[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -[<ffffffffffffffff>] 0xffffffffffffffff - -We looked into the kernel codes that could leading to the above 'Stuck' -warning, -in current upstream there isn't any printk(...Stuck...) 
left since that code -path -has been reworked. -I've often seen this on over-committed host during guest CPUs up/down torture -test. -Could you update guest kernel to upstream and see if issue reproduces? -Hmm, Unfortunately, it is very hard to reproduce, and we are still trying to -reproduce it. - -For your test case, is it a kernel bug? -Or is there any related patch could solve your test problem been merged into -upstream ? - -Thanks, -zhanghailiang -and found that the only possible is the emulation of 'cpuid' instruct in -kvm/qemu has something wrong. -But since we canât reproduce this problem, we are not quite sure. -Is there any possible that the cupid emulation in kvm/qemu has some bug ? -Can you explain the relationship to the cpuid emulation? What do the -traces say about vcpus 1 and 7? -OK, we searched the VM's kernel codes with the 'Stuck' message, and it is -located in -do_boot_cpu(). It's in BSP context, the call process is: -BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() --> wakeup_secondary_via_INIT() to trigger APs. -It will wait 5s for APs to startup, if some AP not startup normally, it will -print 'CPU%d Stuck' or 'CPU%d: Not responding'. - -If it prints 'Stuck', it means the AP has received the SIPI interrupt and -begins to execute the code -'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places before -smp_callin()(smpboot.c). -The follow is the starup process of BSP and AP. 
-BSP: -start_kernel() - ->smp_init() - ->smp_boot_cpus() - ->do_boot_cpu() - ->start_ip = trampoline_address(); //set the address that AP will -go to execute - ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU - ->for (timeout = 0; timeout < 50000; timeout++) - if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if -AP startup or not - -APs: -ENTRY(trampoline_data) (trampoline_64.S) - ->ENTRY(secondary_startup_64) (head_64.S) - ->start_secondary() (smpboot.c) - ->cpu_init(); - ->smp_callin(); - ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP -comes here, the BSP will not prints the error message. - - From above call process, we can be sure that, the AP has been stuck between -trampoline_data and the cpumask_set_cpu() in -smp_callin(), we look through these codes path carefully, and only found a -'hlt' instruct that could block the process. -It is located in trampoline_data(): - -ENTRY(trampoline_data) - ... - - call verify_cpu # Verify the cpu supports long mode - testl %eax, %eax # Check for return code - jnz no_longmode - - ... - -no_longmode: - hlt - jmp no_longmode - -For the process verify_cpu(), -we can only find the 'cpuid' sensitive instruct that could lead VM exit from -No-root mode. -This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to -the fail in verify_cpu. - - From the message in VM, we know vcpu1 and vcpu7 is something wrong. -[ 5.060042] CPU1: Stuck ?? -[ 10.170815] CPU7: Stuck ?? -[ 10.171648] Brought up 6 CPUs - -Besides, the follow is the cpus message got from host. 
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command -instance-0000000 -* CPU #0: pc=0x00007f64160c683d thread_id=68570 - CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 - CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 - CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 - CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 - CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 - CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 - CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 - -Oh, i also forgot to mention in the above message that, we have bond each vCPU -to different physical CPU in -host. - -Thanks, -zhanghailiang - - - - --- -To unsubscribe from this list: send the line "unsubscribe kvm" in -the body of a message to address@hidden -More majordomo info at -http://vger.kernel.org/majordomo-info.html -. - -On Tue, 7 Jul 2015 19:43:35 +0800 -zhanghailiang <address@hidden> wrote: - -> -On 2015/7/7 19:23, Igor Mammedov wrote: -> -> On Mon, 6 Jul 2015 17:59:10 +0800 -> -> zhanghailiang <address@hidden> wrote: -> -> -> ->> On 2015/7/6 16:45, Paolo Bonzini wrote: -> ->>> -> ->>> -> ->>> On 06/07/2015 09:54, zhanghailiang wrote: -> ->>>> -> ->>>> From host, we found that QEMU vcpu1 thread and vcpu7 thread were not -> ->>>> consuming any cpu (Should be in idle state), -> ->>>> All of VCPUs' stacks in host is like bellow: -> ->>>> -> ->>>> [<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -> ->>>> [<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -> ->>>> [<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -> ->>>> [<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -> ->>>> [<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -> ->>>> [<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -> ->>>> [<ffffffff81468092>] system_call_fastpath+0x16/0x1b -> ->>>> [<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -> ->>>> [<ffffffffffffffff>] 0xffffffffffffffff -> ->>>> -> ->>>> We looked into the kernel codes that could leading to the 
above 'Stuck' -> ->>>> warning, -> -> in current upstream there isn't any printk(...Stuck...) left since that -> -> code path -> -> has been reworked. -> -> I've often seen this on over-committed host during guest CPUs up/down -> -> torture test. -> -> Could you update guest kernel to upstream and see if issue reproduces? -> -> -> -> -Hmm, Unfortunately, it is very hard to reproduce, and we are still trying to -> -reproduce it. -> -> -For your test case, is it a kernel bug? -> -Or is there any related patch could solve your test problem been merged into -> -upstream ? -I don't remember all prerequisite patches but you should be able to find -http://marc.info/?l=linux-kernel&m=140326703108009&w=2 -"x86/smpboot: Initialize secondary CPU only if master CPU will wait for it" -and then look for dependencies. - - -> -> -Thanks, -> -zhanghailiang -> -> ->>>> and found that the only possible is the emulation of 'cpuid' instruct in -> ->>>> kvm/qemu has something wrong. -> ->>>> But since we canât reproduce this problem, we are not quite sure. -> ->>>> Is there any possible that the cupid emulation in kvm/qemu has some bug ? -> ->>> -> ->>> Can you explain the relationship to the cpuid emulation? What do the -> ->>> traces say about vcpus 1 and 7? -> ->> -> ->> OK, we searched the VM's kernel codes with the 'Stuck' message, and it is -> ->> located in -> ->> do_boot_cpu(). It's in BSP context, the call process is: -> ->> BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> -> ->> do_boot_cpu() -> wakeup_secondary_via_INIT() to trigger APs. -> ->> It will wait 5s for APs to startup, if some AP not startup normally, it -> ->> will print 'CPU%d Stuck' or 'CPU%d: Not responding'. -> ->> -> ->> If it prints 'Stuck', it means the AP has received the SIPI interrupt and -> ->> begins to execute the code -> ->> 'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places -> ->> before smp_callin()(smpboot.c). 
-> ->> The follow is the starup process of BSP and AP. -> ->> BSP: -> ->> start_kernel() -> ->> ->smp_init() -> ->> ->smp_boot_cpus() -> ->> ->do_boot_cpu() -> ->> ->start_ip = trampoline_address(); //set the address that AP -> ->> will go to execute -> ->> ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU -> ->> ->for (timeout = 0; timeout < 50000; timeout++) -> ->> if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// -> ->> check if AP startup or not -> ->> -> ->> APs: -> ->> ENTRY(trampoline_data) (trampoline_64.S) -> ->> ->ENTRY(secondary_startup_64) (head_64.S) -> ->> ->start_secondary() (smpboot.c) -> ->> ->cpu_init(); -> ->> ->smp_callin(); -> ->> ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP -> ->> comes here, the BSP will not prints the error message. -> ->> -> ->> From above call process, we can be sure that, the AP has been stuck -> ->> between trampoline_data and the cpumask_set_cpu() in -> ->> smp_callin(), we look through these codes path carefully, and only found a -> ->> 'hlt' instruct that could block the process. -> ->> It is located in trampoline_data(): -> ->> -> ->> ENTRY(trampoline_data) -> ->> ... -> ->> -> ->> call verify_cpu # Verify the cpu supports long mode -> ->> testl %eax, %eax # Check for return code -> ->> jnz no_longmode -> ->> -> ->> ... -> ->> -> ->> no_longmode: -> ->> hlt -> ->> jmp no_longmode -> ->> -> ->> For the process verify_cpu(), -> ->> we can only find the 'cpuid' sensitive instruct that could lead VM exit -> ->> from No-root mode. -> ->> This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading -> ->> to the fail in verify_cpu. -> ->> -> ->> From the message in VM, we know vcpu1 and vcpu7 is something wrong. -> ->> [ 5.060042] CPU1: Stuck ?? -> ->> [ 10.170815] CPU7: Stuck ?? -> ->> [ 10.171648] Brought up 6 CPUs -> ->> -> ->> Besides, the follow is the cpus message got from host. 
-> ->> 80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh -> ->> qemu-monitor-command instance-0000000 -> ->> * CPU #0: pc=0x00007f64160c683d thread_id=68570 -> ->> CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 -> ->> CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 -> ->> CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 -> ->> CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 -> ->> CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 -> ->> CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 -> ->> CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 -> ->> -> ->> Oh, i also forgot to mention in the above message that, we have bond each -> ->> vCPU to different physical CPU in -> ->> host. -> ->> -> ->> Thanks, -> ->> zhanghailiang -> ->> -> ->> -> ->> -> ->> -> ->> -- -> ->> To unsubscribe from this list: send the line "unsubscribe kvm" in -> ->> the body of a message to address@hidden -> ->> More majordomo info at -http://vger.kernel.org/majordomo-info.html -> -> -> -> -> -> . 
-> -> -> -> -> - -On 2015/7/7 20:21, Igor Mammedov wrote: -On Tue, 7 Jul 2015 19:43:35 +0800 -zhanghailiang <address@hidden> wrote: -On 2015/7/7 19:23, Igor Mammedov wrote: -On Mon, 6 Jul 2015 17:59:10 +0800 -zhanghailiang <address@hidden> wrote: -On 2015/7/6 16:45, Paolo Bonzini wrote: -On 06/07/2015 09:54, zhanghailiang wrote: -From host, we found that QEMU vcpu1 thread and vcpu7 thread were not -consuming any cpu (Should be in idle state), -All of VCPUs' stacks in host is like bellow: - -[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm] -[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm] -[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] -[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] -[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0 -[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0 -[<ffffffff81468092>] system_call_fastpath+0x16/0x1b -[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7 -[<ffffffffffffffff>] 0xffffffffffffffff - -We looked into the kernel codes that could leading to the above 'Stuck' -warning, -in current upstream there isn't any printk(...Stuck...) left since that code -path -has been reworked. -I've often seen this on over-committed host during guest CPUs up/down torture -test. -Could you update guest kernel to upstream and see if issue reproduces? -Hmm, Unfortunately, it is very hard to reproduce, and we are still trying to -reproduce it. - -For your test case, is it a kernel bug? -Or is there any related patch could solve your test problem been merged into -upstream ? -I don't remember all prerequisite patches but you should be able to find -http://marc.info/?l=linux-kernel&m=140326703108009&w=2 -"x86/smpboot: Initialize secondary CPU only if master CPU will wait for it" -and then look for dependencies. -Er, we have investigated this patch, and it is not related to our problem, :) - -Thanks. -Thanks, -zhanghailiang -and found that the only possible is the emulation of 'cpuid' instruct in -kvm/qemu has something wrong. 
-But since we can't reproduce this problem, we are not quite sure.
-Is it possible that the cpuid emulation in kvm/qemu has some bug?
-Can you explain the relationship to the cpuid emulation? What do the
-traces say about vcpus 1 and 7?
-OK, we searched the VM's kernel codes for the 'Stuck' message, and it is
-located in
-do_boot_cpu(). It's in BSP context, the call process is:
-BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu()
--> wakeup_secondary_via_INIT() to trigger APs.
-It will wait 5s for APs to start up; if some AP does not start up normally, it will
-print 'CPU%d Stuck' or 'CPU%d: Not responding'.
-
-If it prints 'Stuck', it means the AP has received the SIPI interrupt and
-begins to execute the code at
-'ENTRY(trampoline_data)' (trampoline_64.S), but gets stuck somewhere before
-smp_callin() (smpboot.c).
-The following is the startup process of the BSP and the APs.
-BSP:
-start_kernel()
- ->smp_init()
- ->smp_boot_cpus()
- ->do_boot_cpu()
- ->start_ip = trampoline_address(); //set the address that AP will
-go to execute
- ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
- ->for (timeout = 0; timeout < 50000; timeout++)
- if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if
-AP startup or not
-
-APs:
-ENTRY(trampoline_data) (trampoline_64.S)
- ->ENTRY(secondary_startup_64) (head_64.S)
- ->start_secondary() (smpboot.c)
- ->cpu_init();
- ->smp_callin();
- ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP
-comes here, the BSP will not print the error message.
-
- From the above call process, we can be sure that the AP has been stuck between
-trampoline_data and the cpumask_set_cpu() in
-smp_callin(); we looked through this code path carefully, and only found a
-'hlt' instruction that could block the process.
-It is located in trampoline_data():
-
-ENTRY(trampoline_data)
- ...
-
- call verify_cpu # Verify the cpu supports long mode
- testl %eax, %eax # Check for return code
- jnz no_longmode
-
- ...
- -no_longmode: - hlt - jmp no_longmode - -For the process verify_cpu(), -we can only find the 'cpuid' sensitive instruct that could lead VM exit from -No-root mode. -This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to -the fail in verify_cpu. - - From the message in VM, we know vcpu1 and vcpu7 is something wrong. -[ 5.060042] CPU1: Stuck ?? -[ 10.170815] CPU7: Stuck ?? -[ 10.171648] Brought up 6 CPUs - -Besides, the follow is the cpus message got from host. -80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command -instance-0000000 -* CPU #0: pc=0x00007f64160c683d thread_id=68570 - CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573 - CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575 - CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576 - CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577 - CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578 - CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583 - CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584 - -Oh, i also forgot to mention in the above message that, we have bond each vCPU -to different physical CPU in -host. - -Thanks, -zhanghailiang - - - - --- -To unsubscribe from this list: send the line "unsubscribe kvm" in -the body of a message to address@hidden -More majordomo info at -http://vger.kernel.org/majordomo-info.html -. -. - diff --git a/results/classifier/02/other/31349848 b/results/classifier/02/other/31349848 deleted file mode 100644 index ef66802a5..000000000 --- a/results/classifier/02/other/31349848 +++ /dev/null @@ -1,155 +0,0 @@ -other: 0.901 -semantic: 0.846 -instruction: 0.845 -boot: 0.815 -mistranslation: 0.781 - -[Qemu-devel] [BUG] qemu stuck when detach host-usb device - -Description of problem: -The guest has a host-usb device(Kingston Technology DataTraveler 100 G3/G4/SE9 -G2), which is attached -to xhci controller(on host). Qemu will stuck if I detach it from guest. - -How reproducible: -100% - -Steps to Reproduce: -1. 
Use a usb stick to copy files in the guest, making it busy.
-2. virsh detach-device vm_name usb.xml
-
-Then qemu will be stuck for 20s; I found this is because libusb_release_interface
-blocks for 20s.
-Dmesg prints:
-
-[35442.034861] usb 4-2.1: Disable of device-initiated U1 failed.
-[35447.034993] usb 4-2.1: Disable of device-initiated U2 failed.
-[35452.035131] usb 4-2.1: Set SEL for device-initiated U1 failed.
-[35457.035259] usb 4-2.1: Set SEL for device-initiated U2 failed.
-
-Is this a hardware error or a software bug?
-
-On Tue, Nov 27, 2018 at 01:26:24AM +0000, linzhecheng wrote:
->
-Description of problem:
->
-The guest has a host-usb device (Kingston Technology DataTraveler 100
->
-G3/G4/SE9 G2), which is attached
->
-to the xhci controller (on the host). Qemu will be stuck if I detach it from the guest.
->
->
-How reproducible:
->
-100%
->
->
-Steps to Reproduce:
->
-1. Use a usb stick to copy files in the guest, making it busy.
->
-2. virsh detach-device vm_name usb.xml
->
->
-Then qemu will be stuck for 20s; I found this is because
->
-libusb_release_interface blocks for 20s.
->
-Dmesg prints:
->
->
-[35442.034861] usb 4-2.1: Disable of device-initiated U1 failed.
->
-[35447.034993] usb 4-2.1: Disable of device-initiated U2 failed.
->
-[35452.035131] usb 4-2.1: Set SEL for device-initiated U1 failed.
->
-[35457.035259] usb 4-2.1: Set SEL for device-initiated U2 failed.
->
->
-Is this a hardware error or a software bug?
-I'd guess software error, could be in libusb or the (host) linux kernel.
-Cc'ing libusb-devel.
- -cheers, - Gerd - -> ------Original Message----- -> -From: Gerd Hoffmann [ -mailto:address@hidden -> -Sent: Tuesday, November 27, 2018 2:09 PM -> -To: linzhecheng <address@hidden> -> -Cc: address@hidden; wangxin (U) <address@hidden>; -> -Zhoujian (jay) <address@hidden>; address@hidden -> -Subject: Re: [Qemu-devel] [BUG] qemu stuck when detach host-usb device -> -> -On Tue, Nov 27, 2018 at 01:26:24AM +0000, linzhecheng wrote: -> -> Description of problem: -> -> The guest has a host-usb device(Kingston Technology DataTraveler 100 -> -> G3/G4/SE9 G2), which is attached to xhci controller(on host). Qemu will -> -> stuck -> -if I detach it from guest. -> -> -> -> How reproducible: -> -> 100% -> -> -> -> Steps to Reproduce: -> -> 1. Use usb stick to copy files in guest , make it busy working. -> -> 2. virsh detach-device vm_name usb.xml -> -> -> -> Then qemu will stuck for 20s, I found this is because -> -> libusb_release_interface -> -block for 20s. -> -> Dmesg prints: -> -> -> -> [35442.034861] usb 4-2.1: Disable of device-initiated U1 failed. -> -> [35447.034993] usb 4-2.1: Disable of device-initiated U2 failed. -> -> [35452.035131] usb 4-2.1: Set SEL for device-initiated U1 failed. -> -> [35457.035259] usb 4-2.1: Set SEL for device-initiated U2 failed. -> -> -> -> Is this a hardware error or software's bug? -> -> -I'd guess software error, could be is libusb or (host) linux kernel. -> -Cc'ing libusb-devel. -Perhaps it's usb driver's bug. Could you also reproduce it? 
-> -> -cheers, -> -Gerd - diff --git a/results/classifier/02/other/32484936 b/results/classifier/02/other/32484936 deleted file mode 100644 index 204f2e264..000000000 --- a/results/classifier/02/other/32484936 +++ /dev/null @@ -1,224 +0,0 @@ -other: 0.856 -semantic: 0.832 -instruction: 0.829 -boot: 0.810 -mistranslation: 0.794 - -[Qemu-devel] [Snapshot Bug?]Qcow2 meta data corruption - -Hi all, -There was a problem about qcow2 image file happened in my serval vms and I could not figure it out, -so have to ask for some help. -Here is the thing: -At first, I found there were some data corruption in a vm, so I did qemu-img check to all my vms. -parts of check report: -3-Leaked cluster 2926229 refcount=1 reference=0 -4-Leaked cluster 3021181 refcount=1 reference=0 -5-Leaked cluster 3021182 refcount=1 reference=0 -6-Leaked cluster 3021183 refcount=1 reference=0 -7-Leaked cluster 3021184 refcount=1 reference=0 -8-ERROR cluster 3102547 refcount=3 reference=4 -9-ERROR cluster 3111536 refcount=3 reference=4 -10-ERROR cluster 3113369 refcount=3 reference=4 -11-ERROR cluster 3235590 refcount=10 reference=11 -12-ERROR cluster 3235591 refcount=10 reference=11 -423-Warning: cluster offset=0xc000c00020000 is after the end of the image file, can't properly check refcounts. -424-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -425-Warning: cluster offset=0xc0001000c0000 is after the end of the image file, can't properly check refcounts. -426-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -427-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -428-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -429-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. 
-430-Warning: cluster offset=0xc000c00010000 is after the end of the image file, can't properly check refcounts.
-After a further look, I found two l2 entries pointing to the same cluster, and that was found in several qcow2 image files of different vms.
-Like this:
-table entry conflict (with our qcow2 check
-tool):
-a table offset : 0x00000093f7080000 level : 2, l1 table entry 100, l2 table entry 7
-b table offset : 0x00000093f7080000 level : 2, l1 table entry 5, l2 table entry 7
-table entry conflict :
-a table offset : 0x00000000a01e0000 level : 2, l1 table entry 100, l2 table entry 19
-b table offset : 0x00000000a01e0000 level : 2, l1 table entry 5, l2 table entry 19
-table entry conflict :
-a table offset : 0x00000000a01d0000 level : 2, l1 table entry 100, l2 table entry 18
-b table offset : 0x00000000a01d0000 level : 2, l1 table entry 5, l2 table entry 18
-table entry conflict :
-a table offset : 0x00000000a01c0000 level : 2, l1 table entry 100, l2 table entry 17
-b table offset : 0x00000000a01c0000 level : 2, l1 table entry 5, l2 table entry 17
-table entry conflict :
-a table offset : 0x00000000a01b0000 level : 2, l1 table entry 100, l2 table entry 16
-b table offset : 0x00000000a01b0000 level : 2, l1 table entry 5, l2 table entry 16
-I think the problem is related to snapshot create/delete, but I can't reproduce it.
-Can anyone give a hint about how this happens?
-Qemu version 2.0.1; I downloaded the source code and installed it with make install.
-Qemu parameters: -/usr/bin/kvm -chardev socket,id=qmp,path=/var/run/qemu-server/5855899639838.qmp,server,nowait -mon chardev=qmp,mode=control -vnc :0,websocket,to=200 -enable-kvm -pidfile /var/run/qemu-server/5855899639838.pid -daemonize -name yfMailSvr-200.200.0.14 -smp sockets=1,cores=4 -cpu core2duo,hv_spinlocks=0xffff,hv_relaxed,hv_time,hv_vapic,+sse4.1,+sse4.2,+x2apic,+erms,+smep,+fsgsbase,+f16c,+dca,+pcid,+pdcm,+xtpr,+ht,+ss,+acpi,+ds -nodefaults -vga cirrus -k en-us -boot menu=on,splash-time=8000 -m 8192 -usb -drive if=none,id=drive-ide0,media=cdrom,aio=native -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0 -drive file=/sf/data/36b82a720d3a278001ba904e80c20c13e_ecf4bbbf3e94/images/host-ecf4bbbf3e94/784f3f08532a/yfMailSvr-200.200.0.14.vm/vm-disk-1.qcow2,if=none,id=drive-virtio1,cache=none,aio=native -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -drive file=/sf/data/36b82a720d3a278001ba904e80c20c13e_ecf4bbbf3e94/images/host-ecf4bbbf3e94/784f3f08532a/yfMailSvr-200.200.0.14.vm/vm-disk-2.qcow2,if=none,id=drive-virtio2,cache=none,aio=native -device virtio-blk-pci,drive=drive-virtio2,id=virtio2,bus=pci.0,addr=0xc,bootindex=101 -netdev type=tap,id=net0,ifname=585589963983800,script=/sf/etc/kvm/vtp-bridge,vhost=on,vhostforce=on -device virtio-net-pci,romfile=,mac=FE:FC:FE:F0:AB:BA,netdev=net0,bus=pci.0,addr=0x12,id=net0 -rtc driftfix=slew,clock=rt,base=localtime -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -Thanks -Sangfor VT. -leijian - -Hi all, -There was a problem about qcow2 image file happened in my serval vms and I could not figure it out, -so have to ask for some help. -Here is the thing: -At first, I found there were some data corruption in a vm, so I did qemu-img check to all my vms. 
-parts of check report: -3-Leaked cluster 2926229 refcount=1 reference=0 -4-Leaked cluster 3021181 refcount=1 reference=0 -5-Leaked cluster 3021182 refcount=1 reference=0 -6-Leaked cluster 3021183 refcount=1 reference=0 -7-Leaked cluster 3021184 refcount=1 reference=0 -8-ERROR cluster 3102547 refcount=3 reference=4 -9-ERROR cluster 3111536 refcount=3 reference=4 -10-ERROR cluster 3113369 refcount=3 reference=4 -11-ERROR cluster 3235590 refcount=10 reference=11 -12-ERROR cluster 3235591 refcount=10 reference=11 -423-Warning: cluster offset=0xc000c00020000 is after the end of the image file, can't properly check refcounts. -424-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -425-Warning: cluster offset=0xc0001000c0000 is after the end of the image file, can't properly check refcounts. -426-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -427-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -428-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -429-Warning: cluster offset=0xc000c000c0000 is after the end of the image file, can't properly check refcounts. -430-Warning: cluster offset=0xc000c00010000 is after the end of the image file, can't properly check refcounts. -After a futher look in, I found two l2 entries point to the same cluster, and that was found in serval qcow2 image files of different vms. 
-Like this: -table entry conflict (with our qcow2 check -tool): -a table offset : 0x00000093f7080000 level : 2, l1 table entry 100, l2 table entry 7 -b table offset : 0x00000093f7080000 level : 2, l1 table entry 5, l2 table entry 7 -table entry conflict : -a table offset : 0x00000000a01e0000 level : 2, l1 table entry 100, l2 table entry 19 -b table offset : 0x00000000a01e0000 level : 2, l1 table entry 5, l2 table entry 19 -table entry conflict : -a table offset : 0x00000000a01d0000 level : 2, l1 table entry 100, l2 table entry 18 -b table offset : 0x00000000a01d0000 level : 2, l1 table entry 5, l2 table entry 18 -table entry conflict : -a table offset : 0x00000000a01c0000 level : 2, l1 table entry 100, l2 table entry 17 -b table offset : 0x00000000a01c0000 level : 2, l1 table entry 5, l2 table entry 17 -table entry conflict : -a table offset : 0x00000000a01b0000 level : 2, l1 table entry 100, l2 table entry 16 -b table offset : 0x00000000a01b0000 level : 2, l1 table entry 5, l2 table entry 16 -I think the problem is relate to the snapshot create, delete. But I cant reproduce it . -Can Anyone give a hint about how this happen? -Qemu version 2.0.1, I download the source code and make install it. 
-Qemu parameters: -/usr/bin/kvm -chardev socket,id=qmp,path=/var/run/qemu-server/5855899639838.qmp,server,nowait -mon chardev=qmp,mode=control -vnc :0,websocket,to=200 -enable-kvm -pidfile /var/run/qemu-server/5855899639838.pid -daemonize -name yfMailSvr-200.200.0.14 -smp sockets=1,cores=4 -cpu core2duo,hv_spinlocks=0xffff,hv_relaxed,hv_time,hv_vapic,+sse4.1,+sse4.2,+x2apic,+erms,+smep,+fsgsbase,+f16c,+dca,+pcid,+pdcm,+xtpr,+ht,+ss,+acpi,+ds -nodefaults -vga cirrus -k en-us -boot menu=on,splash-time=8000 -m 8192 -usb -drive if=none,id=drive-ide0,media=cdrom,aio=native -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0 -drive file=/sf/data/36b82a720d3a278001ba904e80c20c13e_ecf4bbbf3e94/images/host-ecf4bbbf3e94/784f3f08532a/yfMailSvr-200.200.0.14.vm/vm-disk-1.qcow2,if=none,id=drive-virtio1,cache=none,aio=native -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -drive file=/sf/data/36b82a720d3a278001ba904e80c20c13e_ecf4bbbf3e94/images/host-ecf4bbbf3e94/784f3f08532a/yfMailSvr-200.200.0.14.vm/vm-disk-2.qcow2,if=none,id=drive-virtio2,cache=none,aio=native -device virtio-blk-pci,drive=drive-virtio2,id=virtio2,bus=pci.0,addr=0xc,bootindex=101 -netdev type=tap,id=net0,ifname=585589963983800,script=/sf/etc/kvm/vtp-bridge,vhost=on,vhostforce=on -device virtio-net-pci,romfile=,mac=FE:FC:FE:F0:AB:BA,netdev=net0,bus=pci.0,addr=0x12,id=net0 -rtc driftfix=slew,clock=rt,base=localtime -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -Thanks -Sangfor VT. -leijian - -Am 03.04.2015 um 12:04 hat leijian geschrieben: -> -Hi all, -> -> -There was a problem about qcow2 image file happened in my serval vms and I -> -could not figure it out, -> -so have to ask for some help. -> -[...] -> -I think the problem is relate to the snapshot create, delete. But I cant -> -reproduce it . -> -Can Anyone give a hint about how this happen? -How did you create/delete your snapshots? 
- -More specifically, did you take care to never access your image from -more than one process (except if both are read-only)? It happens -occasionally that people use 'qemu-img snapshot' while the VM is -running. This is wrong and can corrupt the image. - -Kevin - -On 04/07/2015 03:33 AM, Kevin Wolf wrote: -> -More specifically, did you take care to never access your image from -> -more than one process (except if both are read-only)? It happens -> -occasionally that people use 'qemu-img snapshot' while the VM is -> -running. This is wrong and can corrupt the image. -Since this has been done by more than one person, I'm wondering if there -is something we can do in the qcow2 format itself to make it harder for -the casual user to cause corruption. Maybe if we declare some bit or -extension header for an image open for writing, which other readers can -use as a warning ("this image is being actively modified; reading it may -fail"), and other writers can use to deny access ("another process is -already modifying this image"), where a writer should set that bit -before writing anything else in the file, then clear it on exit. Of -course, you'd need a way to override the bit to actively clear it to -recover from the case of a writer dying unexpectedly without resetting -it normally. And it won't help the case of a reader opening the file -first, followed by a writer, where the reader could still get thrown off -track. - -Or maybe we could document in the qcow2 format that all readers and -writers should attempt to obtain the appropriate flock() permissions [or -other appropriate advisory locking scheme] over the file header, so that -cooperating processes that both use advisory locking will know when the -file is in use by another process. 
- --- -Eric Blake eblake redhat com +1-919-301-3266 -Libvirt virtualization library -http://libvirt.org -signature.asc -Description: -OpenPGP digital signature - - -I created/deleted the snapshot by using qmp command "snapshot_blkdev_internal"/"snapshot_delete_blkdev_internal", and for avoiding the case you mentioned above, I have added the flock() permission in the qemu_open(). -Here is the test of doing qemu-img snapshot to a running vm: -Diskfile:/sf/data/36c81f660e38b3b001b183da50b477d89_f8bc123b3e74/images/host-f8bc123b3e74/4a8d8728fcdc/Devried30030.vm/vm-disk-1.qcow2 is used! errno=Resource temporarily unavailable -Does the two cluster entry happen to be the same because of the refcount of using cluster decrease to 0 unexpectedly and is allocated again? -If it was not accessing the image from more than one process, any other exceptions I can test for? -Thanks -leijian -From: -Eric Blake -Date: -2015-04-07 23:27 -To: -Kevin Wolf -; -leijian -CC: -qemu-devel -; -stefanha -Subject: -Re: [Qemu-devel] [Snapshot Bug?]Qcow2 meta data -corruption -On 04/07/2015 03:33 AM, Kevin Wolf wrote: -> More specifically, did you take care to never access your image from -> more than one process (except if both are read-only)? It happens -> occasionally that people use 'qemu-img snapshot' while the VM is -> running. This is wrong and can corrupt the image. -Since this has been done by more than one person, I'm wondering if there -is something we can do in the qcow2 format itself to make it harder for -the casual user to cause corruption. Maybe if we declare some bit or -extension header for an image open for writing, which other readers can -use as a warning ("this image is being actively modified; reading it may -fail"), and other writers can use to deny access ("another process is -already modifying this image"), where a writer should set that bit -before writing anything else in the file, then clear it on exit. 
Of -course, you'd need a way to override the bit to actively clear it to -recover from the case of a writer dying unexpectedly without resetting -it normally. And it won't help the case of a reader opening the file -first, followed by a writer, where the reader could still get thrown off -track. -Or maybe we could document in the qcow2 format that all readers and -writers should attempt to obtain the appropriate flock() permissions [or -other appropriate advisory locking scheme] over the file header, so that -cooperating processes that both use advisory locking will know when the -file is in use by another process. --- -Eric Blake eblake redhat com +1-919-301-3266 -Libvirt virtualization library http://libvirt.org - diff --git a/results/classifier/02/other/35170175 b/results/classifier/02/other/35170175 deleted file mode 100644 index 113c832b6..000000000 --- a/results/classifier/02/other/35170175 +++ /dev/null @@ -1,522 +0,0 @@ -other: 0.933 -instruction: 0.812 -semantic: 0.798 -boot: 0.719 -mistranslation: 0.641 - -[Qemu-devel] [BUG] QEMU crashes with dpdk virtio pmd - -Qemu crashes, with pre-condition: -vm xml config with multiqueue, and the vm's driver virtio-net support -multi-queue - -reproduce steps: -i. start dpdk testpmd in VM with the virtio nic -ii. stop testpmd -iii. reboot the VM - -This commit "f9d6dbf0 remove virtio queues if the guest doesn't support -multiqueue" is introduced. 
- -Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a) -VM DPDK version: DPDK-1.6.1 - -Call Trace: -#0 0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6 -#1 0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6 -#2 0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6 -#3 0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6 -#4 0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0 -#5 0x00007f6088fea32c in iter_remove_or_steal () from -/usr/lib64/libglib-2.0.so.0 -#6 0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at -qom/object.c:410 -#7 object_finalize (data=0x7f6091e74800) at qom/object.c:467 -#8 object_unref (address@hidden) at qom/object.c:903 -#9 0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at -git/qemu/exec.c:1154 -#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163 -#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514 -#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at -util/rcu.c:272 -#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6 - -Call Trace: -#0 0x00007fdccaeb9790 in ?? 
()
-#1 0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at
-qom/object.c:405
-#2 object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467
-#3 object_unref (address@hidden) at qom/object.c:903
-#4 0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at
-git/qemu/exec.c:1154
-#5 phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163
-#6 address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514
-#7 0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at
-util/rcu.c:272
-#8 0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#9 0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6
-
-The q->tx_bh is freed in the virtio_net_del_queue() function when the virtio
-queues are removed
-because the guest doesn't support multiqueue. But it might still be referenced by
-others (e.g. virtio_net_set_status()),
-so the pointers need to be set to NULL.
-
-diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
-index 7d091c9..98bd683 100644
---- a/hw/net/virtio-net.c
-+++ b/hw/net/virtio-net.c
-@@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n, int index)
- if (q->tx_timer) {
- timer_del(q->tx_timer);
- timer_free(q->tx_timer);
-+ q->tx_timer = NULL;
- } else {
- qemu_bh_delete(q->tx_bh);
-+ q->tx_bh = NULL;
- }
-+ q->tx_waiting = 0;
- virtio_del_queue(vdev, index * 2 + 1);
- }
-
-From: wangyunjian
-Sent: Monday, April 24, 2017 6:10 PM
-To: address@hidden; Michael S. Tsirkin <address@hidden>; 'Jason Wang'
-<address@hidden>
-Cc: wangyunjian <address@hidden>; caihe <address@hidden>
-Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd
-
-Qemu crashes, with pre-condition:
-vm xml config with multiqueue, and the vm's virtio-net driver supports
-multi-queue
-
-reproduce steps:
-i. start dpdk testpmd in VM with the virtio nic
-ii. stop testpmd
-iii. reboot the VM
-
-The problem was introduced by commit "f9d6dbf0 remove virtio queues if the
-guest doesn't support multiqueue".
- -Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a) -VM DPDK version:  DPDK-1.6.1 - -Call Trace: -#0 0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6 -#1 0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6 -#2 0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6 -#3 0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6 -#4 0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0 -#5 0x00007f6088fea32c in iter_remove_or_steal () from -/usr/lib64/libglib-2.0.so.0 -#6 0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at -qom/object.c:410 -#7 object_finalize (data=0x7f6091e74800) at qom/object.c:467 -#8 object_unref (address@hidden) at qom/object.c:903 -#9 0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at -git/qemu/exec.c:1154 -#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163 -#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514 -#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at -util/rcu.c:272 -#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6 - -Call Trace: -#0 0x00007fdccaeb9790 in ?? 
() -#1 0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at -qom/object.c:405 -#2 object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467 -#3 object_unref (address@hidden) at qom/object.c:903 -#4 0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at -git/qemu/exec.c:1154 -#5 phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163 -#6 address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514 -#7 0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at -util/rcu.c:272 -#8 0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -#9 0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6 - -On 2017å¹´04æ25æ¥ 19:37, wangyunjian wrote: -The q->tx_bh will free in virtio_net_del_queue() function, when remove virtio -queues -if the guest doesn't support multiqueue. But it might be still referenced by -others (eg . virtio_net_set_status()), -which need so set NULL. - -diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c -index 7d091c9..98bd683 100644 ---- a/hw/net/virtio-net.c -+++ b/hw/net/virtio-net.c -@@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n, int index) - if (q->tx_timer) { - timer_del(q->tx_timer); - timer_free(q->tx_timer); -+ q->tx_timer = NULL; - } else { - qemu_bh_delete(q->tx_bh); -+ q->tx_bh = NULL; - } -+ q->tx_waiting = 0; - virtio_del_queue(vdev, index * 2 + 1); - } -Thanks a lot for the fix. - -Two questions: -- If virtio_net_set_status() is the only function that may access tx_bh, -it looks like setting tx_waiting to zero is sufficient? -- Can you post a formal patch for this? - -Thanks -From: wangyunjian -Sent: Monday, April 24, 2017 6:10 PM -To: address@hidden; Michael S. 
Tsirkin <address@hidden>; 'Jason Wang' -<address@hidden> -Cc: wangyunjian <address@hidden>; caihe <address@hidden> -Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd - -Qemu crashes, with pre-condition: -vm xml config with multiqueue, and the vm's driver virtio-net support -multi-queue - -reproduce steps: -i. start dpdk testpmd in VM with the virtio nic -ii. stop testpmd -iii. reboot the VM - -This commit "f9d6dbf0 remove virtio queues if the guest doesn't support -multiqueue" is introduced. - -Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a) -VM DPDK version: DPDK-1.6.1 - -Call Trace: -#0 0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6 -#1 0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6 -#2 0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6 -#3 0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6 -#4 0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0 -#5 0x00007f6088fea32c in iter_remove_or_steal () from -/usr/lib64/libglib-2.0.so.0 -#6 0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at -qom/object.c:410 -#7 object_finalize (data=0x7f6091e74800) at qom/object.c:467 -#8 object_unref (address@hidden) at qom/object.c:903 -#9 0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at -git/qemu/exec.c:1154 -#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163 -#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514 -#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at -util/rcu.c:272 -#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6 - -Call Trace: -#0 0x00007fdccaeb9790 in ?? 
() -#1 0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at -qom/object.c:405 -#2 object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467 -#3 object_unref (address@hidden) at qom/object.c:903 -#4 0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at -git/qemu/exec.c:1154 -#5 phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163 -#6 address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514 -#7 0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at -util/rcu.c:272 -#8 0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -#9 0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6 - -CCing Paolo and Stefan, since it has a relationship with bh in Qemu. - -> ------Original Message----- -> -From: Jason Wang [ -mailto:address@hidden -> -> -> -On 2017å¹´04æ25æ¥ 19:37, wangyunjian wrote: -> -> The q->tx_bh will free in virtio_net_del_queue() function, when remove -> -> virtio -> -queues -> -> if the guest doesn't support multiqueue. But it might be still referenced by -> -others (eg . virtio_net_set_status()), -> -> which need so set NULL. -> -> -> -> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c -> -> index 7d091c9..98bd683 100644 -> -> --- a/hw/net/virtio-net.c -> -> +++ b/hw/net/virtio-net.c -> -> @@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n, -> -int index) -> -> if (q->tx_timer) { -> -> timer_del(q->tx_timer); -> -> timer_free(q->tx_timer); -> -> + q->tx_timer = NULL; -> -> } else { -> -> qemu_bh_delete(q->tx_bh); -> -> + q->tx_bh = NULL; -> -> } -> -> + q->tx_waiting = 0; -> -> virtio_del_queue(vdev, index * 2 + 1); -> -> } -> -> -Thanks a lot for the fix. -> -> -Two questions: -> -> -- If virtio_net_set_status() is the only function that may access tx_bh, -> -it looks like setting tx_waiting to zero is sufficient? 
-Currently yes, but we don't guarantee that it works for all scenarios, so
-we set tx_bh and tx_timer to NULL to avoid possibly accessing a wild pointer,
-which is the common method for bh usage in QEMU.
-
-I have another question about the root cause of this issue.
-
-The trace below is the path that sets tx_waiting to one in
-virtio_net_handle_tx_bh():
-
-Breakpoint 1, virtio_net_handle_tx_bh (vdev=0x0, vq=0x7f335ad13900) at
-/data/wyj/git/qemu/hw/net/virtio-net.c:1398
-1398 {
-(gdb) bt
-#0 virtio_net_handle_tx_bh (vdev=0x0, vq=0x7f335ad13900) at
-/data/wyj/git/qemu/hw/net/virtio-net.c:1398
-#1 0x00007f3357bddf9c in virtio_bus_set_host_notifier (bus=<optimized out>,
-address@hidden, address@hidden) at hw/virtio/virtio-bus.c:297
-#2 0x00007f3357a0055d in vhost_dev_disable_notifiers (address@hidden,
-address@hidden) at /data/wyj/git/qemu/hw/virtio/vhost.c:1422
-#3 0x00007f33579e3373 in vhost_net_stop_one (net=0x7f335ad84dc0,
-dev=0x7f335c6f5f90) at /data/wyj/git/qemu/hw/net/vhost_net.c:289
-#4 0x00007f33579e385b in vhost_net_stop (address@hidden, ncs=<optimized out>,
-address@hidden) at /data/wyj/git/qemu/hw/net/vhost_net.c:367
-#5 0x00007f33579e15de in virtio_net_vhost_status (status=<optimized out>,
-n=0x7f335c6f5f90) at /data/wyj/git/qemu/hw/net/virtio-net.c:176
-#6 virtio_net_set_status (vdev=0x7f335c6f5f90, status=0 '\000') at
-/data/wyj/git/qemu/hw/net/virtio-net.c:250
-#7 0x00007f33579f8dc6 in virtio_set_status (address@hidden, address@hidden
-'\000') at /data/wyj/git/qemu/hw/virtio/virtio.c:1146
-#8 0x00007f3357bdd3cc in virtio_ioport_write (val=0, addr=18,
-opaque=0x7f335c6eda80) at hw/virtio/virtio-pci.c:387
-#9 virtio_pci_config_write (opaque=0x7f335c6eda80, addr=18, val=0,
-size=<optimized out>) at hw/virtio/virtio-pci.c:511
-#10 0x00007f33579b2155 in memory_region_write_accessor (mr=0x7f335c6ee470,
-addr=18, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized
-out>, attrs=...)
at /data/wyj/git/qemu/memory.c:526
-#11 0x00007f33579af2e9 in access_with_adjusted_size (address@hidden,
-address@hidden, address@hidden, access_size_min=<optimized out>,
-access_size_max=<optimized out>, address@hidden
- 0x7f33579b20f0 <memory_region_write_accessor>, address@hidden,
-address@hidden) at /data/wyj/git/qemu/memory.c:592
-#12 0x00007f33579b2e15 in memory_region_dispatch_write (address@hidden,
-address@hidden, data=0, address@hidden, address@hidden) at
-/data/wyj/git/qemu/memory.c:1319
-#13 0x00007f335796cd93 in address_space_write_continue (mr=0x7f335c6ee470, l=1,
-addr1=18, len=1, buf=0x7f335773d000 "", attrs=..., addr=49170,
-as=0x7f3358317060 <address_space_io>) at /data/wyj/git/qemu/exec.c:2834
-#14 address_space_write (as=<optimized out>, addr=<optimized out>, attrs=...,
-buf=<optimized out>, len=<optimized out>) at /data/wyj/git/qemu/exec.c:2879
-#15 0x00007f335796d3ad in address_space_rw (as=<optimized out>, address@hidden,
-attrs=..., address@hidden, buf=<optimized out>, address@hidden, address@hidden)
-at /data/wyj/git/qemu/exec.c:2981
-#16 0x00007f33579ae226 in kvm_handle_io (count=1, size=1, direction=<optimized
-out>, data=<optimized out>, attrs=..., port=49170) at
-/data/wyj/git/qemu/kvm-all.c:1803
-#17 kvm_cpu_exec (address@hidden) at /data/wyj/git/qemu/kvm-all.c:2032
-#18 0x00007f335799b632 in qemu_kvm_cpu_thread_fn (arg=0x7f335ae82070) at
-/data/wyj/git/qemu/cpus.c:1118
-#19 0x00007f3352983dc5 in start_thread () from /usr/lib64/libpthread.so.0
-#20 0x00007f335113571d in clone () from /usr/lib64/libc.so.6
-
-It calls qemu_bh_schedule(q->tx_bh) at the bottom of virtio_net_handle_tx_bh(),
-but I don't know why virtio_net_tx_bh() isn't invoked, so that
-q->tx_waiting is not zero.
-[ps: we added logs in virtio_net_tx_bh() to verify that]
-
-Some other information:
-
-It won't crash if we don't use vhost-net.
-
-
-Thanks,
--Gonglei
-
-> -- Can you post a formal patch for this?
-> -> -Thanks -> -> -> From: wangyunjian -> -> Sent: Monday, April 24, 2017 6:10 PM -> -> To: address@hidden; Michael S. Tsirkin <address@hidden>; 'Jason -> -Wang' <address@hidden> -> -> Cc: wangyunjian <address@hidden>; caihe <address@hidden> -> -> Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd -> -> -> -> Qemu crashes, with pre-condition: -> -> vm xml config with multiqueue, and the vm's driver virtio-net support -> -multi-queue -> -> -> -> reproduce steps: -> -> i. start dpdk testpmd in VM with the virtio nic -> -> ii. stop testpmd -> -> iii. reboot the VM -> -> -> -> This commit "f9d6dbf0 remove virtio queues if the guest doesn't support -> -multiqueue" is introduced. -> -> -> -> Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a) -> -> VM DPDK version: DPDK-1.6.1 -> -> -> -> Call Trace: -> -> #0 0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6 -> -> #1 0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6 -> -> #2 0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6 -> -> #3 0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6 -> -> #4 0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0 -> -> #5 0x00007f6088fea32c in iter_remove_or_steal () from -> -/usr/lib64/libglib-2.0.so.0 -> -> #6 0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) -> -at qom/object.c:410 -> -> #7 object_finalize (data=0x7f6091e74800) at qom/object.c:467 -> -> #8 object_unref (address@hidden) at qom/object.c:903 -> -> #9 0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at -> -git/qemu/exec.c:1154 -> -> #10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163 -> -> #11 address_space_dispatch_free (d=0x7f6090b72b90) at -> -git/qemu/exec.c:2514 -> -> #12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at -> -util/rcu.c:272 -> -> #13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -> -> #14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6 
-> -> -> -> Call Trace: -> -> #0 0x00007fdccaeb9790 in ?? () -> -> #1 0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at -> -qom/object.c:405 -> -> #2 object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467 -> -> #3 object_unref (address@hidden) at qom/object.c:903 -> -> #4 0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at -> -git/qemu/exec.c:1154 -> -> #5 phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163 -> -> #6 address_space_dispatch_free (d=0x7fdcdc86a9e0) at -> -git/qemu/exec.c:2514 -> -> #7 0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at -> -util/rcu.c:272 -> -> #8 0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0 -> -> #9 0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6 -> -> -> -> - -On 25/04/2017 14:02, Jason Wang wrote: -> -> -Thanks a lot for the fix. -> -> -Two questions: -> -> -- If virtio_net_set_status() is the only function that may access tx_bh, -> -it looks like setting tx_waiting to zero is sufficient? -I think clearing tx_bh is better anyway, as leaving a dangling pointer -is not very hygienic. - -Paolo - -> -- Can you post a formal patch for this? - diff --git a/results/classifier/02/other/42613410 b/results/classifier/02/other/42613410 deleted file mode 100644 index 992fb4b42..000000000 --- a/results/classifier/02/other/42613410 +++ /dev/null @@ -1,150 +0,0 @@ -other: 0.332 -semantic: 0.327 -mistranslation: 0.314 -instruction: 0.307 -boot: 0.187 - -[Qemu-devel] [PATCH, Bug 1612908] scripts: Add TCP endpoints for qom-* scripts - -From: Carl Allendorph <address@hidden> - -I've created a patch for bug #1612908. The current docs for the scripts -in the "scripts/qmp/" directory suggest that both unix sockets and -tcp endpoints can be used. The TCP endpoints don't work for most of the -scripts, with notable exception of 'qmp-shell'. 
This patch attempts to -refactor the process of distinguishing between unix path endpoints and -tcp endpoints to work for all of these scripts. - -Carl Allendorph (1): - scripts: Add ability for qom-* python scripts to target tcp endpoints - - scripts/qmp/qmp-shell | 22 ++-------------------- - scripts/qmp/qmp.py | 23 ++++++++++++++++++++--- - 2 files changed, 22 insertions(+), 23 deletions(-) - --- -2.7.4 - -From: Carl Allendorph <address@hidden> - -The current code for QEMUMonitorProtocol accepts both a unix socket -endpoint as a string and a tcp endpoint as a tuple. Most of the scripts -that use this class don't massage the command line argument to generate -a tuple. This patch refactors qmp-shell slightly to reuse the existing -parsing of the "host:port" string for all the qom-* scripts. - -Signed-off-by: Carl Allendorph <address@hidden> ---- - scripts/qmp/qmp-shell | 22 ++-------------------- - scripts/qmp/qmp.py | 23 ++++++++++++++++++++--- - 2 files changed, 22 insertions(+), 23 deletions(-) - -diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell -index 0373b24..8a2a437 100755 ---- a/scripts/qmp/qmp-shell -+++ b/scripts/qmp/qmp-shell -@@ -83,9 +83,6 @@ class QMPCompleter(list): - class QMPShellError(Exception): - pass - --class QMPShellBadPort(QMPShellError): -- pass -- - class FuzzyJSON(ast.NodeTransformer): - '''This extension of ast.NodeTransformer filters literal "true/false/null" - values in an AST and replaces them by proper "True/False/None" values that -@@ -103,28 +100,13 @@ class FuzzyJSON(ast.NodeTransformer): - # _execute_cmd()). Let's design a better one. 
- class QMPShell(qmp.QEMUMonitorProtocol): - def __init__(self, address, pretty=False): -- qmp.QEMUMonitorProtocol.__init__(self, self.__get_address(address)) -+ qmp.QEMUMonitorProtocol.__init__(self, address) - self._greeting = None - self._completer = None - self._pretty = pretty - self._transmode = False - self._actions = list() - -- def __get_address(self, arg): -- """ -- Figure out if the argument is in the port:host form, if it's not it's -- probably a file path. -- """ -- addr = arg.split(':') -- if len(addr) == 2: -- try: -- port = int(addr[1]) -- except ValueError: -- raise QMPShellBadPort -- return ( addr[0], port ) -- # socket path -- return arg -- - def _fill_completion(self): - for cmd in self.cmd('query-commands')['return']: - self._completer.append(cmd['name']) -@@ -400,7 +382,7 @@ def main(): - - if qemu is None: - fail_cmdline() -- except QMPShellBadPort: -+ except qmp.QMPShellBadPort: - die('bad port number in command-line') - - try: -diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py -index 62d3651..261ece8 100644 ---- a/scripts/qmp/qmp.py -+++ b/scripts/qmp/qmp.py -@@ -25,21 +25,23 @@ class QMPCapabilitiesError(QMPError): - class QMPTimeoutError(QMPError): - pass - -+class QMPShellBadPort(QMPError): -+ pass -+ - class QEMUMonitorProtocol: - def __init__(self, address, server=False, debug=False): - """ - Create a QEMUMonitorProtocol class. 
- - @param address: QEMU address, can be either a unix socket path (string) -- or a tuple in the form ( address, port ) for a TCP -- connection -+ or a TCP endpoint (string in the format "host:port") - @param server: server mode listens on the socket (bool) - @raise socket.error on socket connection errors - @note No connection is established, this is done by the connect() or - accept() methods - """ - self.__events = [] -- self.__address = address -+ self.__address = self.__get_address(address) - self._debug = debug - self.__sock = self.__get_sock() - if server: -@@ -47,6 +49,21 @@ class QEMUMonitorProtocol: - self.__sock.bind(self.__address) - self.__sock.listen(1) - -+ def __get_address(self, arg): -+ """ -+ Figure out if the argument is in the port:host form, if it's not it's -+ probably a file path. -+ """ -+ addr = arg.split(':') -+ if len(addr) == 2: -+ try: -+ port = int(addr[1]) -+ except ValueError: -+ raise QMPShellBadPort -+ return ( addr[0], port ) -+ # socket path -+ return arg -+ - def __get_sock(self): - if isinstance(self.__address, tuple): - family = socket.AF_INET --- -2.7.4 - diff --git a/results/classifier/02/other/42974450 b/results/classifier/02/other/42974450 deleted file mode 100644 index a3d5b7c21..000000000 --- a/results/classifier/02/other/42974450 +++ /dev/null @@ -1,430 +0,0 @@ -other: 0.940 -instruction: 0.920 -semantic: 0.917 -boot: 0.909 -mistranslation: 0.882 - -[Bug Report] Possible Missing Endianness Conversion - -The virtio packed virtqueue support patch[1] suggests converting -endianness by lines: - -virtio_tswap16s(vdev, &e->off_wrap); -virtio_tswap16s(vdev, &e->flags); - -Though both of these conversion statements aren't present in the -latest qemu code here[2] - -Is this intentional? - -[1]: -https://mail.gnu.org/archive/html/qemu-block/2019-10/msg01492.html -[2]: -https://elixir.bootlin.com/qemu/latest/source/hw/virtio/virtio.c#L314 - -CCing Jason. 
- -On Mon, Jun 24, 2024 at 4:30â¯PM Xoykie <xoykie@gmail.com> wrote: -> -> -The virtio packed virtqueue support patch[1] suggests converting -> -endianness by lines: -> -> -virtio_tswap16s(vdev, &e->off_wrap); -> -virtio_tswap16s(vdev, &e->flags); -> -> -Though both of these conversion statements aren't present in the -> -latest qemu code here[2] -> -> -Is this intentional? -Good catch! - -It looks like it was removed (maybe by mistake) by commit -d152cdd6f6 ("virtio: use virtio accessor to access packed event") - -Jason can you confirm that? - -Thanks, -Stefano - -> -> -[1]: -https://mail.gnu.org/archive/html/qemu-block/2019-10/msg01492.html -> -[2]: -https://elixir.bootlin.com/qemu/latest/source/hw/virtio/virtio.c#L314 -> - -On Mon, 24 Jun 2024 at 16:11, Stefano Garzarella <sgarzare@redhat.com> wrote: -> -> -CCing Jason. -> -> -On Mon, Jun 24, 2024 at 4:30â¯PM Xoykie <xoykie@gmail.com> wrote: -> -> -> -> The virtio packed virtqueue support patch[1] suggests converting -> -> endianness by lines: -> -> -> -> virtio_tswap16s(vdev, &e->off_wrap); -> -> virtio_tswap16s(vdev, &e->flags); -> -> -> -> Though both of these conversion statements aren't present in the -> -> latest qemu code here[2] -> -> -> -> Is this intentional? -> -> -Good catch! -> -> -It looks like it was removed (maybe by mistake) by commit -> -d152cdd6f6 ("virtio: use virtio accessor to access packed event") -That commit changes from: - -- address_space_read_cached(cache, off_off, &e->off_wrap, -- sizeof(e->off_wrap)); -- virtio_tswap16s(vdev, &e->off_wrap); - -which does a byte read of 2 bytes and then swaps the bytes -depending on the host endianness and the value of -virtio_access_is_big_endian() - -to this: - -+ e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); - -virtio_lduw_phys_cached() is a small function which calls -either lduw_be_phys_cached() or lduw_le_phys_cached() -depending on the value of virtio_access_is_big_endian(). 
-(And lduw_be_phys_cached() and lduw_le_phys_cached() do -the right thing for the host-endianness to do a "load -a specifically big or little endian 16-bit value".) - -Which is to say that because we use a load/store function that's -explicit about the size of the data type it is accessing, the -function itself can handle doing the load as big or little -endian, rather than the calling code having to do a manual swap after -it has done a load-as-bag-of-bytes. This is generally preferable -as it's less error-prone. - -(Explicit swap-after-loading still has a place where the -code is doing a load of a whole structure out of the -guest and then swapping each struct field after the fact, -because it means we can do a single load-from-guest-memory -rather than a whole sequence of calls all the way down -through the memory subsystem.) - -thanks --- PMM - -On Mon, Jun 24, 2024 at 04:19:52PM GMT, Peter Maydell wrote: -On Mon, 24 Jun 2024 at 16:11, Stefano Garzarella <sgarzare@redhat.com> wrote: -CCing Jason. - -On Mon, Jun 24, 2024 at 4:30â¯PM Xoykie <xoykie@gmail.com> wrote: -> -> The virtio packed virtqueue support patch[1] suggests converting -> endianness by lines: -> -> virtio_tswap16s(vdev, &e->off_wrap); -> virtio_tswap16s(vdev, &e->flags); -> -> Though both of these conversion statements aren't present in the -> latest qemu code here[2] -> -> Is this intentional? - -Good catch! 
- -It looks like it was removed (maybe by mistake) by commit -d152cdd6f6 ("virtio: use virtio accessor to access packed event") -That commit changes from: - -- address_space_read_cached(cache, off_off, &e->off_wrap, -- sizeof(e->off_wrap)); -- virtio_tswap16s(vdev, &e->off_wrap); - -which does a byte read of 2 bytes and then swaps the bytes -depending on the host endianness and the value of -virtio_access_is_big_endian() - -to this: - -+ e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); - -virtio_lduw_phys_cached() is a small function which calls -either lduw_be_phys_cached() or lduw_le_phys_cached() -depending on the value of virtio_access_is_big_endian(). -(And lduw_be_phys_cached() and lduw_le_phys_cached() do -the right thing for the host-endianness to do a "load -a specifically big or little endian 16-bit value".) - -Which is to say that because we use a load/store function that's -explicit about the size of the data type it is accessing, the -function itself can handle doing the load as big or little -endian, rather than the calling code having to do a manual swap after -it has done a load-as-bag-of-bytes. This is generally preferable -as it's less error-prone. -Thanks for the details! - -So, should we also remove `virtio_tswap16s(vdev, &e->flags);` ? 
- -I mean: -diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c -index 893a072c9d..2e5e67bdb9 100644 ---- a/hw/virtio/virtio.c -+++ b/hw/virtio/virtio.c -@@ -323,7 +323,6 @@ static void vring_packed_event_read(VirtIODevice *vdev, - /* Make sure flags is seen before off_wrap */ - smp_rmb(); - e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); -- virtio_tswap16s(vdev, &e->flags); - } - - static void vring_packed_off_wrap_write(VirtIODevice *vdev, - -Thanks, -Stefano -(Explicit swap-after-loading still has a place where the -code is doing a load of a whole structure out of the -guest and then swapping each struct field after the fact, -because it means we can do a single load-from-guest-memory -rather than a whole sequence of calls all the way down -through the memory subsystem.) - -thanks --- PMM - -On Tue, 25 Jun 2024 at 08:18, Stefano Garzarella <sgarzare@redhat.com> wrote: -> -> -On Mon, Jun 24, 2024 at 04:19:52PM GMT, Peter Maydell wrote: -> ->On Mon, 24 Jun 2024 at 16:11, Stefano Garzarella <sgarzare@redhat.com> wrote: -> ->> -> ->> CCing Jason. -> ->> -> ->> On Mon, Jun 24, 2024 at 4:30â¯PM Xoykie <xoykie@gmail.com> wrote: -> ->> > -> ->> > The virtio packed virtqueue support patch[1] suggests converting -> ->> > endianness by lines: -> ->> > -> ->> > virtio_tswap16s(vdev, &e->off_wrap); -> ->> > virtio_tswap16s(vdev, &e->flags); -> ->> > -> ->> > Though both of these conversion statements aren't present in the -> ->> > latest qemu code here[2] -> ->> > -> ->> > Is this intentional? -> ->> -> ->> Good catch! 
-> ->> -> ->> It looks like it was removed (maybe by mistake) by commit -> ->> d152cdd6f6 ("virtio: use virtio accessor to access packed event") -> -> -> ->That commit changes from: -> -> -> ->- address_space_read_cached(cache, off_off, &e->off_wrap, -> ->- sizeof(e->off_wrap)); -> ->- virtio_tswap16s(vdev, &e->off_wrap); -> -> -> ->which does a byte read of 2 bytes and then swaps the bytes -> ->depending on the host endianness and the value of -> ->virtio_access_is_big_endian() -> -> -> ->to this: -> -> -> ->+ e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); -> -> -> ->virtio_lduw_phys_cached() is a small function which calls -> ->either lduw_be_phys_cached() or lduw_le_phys_cached() -> ->depending on the value of virtio_access_is_big_endian(). -> ->(And lduw_be_phys_cached() and lduw_le_phys_cached() do -> ->the right thing for the host-endianness to do a "load -> ->a specifically big or little endian 16-bit value".) -> -> -> ->Which is to say that because we use a load/store function that's -> ->explicit about the size of the data type it is accessing, the -> ->function itself can handle doing the load as big or little -> ->endian, rather than the calling code having to do a manual swap after -> ->it has done a load-as-bag-of-bytes. This is generally preferable -> ->as it's less error-prone. -> -> -Thanks for the details! -> -> -So, should we also remove `virtio_tswap16s(vdev, &e->flags);` ? -> -> -I mean: -> -diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c -> -index 893a072c9d..2e5e67bdb9 100644 -> ---- a/hw/virtio/virtio.c -> -+++ b/hw/virtio/virtio.c -> -@@ -323,7 +323,6 @@ static void vring_packed_event_read(VirtIODevice *vdev, -> -/* Make sure flags is seen before off_wrap */ -> -smp_rmb(); -> -e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); -> -- virtio_tswap16s(vdev, &e->flags); -> -} -That definitely looks like it's probably not correct... 
- --- PMM - -On Fri, Jun 28, 2024 at 03:53:09PM GMT, Peter Maydell wrote: -On Tue, 25 Jun 2024 at 08:18, Stefano Garzarella <sgarzare@redhat.com> wrote: -On Mon, Jun 24, 2024 at 04:19:52PM GMT, Peter Maydell wrote: ->On Mon, 24 Jun 2024 at 16:11, Stefano Garzarella <sgarzare@redhat.com> wrote: ->> ->> CCing Jason. ->> ->> On Mon, Jun 24, 2024 at 4:30â¯PM Xoykie <xoykie@gmail.com> wrote: ->> > ->> > The virtio packed virtqueue support patch[1] suggests converting ->> > endianness by lines: ->> > ->> > virtio_tswap16s(vdev, &e->off_wrap); ->> > virtio_tswap16s(vdev, &e->flags); ->> > ->> > Though both of these conversion statements aren't present in the ->> > latest qemu code here[2] ->> > ->> > Is this intentional? ->> ->> Good catch! ->> ->> It looks like it was removed (maybe by mistake) by commit ->> d152cdd6f6 ("virtio: use virtio accessor to access packed event") -> ->That commit changes from: -> ->- address_space_read_cached(cache, off_off, &e->off_wrap, ->- sizeof(e->off_wrap)); ->- virtio_tswap16s(vdev, &e->off_wrap); -> ->which does a byte read of 2 bytes and then swaps the bytes ->depending on the host endianness and the value of ->virtio_access_is_big_endian() -> ->to this: -> ->+ e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); -> ->virtio_lduw_phys_cached() is a small function which calls ->either lduw_be_phys_cached() or lduw_le_phys_cached() ->depending on the value of virtio_access_is_big_endian(). ->(And lduw_be_phys_cached() and lduw_le_phys_cached() do ->the right thing for the host-endianness to do a "load ->a specifically big or little endian 16-bit value".) -> ->Which is to say that because we use a load/store function that's ->explicit about the size of the data type it is accessing, the ->function itself can handle doing the load as big or little ->endian, rather than the calling code having to do a manual swap after ->it has done a load-as-bag-of-bytes. This is generally preferable ->as it's less error-prone. 
- -Thanks for the details! - -So, should we also remove `virtio_tswap16s(vdev, &e->flags);` ? - -I mean: -diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c -index 893a072c9d..2e5e67bdb9 100644 ---- a/hw/virtio/virtio.c -+++ b/hw/virtio/virtio.c -@@ -323,7 +323,6 @@ static void vring_packed_event_read(VirtIODevice *vdev, - /* Make sure flags is seen before off_wrap */ - smp_rmb(); - e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off); -- virtio_tswap16s(vdev, &e->flags); - } -That definitely looks like it's probably not correct... -Yeah, I just sent that patch: -20240701075208.19634-1-sgarzare@redhat.com -">https://lore.kernel.org/qemu-devel/ -20240701075208.19634-1-sgarzare@redhat.com -We can continue the discussion there. - -Thanks, -Stefano - diff --git a/results/classifier/02/other/43643137 b/results/classifier/02/other/43643137 deleted file mode 100644 index e00f1f6b1..000000000 --- a/results/classifier/02/other/43643137 +++ /dev/null @@ -1,539 +0,0 @@ -other: 0.781 -semantic: 0.764 -instruction: 0.754 -mistranslation: 0.665 -boot: 0.652 - -[Qemu-devel] [BUG/RFC] INIT IPI lost when VM starts - -Hi, -We encountered a problem that when a domain starts, seabios failed to online a -vCPU. - -After investigation, we found that the reason is in kvm-kmod, KVM_APIC_INIT bit -in -vcpu->arch.apic->pending_events was overwritten by qemu, and thus an INIT IPI -sent -to AP was lost. Qemu does this since libvirtd sends a âquery-cpusâ qmp command -to qemu -on VM start. - -In qemu, qmp_query_cpus-> cpu_synchronize_state-> kvm_cpu_synchronize_state-> -do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from kvm-kmod and -sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call -kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus pending_events is -overwritten by qemu. - -I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true after -âquery-cpusâ, -and kvm-kmod should not clear KVM_APIC_INIT unconditionally. 
And I am not sure -whether -it is OK for qemu to set cpu->kvm_vcpu_dirty in do_kvm_cpu_synchronize_state in -each caller. - -Whatâs your opinion? - -Let me clarify it more clearly. Time sequence is that qemu handles âquery-cpusâ qmp -command, vcpu 1 (and vcpu 0) got registers from kvm-kmod (qmp_query_cpus-> -cpu_synchronize_state-> kvm_cpu_synchronize_state-> -> do_kvm_cpu_synchronize_state-> kvm_arch_get_registers), then vcpu 0 (BSP) -sends INIT-SIPI to vcpu 1(AP). In kvm-kmod, vcpu 1âs pending_eventsâs KVM_APIC_INIT -bit set. -Then vcpu 1 continue running, vcpu1 thread in qemu calls -kvm_arch_put_registers-> kvm_put_vcpu_events, so KVM_APIC_INIT bit in vcpu 1âs -pending_events got cleared, i.e., lost. - -In kvm-kmod, except for pending_events, sipi_vector may also be overwritten., -so I am not sure if there are other fields/registers in danger, i.e., those may -be modified asynchronously with vcpu thread itself. - -BTW, using a sleep like following can reliably reproduce this problem, if VM -equipped with more than 2 vcpus and starting VM using libvirtd. - -diff --git a/target/i386/kvm.c b/target/i386/kvm.c -index 55865db..5099290 100644 ---- a/target/i386/kvm.c -+++ b/target/i386/kvm.c -@@ -2534,6 +2534,11 @@ static int kvm_put_vcpu_events(X86CPU *cpu, int level) - KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR; - } - -+ if (CPU(cpu)->cpu_index == 1) { -+ fprintf(stderr, "vcpu 1 sleep!!!!\n"); -+ sleep(10); -+ } -+ - return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events); - } - - -On 2017/3/20 22:21, Herongguang (Stephen) wrote: -Hi, -We encountered a problem that when a domain starts, seabios failed to online a -vCPU. - -After investigation, we found that the reason is in kvm-kmod, KVM_APIC_INIT bit -in -vcpu->arch.apic->pending_events was overwritten by qemu, and thus an INIT IPI -sent -to AP was lost. Qemu does this since libvirtd sends a âquery-cpusâ qmp command -to qemu -on VM start. 
- -In qemu, qmp_query_cpus-> cpu_synchronize_state-> kvm_cpu_synchronize_state-> -do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from kvm-kmod and -sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call -kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus pending_events is -overwritten by qemu. - -I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true after -âquery-cpusâ, -and kvm-kmod should not clear KVM_APIC_INIT unconditionally. And I am not sure -whether -it is OK for qemu to set cpu->kvm_vcpu_dirty in do_kvm_cpu_synchronize_state in -each caller. - -Whatâs your opinion? - -On 20/03/2017 15:21, Herongguang (Stephen) wrote: -> -> -We encountered a problem that when a domain starts, seabios failed to -> -online a vCPU. -> -> -After investigation, we found that the reason is in kvm-kmod, -> -KVM_APIC_INIT bit in -> -vcpu->arch.apic->pending_events was overwritten by qemu, and thus an -> -INIT IPI sent -> -to AP was lost. Qemu does this since libvirtd sends a âquery-cpusâ qmp -> -command to qemu -> -on VM start. -> -> -In qemu, qmp_query_cpus-> cpu_synchronize_state-> -> -kvm_cpu_synchronize_state-> -> -do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from -> -kvm-kmod and -> -sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call -> -kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus -> -pending_events is -> -overwritten by qemu. -> -> -I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true -> -after âquery-cpusâ, -> -and kvm-kmod should not clear KVM_APIC_INIT unconditionally. And I am -> -not sure whether -> -it is OK for qemu to set cpu->kvm_vcpu_dirty in -> -do_kvm_cpu_synchronize_state in each caller. -> -> -Whatâs your opinion? -Hi Rongguang, - -sorry for the late response. - -Where exactly is KVM_APIC_INIT dropped? kvm_get_mp_state does clear the -bit, but the result of the INIT is stored in mp_state. 
- -kvm_get_vcpu_events is called after kvm_get_mp_state; it retrieves -KVM_APIC_INIT in events.smi.latched_init and kvm_set_vcpu_events passes -it back. Maybe it should ignore events.smi.latched_init if not in SMM, -but I would like to understand the exact sequence of events. - -Thanks, - -paolo - -On 2017/4/6 0:16, Paolo Bonzini wrote: -On 20/03/2017 15:21, Herongguang (Stephen) wrote: -We encountered a problem that when a domain starts, seabios failed to -online a vCPU. - -After investigation, we found that the reason is in kvm-kmod, -KVM_APIC_INIT bit in -vcpu->arch.apic->pending_events was overwritten by qemu, and thus an -INIT IPI sent -to AP was lost. Qemu does this since libvirtd sends a âquery-cpusâ qmp -command to qemu -on VM start. - -In qemu, qmp_query_cpus-> cpu_synchronize_state-> -kvm_cpu_synchronize_state-> -do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from -kvm-kmod and -sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call -kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus -pending_events is -overwritten by qemu. - -I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true -after âquery-cpusâ, -and kvm-kmod should not clear KVM_APIC_INIT unconditionally. And I am -not sure whether -it is OK for qemu to set cpu->kvm_vcpu_dirty in -do_kvm_cpu_synchronize_state in each caller. - -Whatâs your opinion? -Hi Rongguang, - -sorry for the late response. - -Where exactly is KVM_APIC_INIT dropped? kvm_get_mp_state does clear the -bit, but the result of the INIT is stored in mp_state. -It's dropped in KVM_SET_VCPU_EVENTS, see below. -kvm_get_vcpu_events is called after kvm_get_mp_state; it retrieves -KVM_APIC_INIT in events.smi.latched_init and kvm_set_vcpu_events passes -it back. Maybe it should ignore events.smi.latched_init if not in SMM, -but I would like to understand the exact sequence of events. 
time0:
vcpu1:
qmp_query_cpus -> cpu_synchronize_state -> kvm_cpu_synchronize_state ->
do_kvm_cpu_synchronize_state (and set vcpu1's cpu->kvm_vcpu_dirty to true) ->
kvm_arch_get_registers (KVM_APIC_INIT bit in vcpu->arch.apic->pending_events
was not set)

time1:
vcpu0:
send INIT-SIPI to all APs -> (in vcpu0's context) __apic_accept_irq
(KVM_APIC_INIT bit in vcpu1's arch.apic->pending_events is set)

time2:
vcpu1:
kvm_cpu_exec -> (if cpu->kvm_vcpu_dirty is true) kvm_arch_put_registers ->
kvm_put_vcpu_events (overwrites the KVM_APIC_INIT bit in
vcpu->arch.apic->pending_events!)

So it's a race between vcpu1 get/put registers and kvm/other vcpus changing
vcpu1's status/structure fields in the meantime. I worry that there are
other fields that may be overwritten; sipi_vector is one.

also see:
https://www.mail-archive.com/address@hidden/msg438675.html

> Thanks,
>
> paolo
>
> .

Hi Paolo,

What's your opinion about this patch? We found it just before finishing
patches for the past two days.

Thanks,
-Gonglei

> -----Original Message-----
> From: address@hidden [mailto:address@hidden] On
> Behalf Of Herongguang (Stephen)
> Sent: Thursday, April 06, 2017 9:47 AM
> To: Paolo Bonzini; address@hidden; address@hidden;
> address@hidden; address@hidden; address@hidden;
> wangxin (U); Huangweidong (C)
> Subject: Re: [BUG/RFC] INIT IPI lost when VM starts
>
> On 2017/4/6 0:16, Paolo Bonzini wrote:
>> On 20/03/2017 15:21, Herongguang (Stephen) wrote:
>>> We encountered a problem that when a domain starts, seabios failed to
>>> online a vCPU.
>>>
>>> After investigation, we found that the reason is in kvm-kmod: the
>>> KVM_APIC_INIT bit in vcpu->arch.apic->pending_events was overwritten
>>> by qemu, and thus an INIT IPI sent to an AP was lost. Qemu does this
>>> since libvirtd sends a "query-cpus" qmp command to qemu on VM start.
>>> In qemu, qmp_query_cpus -> cpu_synchronize_state ->
>>> kvm_cpu_synchronize_state -> do_kvm_cpu_synchronize_state, qemu gets
>>> registers/vcpu_events from kvm-kmod and sets cpu->kvm_vcpu_dirty to
>>> true, and the vcpu thread in qemu will call kvm_arch_put_registers if
>>> cpu->kvm_vcpu_dirty is true; thus pending_events is overwritten by
>>> qemu.
>>>
>>> I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true
>>> after "query-cpus", and kvm-kmod should not clear KVM_APIC_INIT
>>> unconditionally. And I am not sure whether it is OK for qemu to set
>>> cpu->kvm_vcpu_dirty in do_kvm_cpu_synchronize_state in each caller.
>>>
>>> What's your opinion?
>> Hi Rongguang,
>>
>> sorry for the late response.
>>
>> Where exactly is KVM_APIC_INIT dropped? kvm_get_mp_state does clear
>> the bit, but the result of the INIT is stored in mp_state.
>
> It's dropped in KVM_SET_VCPU_EVENTS, see below.
>
>> kvm_get_vcpu_events is called after kvm_get_mp_state; it retrieves
>> KVM_APIC_INIT in events.smi.latched_init and kvm_set_vcpu_events
>> passes it back. Maybe it should ignore events.smi.latched_init if not
>> in SMM, but I would like to understand the exact sequence of events.
>
> time0:
> vcpu1:
> qmp_query_cpus -> cpu_synchronize_state -> kvm_cpu_synchronize_state ->
> do_kvm_cpu_synchronize_state (and set vcpu1's cpu->kvm_vcpu_dirty to
> true) -> kvm_arch_get_registers (KVM_APIC_INIT bit in
> vcpu->arch.apic->pending_events was not set)
>
> time1:
> vcpu0:
> send INIT-SIPI to all APs -> (in vcpu0's context) __apic_accept_irq
> (KVM_APIC_INIT bit in vcpu1's arch.apic->pending_events is set)
>
> time2:
> vcpu1:
> kvm_cpu_exec -> (if cpu->kvm_vcpu_dirty is true)
> kvm_arch_put_registers -> kvm_put_vcpu_events (overwrites the
> KVM_APIC_INIT bit in vcpu->arch.apic->pending_events!)
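The three timed steps above can be condensed into a small, purely illustrative Python model of the overwrite. The names and the "preserving" variant below are assumptions made for the sketch, not the actual kvm-kmod/QEMU code paths:

```python
KVM_APIC_INIT = 1 << 0  # illustrative bit value, not the kernel's real layout

class VCpu:
    def __init__(self):
        # stands in for vcpu->arch.apic->pending_events on the kernel side
        self.pending_events = 0

def get_registers(vcpu):
    # time0: QEMU snapshots kernel-side state (kvm_arch_get_registers)
    return {"pending_events": vcpu.pending_events}

def accept_init_ipi(vcpu):
    # time1: another vCPU's __apic_accept_irq sets the bit asynchronously
    vcpu.pending_events |= KVM_APIC_INIT

def put_registers_overwriting(vcpu, snap):
    # time2: the stale snapshot is written back, clobbering the new bit
    vcpu.pending_events = snap["pending_events"]

def put_registers_preserving(vcpu, snap):
    # one hypothetical fix direction: never clear bits the snapshot predates
    vcpu.pending_events |= snap["pending_events"]

vcpu1 = VCpu()
snap = get_registers(vcpu1)             # time0
accept_init_ipi(vcpu1)                  # time1
put_registers_overwriting(vcpu1, snap)  # time2
assert vcpu1.pending_events & KVM_APIC_INIT == 0  # the INIT IPI is lost

vcpu1 = VCpu()
snap = get_registers(vcpu1)
accept_init_ipi(vcpu1)
put_registers_preserving(vcpu1, snap)
assert vcpu1.pending_events & KVM_APIC_INIT       # the INIT survives
```

The point of the sketch is only that blindly writing back a stale snapshot races with asynchronous senders; any real fix has to distinguish state QEMU actually changed from state the kernel changed underneath it.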
> So it's a race between vcpu1 get/put registers and kvm/other vcpus
> changing vcpu1's status/structure fields in the meantime. I worry that
> there are other fields that may be overwritten; sipi_vector is one.
>
> also see:
> https://www.mail-archive.com/address@hidden/msg438675.html
>
>> Thanks,
>>
>> paolo
>>
>> .

2017-11-20 06:57+0000, Gonglei (Arei):
> Hi Paolo,
>
> What's your opinion about this patch? We found it just before finishing
> patches for the past two days.

I think your case was fixed by f4ef19108608 ("KVM: X86: Fix loss of
pending INIT due to race"), but that patch didn't fix it perfectly, so
maybe you're hitting a similar case that happens in SMM ...

>> -----Original Message-----
>> From: address@hidden [mailto:address@hidden] On
>> Behalf Of Herongguang (Stephen)
>> On 2017/4/6 0:16, Paolo Bonzini wrote:
>>> Hi Rongguang,
>>>
>>> sorry for the late response.
>>>
>>> Where exactly is KVM_APIC_INIT dropped? kvm_get_mp_state does clear
>>> the bit, but the result of the INIT is stored in mp_state.
>>
>> It's dropped in KVM_SET_VCPU_EVENTS, see below.
>>
>>> kvm_get_vcpu_events is called after kvm_get_mp_state; it retrieves
>>> KVM_APIC_INIT in events.smi.latched_init and kvm_set_vcpu_events
>>> passes it back. Maybe it should ignore events.smi.latched_init if
>>> not in SMM, but I would like to understand the exact sequence of
>>> events.
>> time0:
>> vcpu1:
>> qmp_query_cpus -> cpu_synchronize_state -> kvm_cpu_synchronize_state ->
>> do_kvm_cpu_synchronize_state (and set vcpu1's cpu->kvm_vcpu_dirty to
>> true) -> kvm_arch_get_registers (KVM_APIC_INIT bit in
>> vcpu->arch.apic->pending_events was not set)
>>
>> time1:
>> vcpu0:
>> send INIT-SIPI to all APs -> (in vcpu0's context) __apic_accept_irq
>> (KVM_APIC_INIT bit in vcpu1's arch.apic->pending_events is set)
>>
>> time2:
>> vcpu1:
>> kvm_cpu_exec -> (if cpu->kvm_vcpu_dirty is true)
>> kvm_arch_put_registers -> kvm_put_vcpu_events (overwrites the
>> KVM_APIC_INIT bit in vcpu->arch.apic->pending_events!)
>>
>> So it's a race between vcpu1 get/put registers and kvm/other vcpus
>> changing vcpu1's status/structure fields in the meantime. I worry that
>> there are other fields that may be overwritten; sipi_vector is one.

Fields that can be asynchronously written by other VCPUs (like SIPI,
NMI) must not be SET if other VCPUs were not paused since the last GET.
(Looking at the interface, we can currently lose a pending SMI.)

INIT is one of the restricted fields, but the API unconditionally
couples SMM with latched INIT, which means that we can lose an INIT if
the VCPU is in SMM mode -- do you see SMM in kvm_vcpu_events?

Thanks.

diff --git a/results/classifier/02/other/48245039 b/results/classifier/02/other/48245039
deleted file mode 100644
index 1dc80e09e..000000000
--- a/results/classifier/02/other/48245039
+++ /dev/null
@@ -1,531 +0,0 @@
-other: 0.953
-instruction: 0.951
-semantic: 0.939
-boot: 0.932
-mistranslation: 0.888

[Qemu-devel] [BUG] gcov support appears to be broken

Hello, according to our docs, here is the procedure that should produce
a coverage report for execution of the complete "make check":

#./configure --enable-gcov
#make
#make check
#make coverage-report

It seems that the first three commands execute as expected.
(For example, there are plenty of files generated by "make check" that
would not have been generated if "enable-gcov" hadn't been chosen.)
However, the last command complains about some missing files related to
FP support. If those files are added (for example, artificially, using
"touch <missing-file>"), then it starts complaining about missing some
decodetree-generated files. Other kinds of files are involved too.

It would be nice to have coverage support working. Please could somebody
take a look, or explain if I made a mistake or misunderstood our gcov
support.

Yours,
Aleksandar

On Mon, 5 Aug 2019 at 11:39, Aleksandar Markovic <address@hidden> wrote:
> Hello, according to our docs, here is the procedure that should produce
> a coverage report for execution of the complete "make check":
>
> #./configure --enable-gcov
> #make
> #make check
> #make coverage-report
>
> It seems that the first three commands execute as expected. (For
> example, there are plenty of files generated by "make check" that
> would not have been generated if "enable-gcov" hadn't been chosen.)
> However, the last command complains about some missing files related
> to FP support. If those files are added (for example, artificially,
> using "touch <missing-file>"), then it starts complaining about
> missing some decodetree-generated files. Other kinds of files are
> involved too.
>
> It would be nice to have coverage support working. Please could
> somebody take a look, or explain if I made a mistake or misunderstood
> our gcov support.

Cc'ing Alex who's probably the closest we have to a gcov expert.

(make/make check of a --enable-gcov build is in the set of things our
Travis CI setup runs, so we do defend that part against regressions.)
thanks
-- PMM

Peter Maydell <address@hidden> writes:

> On Mon, 5 Aug 2019 at 11:39, Aleksandar Markovic <address@hidden> wrote:
>> Hello, according to our docs, here is the procedure that should
>> produce a coverage report for execution of the complete "make check":
>>
>> #./configure --enable-gcov
>> #make
>> #make check
>> #make coverage-report
>>
>> It seems that the first three commands execute as expected. (For
>> example, there are plenty of files generated by "make check" that
>> would not have been generated if "enable-gcov" hadn't been chosen.)
>> However, the last command complains about some missing files related
>> to FP support.

The gcov tool is fairly noisy about missing files but that just
indicates the tests haven't exercised those code paths. "make check"
especially doesn't touch much of the TCG code and a chunk of floating
point.

>> It would be nice to have coverage support working. Please could
>> somebody take a look, or explain if I made a mistake or misunderstood
>> our gcov support.

So your failure mode is that no report is generated at all? It's
working for me here.

> Cc'ing Alex who's probably the closest we have to a gcov expert.
>
> (make/make check of a --enable-gcov build is in the set of things our
> Travis CI setup runs, so we do defend that part against regressions.)

We defend the build but I have just checked and it seems our
check_coverage script is currently failing:
https://travis-ci.org/stsquad/qemu/jobs/567809808#L10328
But as it's an after_success script it doesn't fail the build.
- -> -> -thanks -> --- PMM --- -Alex Bennée - -> -> #./configure --enable-gcov -> -> #make -> -> #make check -> -> #make coverage-report -> -> -> -> It seems that first three commands execute as expected. (For example, -> -> there are plenty of files generated by "make check" that would've not -> -> been generated if "enable-gcov" hadn't been chosen.) However, the -> -> last command complains about some missing files related to FP -> -So your failure mode is no report is generated at all? It's working for -> -me here. -Alex, no report is generated for my test setups - in fact, "make -coverage-report" even says that it explicitly deletes what appears to be the -main coverage report html file). - -This is the terminal output of an unsuccessful executions of "make -coverage-report" for recent ToT: - -~/Build/qemu-TOT-TEST$ make coverage-report -make[1]: Entering directory '/home/user/Build/qemu-TOT-TEST/slirp' -make[1]: Nothing to be done for 'all'. -make[1]: Leaving directory '/home/user/Build/qemu-TOT-TEST/slirp' - CHK version_gen.h - GEN coverage-report.html -Traceback (most recent call last): - File "/usr/bin/gcovr", line 1970, in <module> - print_html_report(covdata, options.html_details) - File "/usr/bin/gcovr", line 1473, in print_html_report - INPUT = open(data['FILENAME'], 'r') -IOError: [Errno 2] No such file or directory: 'wrap.inc.c' -Makefile:1048: recipe for target -'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html' failed -make: *** -[/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html] Error 1 -make: *** Deleting file -'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html' - -This instance is executed in QEMU 3.0 source tree: (so, it looks the problem -existed for quite some time) - -~/Build/qemu-3.0$ make coverage-report - CHK version_gen.h - GEN coverage-report.html -Traceback (most recent call last): - File "/usr/bin/gcovr", line 1970, in <module> - print_html_report(covdata, options.html_details) 
- File "/usr/bin/gcovr", line 1473, in print_html_report - INPUT = open(data['FILENAME'], 'r') -IOError: [Errno 2] No such file or directory: -'/home/user/Build/qemu-3.0/target/openrisc/decode.inc.c' -Makefile:992: recipe for target -'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html' failed -make: *** [/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html] -Error 1 -make: *** Deleting file -'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html' - -Fond regards, -Aleksandar - - -> -Alex Bennée - -> -> #./configure --enable-gcov -> -> #make -> -> #make check -> -> #make coverage-report -> -> -> -> It seems that first three commands execute as expected. (For example, -> -> there are plenty of files generated by "make check" that would've not -> -> been generated if "enable-gcov" hadn't been chosen.) However, the -> -> last command complains about some missing files related to FP -> -So your failure mode is no report is generated at all? It's working for -> -me here. -Another piece of info: - -~/Build/qemu-TOT-TEST$ gcov --version -gcov (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010 -Copyright (C) 2015 Free Software Foundation, Inc. -This is free software; see the source for copying conditions. -There is NO warranty; not even for MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. - -:~/Build/qemu-TOT-TEST$ gcc --version -gcc (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0 -Copyright (C) 2017 Free Software Foundation, Inc. -This is free software; see the source for copying conditions. There is NO -warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. - - - - -Alex, no report is generated for my test setups - in fact, "make -coverage-report" even says that it explicitly deletes what appears to be the -main coverage report html file). 
- -This is the terminal output of an unsuccessful executions of "make -coverage-report" for recent ToT: - -~/Build/qemu-TOT-TEST$ make coverage-report -make[1]: Entering directory '/home/user/Build/qemu-TOT-TEST/slirp' -make[1]: Nothing to be done for 'all'. -make[1]: Leaving directory '/home/user/Build/qemu-TOT-TEST/slirp' - CHK version_gen.h - GEN coverage-report.html -Traceback (most recent call last): - File "/usr/bin/gcovr", line 1970, in <module> - print_html_report(covdata, options.html_details) - File "/usr/bin/gcovr", line 1473, in print_html_report - INPUT = open(data['FILENAME'], 'r') -IOError: [Errno 2] No such file or directory: 'wrap.inc.c' -Makefile:1048: recipe for target -'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html' failed -make: *** -[/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html] Error 1 -make: *** Deleting file -'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html' - -This instance is executed in QEMU 3.0 source tree: (so, it looks the problem -existed for quite some time) - -~/Build/qemu-3.0$ make coverage-report - CHK version_gen.h - GEN coverage-report.html -Traceback (most recent call last): - File "/usr/bin/gcovr", line 1970, in <module> - print_html_report(covdata, options.html_details) - File "/usr/bin/gcovr", line 1473, in print_html_report - INPUT = open(data['FILENAME'], 'r') -IOError: [Errno 2] No such file or directory: -'/home/user/Build/qemu-3.0/target/openrisc/decode.inc.c' -Makefile:992: recipe for target -'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html' failed -make: *** [/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html] -Error 1 -make: *** Deleting file -'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html' - -Fond regards, -Aleksandar - - -> -Alex Bennée - -> -> #./configure --enable-gcov -> -> #make -> -> #make check -> -> #make coverage-report -> -> -> -> It seems that first three commands execute as expected. 
(For example, -> -> there are plenty of files generated by "make check" that would've not -> -> been generated if "enable-gcov" hadn't been chosen.) However, the -> -> last command complains about some missing files related to FP -> -So your failure mode is no report is generated at all? It's working for -> -me here. -Alex, here is the thing: - -Seeing that my gcovr is relatively old (2014) 3.2 version, I upgraded it from -git repo to the most recent 4.1 (actually, to a dev version, from the very tip -of the tree), and "make coverage-report" started generating coverage reports. -It did emit some error messages (totally different than previous), but still it -did not stop like it used to do with gcovr 3.2. - -Perhaps you would want to add some gcov/gcovr minimal version info in our docs. -(or at least a statement "this was tested with such and such gcc, gcov and -gcovr", etc.?) - -Coverage report looked fine at first glance, but it a kind of disappointed me -when I digged deeper into its content - for example, it shows very low coverage -for our FP code (softfloat), while, in fact, we know that "make check" contains -detailed tests on FP functionalities. But this is most likely a separate -problem of a very different nature, perhaps the issue of separate git repo for -FP tests (testfloat) that our FP tests use as a mid-layer. - -I'll try how everything works with my test examples, and will let you know. - -Your help is greatly appreciated, -Aleksandar - -Fond regards, -Aleksandar - - -> -Alex Bennée - -Aleksandar Markovic <address@hidden> writes: - -> ->> #./configure --enable-gcov -> ->> #make -> ->> #make check -> ->> #make coverage-report -> ->> -> ->> It seems that first three commands execute as expected. (For example, -> ->> there are plenty of files generated by "make check" that would've not -> ->> been generated if "enable-gcov" hadn't been chosen.) 
However, the -> ->> last command complains about some missing files related to FP -> -> -> So your failure mode is no report is generated at all? It's working for -> -> me here. -> -> -Alex, here is the thing: -> -> -Seeing that my gcovr is relatively old (2014) 3.2 version, I upgraded it from -> -git repo to the most recent 4.1 (actually, to a dev version, from the very -> -tip of the tree), and "make coverage-report" started generating coverage -> -reports. It did emit some error messages (totally different than previous), -> -but still it did not stop like it used to do with gcovr 3.2. -> -> -Perhaps you would want to add some gcov/gcovr minimal version info in our -> -docs. (or at least a statement "this was tested with such and such gcc, gcov -> -and gcovr", etc.?) -> -> -Coverage report looked fine at first glance, but it a kind of -> -disappointed me when I digged deeper into its content - for example, -> -it shows very low coverage for our FP code (softfloat), while, in -> -fact, we know that "make check" contains detailed tests on FP -> -functionalities. But this is most likely a separate problem of a very -> -different nature, perhaps the issue of separate git repo for FP tests -> -(testfloat) that our FP tests use as a mid-layer. -I get: - -68.6 % 2593 / 3782 62.2 % 1690 / 2718 - -Which is not bad considering we don't exercise the 80 and 128 bit -softfloat code at all (which is not shared by the re-factored 16/32/64 -bit code). - -> -> -I'll try how everything works with my test examples, and will let you know. -> -> -Your help is greatly appreciated, -> -Aleksandar -> -> -Fond regards, -> -Aleksandar -> -> -> -> Alex Bennée --- -Alex Bennée - -> -> it shows very low coverage for our FP code (softfloat), while, in -> -> fact, we know that "make check" contains detailed tests on FP -> -> functionalities. 
But this is most likely a separate problem of a very -> -> different nature, perhaps the issue of separate git repo for FP tests -> -> (testfloat) that our FP tests use as a mid-layer. -> -> -I get: -> -> -68.6 % 2593 / 3782 62.2 % 1690 / 2718 -> -I would expect that kind of result too. - -However, I get: - -File: fpu/softfloat.c Lines: 8 3334 0.2 % -Date: 2019-08-05 19:56:58 Branches: 3 2376 0.1 % - -:( - -OK, I'll try to figure that out, and most likely I could live with it if it is -an isolated problem. - -Thank you for your assistance in this matter, -Aleksandar - -> -Which is not bad considering we don't exercise the 80 and 128 bit -> -softfloat code at all (which is not shared by the re-factored 16/32/64 -> -bit code). -> -> -Alex Bennée - -> -> it shows very low coverage for our FP code (softfloat), while, in -> -> fact, we know that "make check" contains detailed tests on FP -> -> functionalities. But this is most likely a separate problem of a very -> -> different nature, perhaps the issue of separate git repo for FP tests -> -> (testfloat) that our FP tests use as a mid-layer. -> -> -I get: -> -> -68.6 % 2593 / 3782 62.2 % 1690 / 2718 -> -This problem is solved too. (and it is my fault) - -I worked with multiple versions of QEMU, and my previous low-coverage results -were for QEMU 3.0, and for that version the directory tests/fp did not even -exist. :D (<blush>) - -For QEMU ToT, I get now: - -fpu/softfloat.c - 68.8 % 2592 / 3770 62.3 % 1693 / 2718 - -which is identical for all intents and purposes to your result. - -Yours cordially, -Aleksandar - diff --git a/results/classifier/02/other/55247116 b/results/classifier/02/other/55247116 deleted file mode 100644 index 84c8c4205..000000000 --- a/results/classifier/02/other/55247116 +++ /dev/null @@ -1,1311 +0,0 @@ -other: 0.945 -semantic: 0.928 -instruction: 0.928 -boot: 0.918 -mistranslation: 0.841 - -[Qemu-devel] [RFC/BUG] xen-mapcache: buggy invalidate map cache? 
Hi,

In xen_map_cache_unlocked(), a mapping to guest memory may be in
entry->next instead of the first-level entry (if a mapping to a rom,
rather than guest memory, comes first), while in
xen_invalidate_map_cache(), when the VM ballooned out memory, qemu did
not invalidate cache entries in the linked list (entry->next). So when
the VM balloons the memory back in, gfns are probably mapped to
different mfns; thus if the guest asks a device to DMA to these GPAs,
qemu may DMA to stale MFNs.

So I think in xen_invalidate_map_cache() the linked lists should also be
checked and invalidated.

What's your opinion? Is this a bug? Is my analysis correct?

On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
> Hi,
>
> In xen_map_cache_unlocked(), a mapping to guest memory may be in
> entry->next instead of the first-level entry (if a mapping to a rom,
> rather than guest memory, comes first), while in
> xen_invalidate_map_cache(), when the VM ballooned out memory, qemu did
> not invalidate cache entries in the linked list (entry->next). So when
> the VM balloons the memory back in, gfns are probably mapped to
> different mfns; thus if the guest asks a device to DMA to these GPAs,
> qemu may DMA to stale MFNs.
>
> So I think in xen_invalidate_map_cache() the linked lists should also
> be checked and invalidated.
>
> What's your opinion? Is this a bug? Is my analysis correct?

Added Jun Nakajima and Alexander Graf

On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
>> Hi,
>>
>> In xen_map_cache_unlocked(), a mapping to guest memory may be in
>> entry->next instead of the first-level entry (if a mapping to a rom,
>> rather than guest memory, comes first), while in
>> xen_invalidate_map_cache(), when the VM ballooned out memory, qemu
>> did not invalidate cache entries in the linked list (entry->next).
>> So when the VM balloons the memory back in, gfns are probably mapped
>> to different mfns; thus if the guest asks a device to DMA to these
>> GPAs, qemu may DMA to stale MFNs.
>> So I think in xen_invalidate_map_cache() the linked lists should also
>> be checked and invalidated.
>>
>> What's your opinion? Is this a bug? Is my analysis correct?
>
> Added Jun Nakajima and Alexander Graf

And correct Stefano Stabellini's email address.

On Mon, 10 Apr 2017 00:36:02 +0800
hrg <address@hidden> wrote:

Hi,

> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
>> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
>>> Hi,
>>>
>>> In xen_map_cache_unlocked(), a mapping to guest memory may be in
>>> entry->next instead of the first-level entry (if a mapping to a rom,
>>> rather than guest memory, comes first), while in
>>> xen_invalidate_map_cache(), when the VM ballooned out memory, qemu
>>> did not invalidate cache entries in the linked list (entry->next).
>>> So when the VM balloons the memory back in, gfns are probably mapped
>>> to different mfns; thus if the guest asks a device to DMA to these
>>> GPAs, qemu may DMA to stale MFNs.
>>>
>>> So I think in xen_invalidate_map_cache() the linked lists should
>>> also be checked and invalidated.
>>>
>>> What's your opinion? Is this a bug? Is my analysis correct?
>>
>> Added Jun Nakajima and Alexander Graf
>
> And correct Stefano Stabellini's email address.

There is a real issue with the xen-mapcache corruption in fact. I
encountered it a few months ago while experimenting with Q35 support on
Xen. Q35 emulation uses an AHCI controller by default, along with NCQ
mode enabled. The issue can be (somewhat) easily reproduced there,
though using normal i440 emulation might possibly allow reproducing the
issue as well, using dedicated test code from the guest side. In case
of Q35+NCQ the issue can be reproduced "as is".

The issue occurs when a guest domain performs intensive disk I/O, e.g.
while the guest OS is booting. QEMU crashes with a "Bad ram offset
980aa000" message logged, where the address is different each time.
The hard thing with this issue is that it has a very low
reproducibility rate.

The corruption happens when there are multiple I/O commands in the NCQ
queue. So there are overlapping emulated DMA operations in flight, and
QEMU uses a sequence of mapcache actions which can be executed in the
"wrong" order, thus leading to an inconsistent xen-mapcache - so a bad
address from the wrong entry is returned.

The bad thing with this issue is that a QEMU crash due to the "Bad ram
offset" appearance is a relatively good situation in the sense that this
is a caught error. But there might be a much worse (artificial)
situation where the returned address looks valid but points to different
mapped memory.

The fix itself is not hard (e.g. an additional checked field in
MapCacheEntry), but there is a need for some reliable way to test it
considering the low reproducibility rate.

Regards,
Alex

On Mon, 10 Apr 2017, hrg wrote:
> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
>> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
>>> Hi,
>>>
>>> In xen_map_cache_unlocked(), a mapping to guest memory may be in
>>> entry->next instead of the first-level entry (if a mapping to a rom,
>>> rather than guest memory, comes first), while in
>>> xen_invalidate_map_cache(), when the VM ballooned out memory, qemu
>>> did not invalidate cache entries in the linked list (entry->next).
>>> So when the VM balloons the memory back in, gfns are probably mapped
>>> to different mfns; thus if the guest asks a device to DMA to these
>>> GPAs, qemu may DMA to stale MFNs.
>>>
>>> So I think in xen_invalidate_map_cache() the linked lists should
>>> also be checked and invalidated.
>>>
>>> What's your opinion? Is this a bug? Is my analysis correct?

Yes, you are right. We need to go through the list for each element of
the array in xen_invalidate_map_cache. Can you come up with a patch?
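As an aside for readers, the invalidation gap hrg describes can be modeled with a toy bucketed cache whose buckets are linked lists. This is illustrative Python only; the names and structure are simplified assumptions, not QEMU's actual MapCacheEntry code:

```python
class Entry:
    def __init__(self, gfn, mfn):
        self.gfn, self.mfn, self.next = gfn, mfn, None

class MapCache:
    BUCKETS = 4  # stand-in for the mapcache's hash array

    def __init__(self):
        self.first = [None] * self.BUCKETS

    def map(self, gfn, mfn):
        i = gfn % self.BUCKETS
        e = self.first[i]
        if e is None:
            self.first[i] = Entry(gfn, mfn)
            return
        while e.next:            # later mappings land on entry->next
            e = e.next
        e.next = Entry(gfn, mfn)

    def lookup(self, gfn):
        e = self.first[gfn % self.BUCKETS]
        while e:
            if e.gfn == gfn and e.mfn is not None:
                return e.mfn
            e = e.next
        return None

    def invalidate_first_level_only(self):
        # the behaviour hrg reports: only array entries are torn down,
        # so chained entries keep their (now stale) translations
        for e in self.first:
            if e:
                e.mfn = None

    def invalidate_all(self):
        # the proposed fix: walk every entry->next chain as well
        for e in self.first:
            while e:
                e.mfn = None
                e = e.next

cache = MapCache()
cache.map(8, mfn=0xAAAA)   # e.g. a rom mapping grabs the first-level entry
cache.map(4, mfn=0xBBBB)   # guest RAM in the same bucket lands on ->next

cache.invalidate_first_level_only()
assert cache.lookup(4) == 0xBBBB   # stale entry survives ballooning out/in

cache.invalidate_all()
assert cache.lookup(4) is None     # forces a fresh gfn->mfn mapping
```

In the toy model, as in the report, the entry that only exists on the chain keeps answering lookups with its old translation until the invalidation walks entry->next as well.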
On Mon, 10 Apr 2017, Stefano Stabellini wrote:
> On Mon, 10 Apr 2017, hrg wrote:
>> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
>>> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
>>>> Hi,
>>>>
>>>> In xen_map_cache_unlocked(), a mapping to guest memory may be in
>>>> entry->next instead of the first-level entry (if a mapping to a
>>>> rom, rather than guest memory, comes first), while in
>>>> xen_invalidate_map_cache(), when the VM ballooned out memory, qemu
>>>> did not invalidate cache entries in the linked list (entry->next).
>>>> So when the VM balloons the memory back in, gfns are probably
>>>> mapped to different mfns; thus if the guest asks a device to DMA
>>>> to these GPAs, qemu may DMA to stale MFNs.
>>>>
>>>> So I think in xen_invalidate_map_cache() the linked lists should
>>>> also be checked and invalidated.
>>>>
>>>> What's your opinion? Is this a bug? Is my analysis correct?
>
> Yes, you are right. We need to go through the list for each element of
> the array in xen_invalidate_map_cache. Can you come up with a patch?

I spoke too soon. In the regular case there should be no locked mappings
when xen_invalidate_map_cache is called (see the DPRINTF warning at the
beginning of the function). Without locked mappings, there should never
be more than one element in each list (see xen_map_cache_unlocked:
entry->lock == true is a necessary condition to append a new entry to
the list, otherwise it is just remapped).

Can you confirm that what you are seeing are locked mappings
when xen_invalidate_map_cache is called? To find out, enable the DPRINTF
by turning it into a printf or by defining MAPCACHE_DEBUG.
On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
<address@hidden> wrote:
> On Mon, 10 Apr 2017, Stefano Stabellini wrote:
>> On Mon, 10 Apr 2017, hrg wrote:
>>> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
>>>> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
>>>>> Hi,
>>>>>
>>>>> In xen_map_cache_unlocked(), a mapping to guest memory may be in
>>>>> entry->next instead of the first-level entry (if a mapping to a
>>>>> rom, rather than guest memory, comes first), while in
>>>>> xen_invalidate_map_cache(), when the VM ballooned out memory,
>>>>> qemu did not invalidate cache entries in the linked list
>>>>> (entry->next). So when the VM balloons the memory back in, gfns
>>>>> are probably mapped to different mfns; thus if the guest asks a
>>>>> device to DMA to these GPAs, qemu may DMA to stale MFNs.
>>>>>
>>>>> So I think in xen_invalidate_map_cache() the linked lists should
>>>>> also be checked and invalidated.
>>>>>
>>>>> What's your opinion? Is this a bug? Is my analysis correct?
>>
>> Yes, you are right. We need to go through the list for each element
>> of the array in xen_invalidate_map_cache. Can you come up with a
>> patch?
>
> I spoke too soon. In the regular case there should be no locked
> mappings when xen_invalidate_map_cache is called (see the DPRINTF
> warning at the beginning of the function). Without locked mappings,
> there should never be more than one element in each list (see
> xen_map_cache_unlocked: entry->lock == true is a necessary condition
> to append a new entry to the list, otherwise it is just remapped).
In -pci_add_option_rom(), rtl8139 rom is locked mapped in -pci_add_option_rom->memory_region_get_ram_ptr (after -memory_region_init_ram). So actually I think we should remove the -DPRINTF warning as it is normal. - -On Tue, 11 Apr 2017, hrg wrote: -> -On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -> -<address@hidden> wrote: -> -> On Mon, 10 Apr 2017, Stefano Stabellini wrote: -> ->> On Mon, 10 Apr 2017, hrg wrote: -> ->> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -> ->> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -> ->> > >> Hi, -> ->> > >> -> ->> > >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next -> ->> > >> instead of first level entry (if map to rom other than guest memory -> ->> > >> comes first), while in xen_invalidate_map_cache(), when VM ballooned -> ->> > >> out memory, qemu did not invalidate cache entries in linked -> ->> > >> list(entry->next), so when VM balloon back in memory, gfns probably -> ->> > >> mapped to different mfns, thus if guest asks device to DMA to these -> ->> > >> GPA, qemu may DMA to stale MFNs. -> ->> > >> -> ->> > >> So I think in xen_invalidate_map_cache() linked lists should also be -> ->> > >> checked and invalidated. -> ->> > >> -> ->> > >> Whatâs your opinion? Is this a bug? Is my analyze correct? -> ->> -> ->> Yes, you are right. We need to go through the list for each element of -> ->> the array in xen_invalidate_map_cache. Can you come up with a patch? -> -> -> -> I spoke too soon. In the regular case there should be no locked mappings -> -> when xen_invalidate_map_cache is called (see the DPRINTF warning at the -> -> beginning of the functions). Without locked mappings, there should never -> -> be more than one element in each list (see xen_map_cache_unlocked: -> -> entry->lock == true is a necessary condition to append a new entry to -> -> the list, otherwise it is just remapped). 
-> -> -> -> Can you confirm that what you are seeing are locked mappings -> -> when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -> -> by turning it into a printf or by defininig MAPCACHE_DEBUG. -> -> -In fact, I think the DPRINTF above is incorrect too. In -> -pci_add_option_rom(), rtl8139 rom is locked mapped in -> -pci_add_option_rom->memory_region_get_ram_ptr (after -> -memory_region_init_ram). So actually I think we should remove the -> -DPRINTF warning as it is normal. -Let me explain why the DPRINTF warning is there: emulated dma operations -can involve locked mappings. Once a dma operation completes, the related -mapping is unlocked and can be safely destroyed. But if we destroy a -locked mapping in xen_invalidate_map_cache, while a dma is still -ongoing, QEMU will crash. We cannot handle that case. - -However, the scenario you described is different. It has nothing to do -with DMA. It looks like pci_add_option_rom calls -memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -locked mapping and it is never unlocked or destroyed. - -It looks like "ptr" is not used after pci_add_option_rom returns. Does -the append patch fix the problem you are seeing? For the proper fix, I -think we probably need some sort of memory_region_unmap wrapper or maybe -a call to address_space_unmap. 
- - -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -index e6b08e1..04f98b7 100644 ---- a/hw/pci/pci.c -+++ b/hw/pci/pci.c -@@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool -is_default_rom, - } - - pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); -+ xen_invalidate_map_cache_entry(ptr); - } - - static void pci_del_option_rom(PCIDevice *pdev) - -On Tue, 11 Apr 2017 15:32:09 -0700 (PDT) -Stefano Stabellini <address@hidden> wrote: - -> -On Tue, 11 Apr 2017, hrg wrote: -> -> On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -> -> <address@hidden> wrote: -> -> > On Mon, 10 Apr 2017, Stefano Stabellini wrote: -> -> >> On Mon, 10 Apr 2017, hrg wrote: -> -> >> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -> -> >> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -> -> >> > >> Hi, -> -> >> > >> -> -> >> > >> In xen_map_cache_unlocked(), map to guest memory maybe in -> -> >> > >> entry->next instead of first level entry (if map to rom other than -> -> >> > >> guest memory comes first), while in xen_invalidate_map_cache(), -> -> >> > >> when VM ballooned out memory, qemu did not invalidate cache entries -> -> >> > >> in linked list(entry->next), so when VM balloon back in memory, -> -> >> > >> gfns probably mapped to different mfns, thus if guest asks device -> -> >> > >> to DMA to these GPA, qemu may DMA to stale MFNs. -> -> >> > >> -> -> >> > >> So I think in xen_invalidate_map_cache() linked lists should also be -> -> >> > >> checked and invalidated. -> -> >> > >> -> -> >> > >> Whatâs your opinion? Is this a bug? Is my analyze correct? -> -> >> -> -> >> Yes, you are right. We need to go through the list for each element of -> -> >> the array in xen_invalidate_map_cache. Can you come up with a patch? -> -> > -> -> > I spoke too soon. In the regular case there should be no locked mappings -> -> > when xen_invalidate_map_cache is called (see the DPRINTF warning at the -> -> > beginning of the functions). 
Without locked mappings, there should never -> -> > be more than one element in each list (see xen_map_cache_unlocked: -> -> > entry->lock == true is a necessary condition to append a new entry to -> -> > the list, otherwise it is just remapped). -> -> > -> -> > Can you confirm that what you are seeing are locked mappings -> -> > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -> -> > by turning it into a printf or by defininig MAPCACHE_DEBUG. -> -> -> -> In fact, I think the DPRINTF above is incorrect too. In -> -> pci_add_option_rom(), rtl8139 rom is locked mapped in -> -> pci_add_option_rom->memory_region_get_ram_ptr (after -> -> memory_region_init_ram). So actually I think we should remove the -> -> DPRINTF warning as it is normal. -> -> -Let me explain why the DPRINTF warning is there: emulated dma operations -> -can involve locked mappings. Once a dma operation completes, the related -> -mapping is unlocked and can be safely destroyed. But if we destroy a -> -locked mapping in xen_invalidate_map_cache, while a dma is still -> -ongoing, QEMU will crash. We cannot handle that case. -> -> -However, the scenario you described is different. It has nothing to do -> -with DMA. It looks like pci_add_option_rom calls -> -memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -> -locked mapping and it is never unlocked or destroyed. -> -> -It looks like "ptr" is not used after pci_add_option_rom returns. Does -> -the append patch fix the problem you are seeing? For the proper fix, I -> -think we probably need some sort of memory_region_unmap wrapper or maybe -> -a call to address_space_unmap. -Hmm, for some reason my message to the Xen-devel list got rejected but was sent -to Qemu-devel instead, without any notice. Sorry if I'm missing something -obvious as a list newbie. 
- -Stefano, hrg, - -There is an issue with inconsistency between the list of normal MapCacheEntry's -and their 'reverse' counterparts - MapCacheRev's in locked_entries. -When the bad situation happens, there are multiple (locked) MapCacheEntry -entries in the bucket's linked list along with a number of MapCacheRev's. And -when it comes to a reverse lookup, xen-mapcache picks the wrong entry from the -first list and calculates a wrong pointer from it which may then be caught with -the "Bad RAM offset" check (or not). Mapcache invalidation might be related to -this issue as well I think. - -I'll try to provide a test code which can reproduce the issue from the -guest side using an emulated IDE controller, though it's much simpler to achieve -this result with an AHCI controller using multiple NCQ I/O commands. So far I've -seen this issue only with Windows 7 (and above) guest on AHCI, but any block I/O -DMA should be enough I think. - -On 2017/4/12 14:17, Alexey G wrote: -On Tue, 11 Apr 2017 15:32:09 -0700 (PDT) -Stefano Stabellini <address@hidden> wrote: -On Tue, 11 Apr 2017, hrg wrote: -On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -<address@hidden> wrote: -On Mon, 10 Apr 2017, Stefano Stabellini wrote: -On Mon, 10 Apr 2017, hrg wrote: -On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -Hi, - -In xen_map_cache_unlocked(), map to guest memory may be in -entry->next instead of first level entry (if map to rom other than -guest memory comes first), while in xen_invalidate_map_cache(), -when VM ballooned out memory, qemu did not invalidate cache entries -in linked list(entry->next), so when VM balloon back in memory, -gfns probably mapped to different mfns, thus if guest asks device -to DMA to these GPA, qemu may DMA to stale MFNs. - -So I think in xen_invalidate_map_cache() linked lists should also be -checked and invalidated. - -What's your opinion? Is this a bug? Is my analysis correct?
-Yes, you are right. We need to go through the list for each element of -the array in xen_invalidate_map_cache. Can you come up with a patch? -I spoke too soon. In the regular case there should be no locked mappings -when xen_invalidate_map_cache is called (see the DPRINTF warning at the -beginning of the functions). Without locked mappings, there should never -be more than one element in each list (see xen_map_cache_unlocked: -entry->lock == true is a necessary condition to append a new entry to -the list, otherwise it is just remapped). - -Can you confirm that what you are seeing are locked mappings -when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -by turning it into a printf or by defininig MAPCACHE_DEBUG. -In fact, I think the DPRINTF above is incorrect too. In -pci_add_option_rom(), rtl8139 rom is locked mapped in -pci_add_option_rom->memory_region_get_ram_ptr (after -memory_region_init_ram). So actually I think we should remove the -DPRINTF warning as it is normal. -Let me explain why the DPRINTF warning is there: emulated dma operations -can involve locked mappings. Once a dma operation completes, the related -mapping is unlocked and can be safely destroyed. But if we destroy a -locked mapping in xen_invalidate_map_cache, while a dma is still -ongoing, QEMU will crash. We cannot handle that case. - -However, the scenario you described is different. It has nothing to do -with DMA. It looks like pci_add_option_rom calls -memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -locked mapping and it is never unlocked or destroyed. - -It looks like "ptr" is not used after pci_add_option_rom returns. Does -the append patch fix the problem you are seeing? For the proper fix, I -think we probably need some sort of memory_region_unmap wrapper or maybe -a call to address_space_unmap. -Hmm, for some reason my message to the Xen-devel list got rejected but was sent -to Qemu-devel instead, without any notice. 
Sorry if I'm missing something -obvious as a list newbie. - -Stefano, hrg, - -There is an issue with inconsistency between the list of normal MapCacheEntry's -and their 'reverse' counterparts - MapCacheRev's in locked_entries. -When bad situation happens, there are multiple (locked) MapCacheEntry -entries in the bucket's linked list along with a number of MapCacheRev's. And -when it comes to a reverse lookup, xen-mapcache picks the wrong entry from the -first list and calculates a wrong pointer from it which may then be caught with -the "Bad RAM offset" check (or not). Mapcache invalidation might be related to -this issue as well I think. - -I'll try to provide a test code which can reproduce the issue from the -guest side using an emulated IDE controller, though it's much simpler to achieve -this result with an AHCI controller using multiple NCQ I/O commands. So far I've -seen this issue only with Windows 7 (and above) guest on AHCI, but any block I/O -DMA should be enough I think. -Yes, I think there may be other bugs lurking, considering the complexity, -though we need to reproduce it if we want to delve into it. 
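
[Editorial sketch] The list-traversal fix hrg proposed above can be illustrated with a toy model. This is plain Python with simplified stand-in structures, not QEMU's real MapCacheEntry or bucket layout: invalidation that only looks at each bucket's first-level entry leaves chained entry->next mappings marked valid, which is exactly how stale MFNs survive a balloon-out/balloon-in cycle.

```python
# Toy model of the invalidation bug discussed above: walking only each
# bucket's first-level entry leaves chained (entry->next) mappings marked
# valid, so they can point at stale MFNs after ballooning. Plain Python
# stand-ins, not QEMU's real MapCacheEntry structures.

class Entry:
    def __init__(self, lock=False):
        self.lock = lock      # locked mappings must survive invalidation
        self.valid = True
        self.next = None      # next entry chained in the same bucket

def invalidate_heads_only(buckets):
    """Buggy behavior: only the first-level entry of each bucket is seen."""
    for head in buckets:
        if head is not None and not head.lock:
            head.valid = False

def invalidate_all(buckets):
    """Proposed behavior: walk the entire entry->next chain per bucket."""
    for head in buckets:
        entry = head
        while entry is not None:
            if not entry.lock:
                entry.valid = False
            entry = entry.next

# A ROM mapping landed in the bucket first, so the guest-RAM mapping was
# appended behind it via entry->next:
head, chained = Entry(), Entry()
head.next = chained
buckets = [head]

invalidate_heads_only(buckets)
print(head.valid, chained.valid)   # False True -- chained entry left stale

head.valid = chained.valid = True
invalidate_all(buckets)
print(head.valid, chained.valid)   # False False -- whole chain invalidated
```

The locked/unlocked distinction is why the thread then turns to the DPRINTF warning: entries with lock set must be skipped here, because a DMA may still be using them.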
- -On Wed, 12 Apr 2017, Alexey G wrote: -> -On Tue, 11 Apr 2017 15:32:09 -0700 (PDT) -> -Stefano Stabellini <address@hidden> wrote: -> -> -> On Tue, 11 Apr 2017, hrg wrote: -> -> > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -> -> > <address@hidden> wrote: -> -> > > On Mon, 10 Apr 2017, Stefano Stabellini wrote: -> -> > >> On Mon, 10 Apr 2017, hrg wrote: -> -> > >> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -> -> > >> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -> -> > >> > >> Hi, -> -> > >> > >> -> -> > >> > >> In xen_map_cache_unlocked(), map to guest memory maybe in -> -> > >> > >> entry->next instead of first level entry (if map to rom other than -> -> > >> > >> guest memory comes first), while in xen_invalidate_map_cache(), -> -> > >> > >> when VM ballooned out memory, qemu did not invalidate cache -> -> > >> > >> entries -> -> > >> > >> in linked list(entry->next), so when VM balloon back in memory, -> -> > >> > >> gfns probably mapped to different mfns, thus if guest asks device -> -> > >> > >> to DMA to these GPA, qemu may DMA to stale MFNs. -> -> > >> > >> -> -> > >> > >> So I think in xen_invalidate_map_cache() linked lists should also -> -> > >> > >> be -> -> > >> > >> checked and invalidated. -> -> > >> > >> -> -> > >> > >> Whatâs your opinion? Is this a bug? Is my analyze correct? -> -> > >> -> -> > >> Yes, you are right. We need to go through the list for each element of -> -> > >> the array in xen_invalidate_map_cache. Can you come up with a patch? -> -> > > -> -> > > I spoke too soon. In the regular case there should be no locked mappings -> -> > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the -> -> > > beginning of the functions). 
Without locked mappings, there should never -> -> > > be more than one element in each list (see xen_map_cache_unlocked: -> -> > > entry->lock == true is a necessary condition to append a new entry to -> -> > > the list, otherwise it is just remapped). -> -> > > -> -> > > Can you confirm that what you are seeing are locked mappings -> -> > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -> -> > > by turning it into a printf or by defininig MAPCACHE_DEBUG. -> -> > -> -> > In fact, I think the DPRINTF above is incorrect too. In -> -> > pci_add_option_rom(), rtl8139 rom is locked mapped in -> -> > pci_add_option_rom->memory_region_get_ram_ptr (after -> -> > memory_region_init_ram). So actually I think we should remove the -> -> > DPRINTF warning as it is normal. -> -> -> -> Let me explain why the DPRINTF warning is there: emulated dma operations -> -> can involve locked mappings. Once a dma operation completes, the related -> -> mapping is unlocked and can be safely destroyed. But if we destroy a -> -> locked mapping in xen_invalidate_map_cache, while a dma is still -> -> ongoing, QEMU will crash. We cannot handle that case. -> -> -> -> However, the scenario you described is different. It has nothing to do -> -> with DMA. It looks like pci_add_option_rom calls -> -> memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -> -> locked mapping and it is never unlocked or destroyed. -> -> -> -> It looks like "ptr" is not used after pci_add_option_rom returns. Does -> -> the append patch fix the problem you are seeing? For the proper fix, I -> -> think we probably need some sort of memory_region_unmap wrapper or maybe -> -> a call to address_space_unmap. -> -> -Hmm, for some reason my message to the Xen-devel list got rejected but was -> -sent -> -to Qemu-devel instead, without any notice. Sorry if I'm missing something -> -obvious as a list newbie. 
-> -> -Stefano, hrg, -> -> -There is an issue with inconsistency between the list of normal -> -MapCacheEntry's -> -and their 'reverse' counterparts - MapCacheRev's in locked_entries. -> -When bad situation happens, there are multiple (locked) MapCacheEntry -> -entries in the bucket's linked list along with a number of MapCacheRev's. And -> -when it comes to a reverse lookup, xen-mapcache picks the wrong entry from the -> -first list and calculates a wrong pointer from it which may then be caught -> -with -> -the "Bad RAM offset" check (or not). Mapcache invalidation might be related to -> -this issue as well I think. -> -> -I'll try to provide a test code which can reproduce the issue from the -> -guest side using an emulated IDE controller, though it's much simpler to -> -achieve -> -this result with an AHCI controller using multiple NCQ I/O commands. So far -> -I've -> -seen this issue only with Windows 7 (and above) guest on AHCI, but any block -> -I/O -> -DMA should be enough I think. -That would be helpful. Please see if you can reproduce it after fixing -the other issue ( -http://marc.info/?l=qemu-devel&m=149195042500707&w=2 -). - -On 2017/4/12 6:32, Stefano Stabellini wrote: -On Tue, 11 Apr 2017, hrg wrote: -On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -<address@hidden> wrote: -On Mon, 10 Apr 2017, Stefano Stabellini wrote: -On Mon, 10 Apr 2017, hrg wrote: -On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -Hi, - -In xen_map_cache_unlocked(), map to guest memory maybe in entry->next -instead of first level entry (if map to rom other than guest memory -comes first), while in xen_invalidate_map_cache(), when VM ballooned -out memory, qemu did not invalidate cache entries in linked -list(entry->next), so when VM balloon back in memory, gfns probably -mapped to different mfns, thus if guest asks device to DMA to these -GPA, qemu may DMA to stale MFNs. 
- -So I think in xen_invalidate_map_cache() linked lists should also be -checked and invalidated. - -What's your opinion? Is this a bug? Is my analysis correct? -Yes, you are right. We need to go through the list for each element of -the array in xen_invalidate_map_cache. Can you come up with a patch? -I spoke too soon. In the regular case there should be no locked mappings -when xen_invalidate_map_cache is called (see the DPRINTF warning at the -beginning of the functions). Without locked mappings, there should never -be more than one element in each list (see xen_map_cache_unlocked: -entry->lock == true is a necessary condition to append a new entry to -the list, otherwise it is just remapped). - -Can you confirm that what you are seeing are locked mappings -when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -by turning it into a printf or by defining MAPCACHE_DEBUG. -In fact, I think the DPRINTF above is incorrect too. In -pci_add_option_rom(), rtl8139 rom is locked mapped in -pci_add_option_rom->memory_region_get_ram_ptr (after -memory_region_init_ram). So actually I think we should remove the -DPRINTF warning as it is normal. -Let me explain why the DPRINTF warning is there: emulated dma operations -can involve locked mappings. Once a dma operation completes, the related -mapping is unlocked and can be safely destroyed. But if we destroy a -locked mapping in xen_invalidate_map_cache, while a dma is still -ongoing, QEMU will crash. We cannot handle that case. - -However, the scenario you described is different. It has nothing to do -with DMA. It looks like pci_add_option_rom calls -memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -locked mapping and it is never unlocked or destroyed. - -It looks like "ptr" is not used after pci_add_option_rom returns. Does -the append patch fix the problem you are seeing?
For the proper fix, I -think we probably need some sort of memory_region_unmap wrapper or maybe -a call to address_space_unmap. -Yes, I think so, maybe this is the proper way to fix this. -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -index e6b08e1..04f98b7 100644 ---- a/hw/pci/pci.c -+++ b/hw/pci/pci.c -@@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool -is_default_rom, - } -pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); -+ xen_invalidate_map_cache_entry(ptr); - } -static void pci_del_option_rom(PCIDevice *pdev) - -On Wed, 12 Apr 2017, Herongguang (Stephen) wrote: -> -On 2017/4/12 6:32, Stefano Stabellini wrote: -> -> On Tue, 11 Apr 2017, hrg wrote: -> -> > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -> -> > <address@hidden> wrote: -> -> > > On Mon, 10 Apr 2017, Stefano Stabellini wrote: -> -> > > > On Mon, 10 Apr 2017, hrg wrote: -> -> > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -> -> > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -> -> > > > > > > Hi, -> -> > > > > > > -> -> > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in -> -> > > > > > > entry->next -> -> > > > > > > instead of first level entry (if map to rom other than guest -> -> > > > > > > memory -> -> > > > > > > comes first), while in xen_invalidate_map_cache(), when VM -> -> > > > > > > ballooned -> -> > > > > > > out memory, qemu did not invalidate cache entries in linked -> -> > > > > > > list(entry->next), so when VM balloon back in memory, gfns -> -> > > > > > > probably -> -> > > > > > > mapped to different mfns, thus if guest asks device to DMA to -> -> > > > > > > these -> -> > > > > > > GPA, qemu may DMA to stale MFNs. -> -> > > > > > > -> -> > > > > > > So I think in xen_invalidate_map_cache() linked lists should -> -> > > > > > > also be -> -> > > > > > > checked and invalidated. -> -> > > > > > > -> -> > > > > > > Whatâs your opinion? Is this a bug? Is my analyze correct? 
-> -> > > > Yes, you are right. We need to go through the list for each element of -> -> > > > the array in xen_invalidate_map_cache. Can you come up with a patch? -> -> > > I spoke too soon. In the regular case there should be no locked mappings -> -> > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the -> -> > > beginning of the functions). Without locked mappings, there should never -> -> > > be more than one element in each list (see xen_map_cache_unlocked: -> -> > > entry->lock == true is a necessary condition to append a new entry to -> -> > > the list, otherwise it is just remapped). -> -> > > -> -> > > Can you confirm that what you are seeing are locked mappings -> -> > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -> -> > > by turning it into a printf or by defininig MAPCACHE_DEBUG. -> -> > In fact, I think the DPRINTF above is incorrect too. In -> -> > pci_add_option_rom(), rtl8139 rom is locked mapped in -> -> > pci_add_option_rom->memory_region_get_ram_ptr (after -> -> > memory_region_init_ram). So actually I think we should remove the -> -> > DPRINTF warning as it is normal. -> -> Let me explain why the DPRINTF warning is there: emulated dma operations -> -> can involve locked mappings. Once a dma operation completes, the related -> -> mapping is unlocked and can be safely destroyed. But if we destroy a -> -> locked mapping in xen_invalidate_map_cache, while a dma is still -> -> ongoing, QEMU will crash. We cannot handle that case. -> -> -> -> However, the scenario you described is different. It has nothing to do -> -> with DMA. It looks like pci_add_option_rom calls -> -> memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -> -> locked mapping and it is never unlocked or destroyed. -> -> -> -> It looks like "ptr" is not used after pci_add_option_rom returns. Does -> -> the append patch fix the problem you are seeing? 
For the proper fix, I -> -> think we probably need some sort of memory_region_unmap wrapper or maybe -> -> a call to address_space_unmap. -> -> -Yes, I think so, maybe this is the proper way to fix this. -Would you be up for sending a proper patch and testing it? We cannot call -xen_invalidate_map_cache_entry directly from pci.c though, it would need -to be one of the other functions like address_space_unmap for example. - - -> -> diff --git a/hw/pci/pci.c b/hw/pci/pci.c -> -> index e6b08e1..04f98b7 100644 -> -> --- a/hw/pci/pci.c -> -> +++ b/hw/pci/pci.c -> -> @@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool -> -> is_default_rom, -> -> } -> -> pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); -> -> + xen_invalidate_map_cache_entry(ptr); -> -> } -> -> static void pci_del_option_rom(PCIDevice *pdev) - -On 2017/4/13 7:51, Stefano Stabellini wrote: -On Wed, 12 Apr 2017, Herongguang (Stephen) wrote: -On 2017/4/12 6:32, Stefano Stabellini wrote: -On Tue, 11 Apr 2017, hrg wrote: -On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -<address@hidden> wrote: -On Mon, 10 Apr 2017, Stefano Stabellini wrote: -On Mon, 10 Apr 2017, hrg wrote: -On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote: -On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote: -Hi, - -In xen_map_cache_unlocked(), map to guest memory maybe in -entry->next -instead of first level entry (if map to rom other than guest -memory -comes first), while in xen_invalidate_map_cache(), when VM -ballooned -out memory, qemu did not invalidate cache entries in linked -list(entry->next), so when VM balloon back in memory, gfns -probably -mapped to different mfns, thus if guest asks device to DMA to -these -GPA, qemu may DMA to stale MFNs. - -So I think in xen_invalidate_map_cache() linked lists should -also be -checked and invalidated. - -Whatâs your opinion? Is this a bug? Is my analyze correct? -Yes, you are right. 
We need to go through the list for each element of -the array in xen_invalidate_map_cache. Can you come up with a patch? -I spoke too soon. In the regular case there should be no locked mappings -when xen_invalidate_map_cache is called (see the DPRINTF warning at the -beginning of the functions). Without locked mappings, there should never -be more than one element in each list (see xen_map_cache_unlocked: -entry->lock == true is a necessary condition to append a new entry to -the list, otherwise it is just remapped). - -Can you confirm that what you are seeing are locked mappings -when xen_invalidate_map_cache is called? To find out, enable the DPRINTK -by turning it into a printf or by defininig MAPCACHE_DEBUG. -In fact, I think the DPRINTF above is incorrect too. In -pci_add_option_rom(), rtl8139 rom is locked mapped in -pci_add_option_rom->memory_region_get_ram_ptr (after -memory_region_init_ram). So actually I think we should remove the -DPRINTF warning as it is normal. -Let me explain why the DPRINTF warning is there: emulated dma operations -can involve locked mappings. Once a dma operation completes, the related -mapping is unlocked and can be safely destroyed. But if we destroy a -locked mapping in xen_invalidate_map_cache, while a dma is still -ongoing, QEMU will crash. We cannot handle that case. - -However, the scenario you described is different. It has nothing to do -with DMA. It looks like pci_add_option_rom calls -memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -locked mapping and it is never unlocked or destroyed. - -It looks like "ptr" is not used after pci_add_option_rom returns. Does -the append patch fix the problem you are seeing? For the proper fix, I -think we probably need some sort of memory_region_unmap wrapper or maybe -a call to address_space_unmap. -Yes, I think so, maybe this is the proper way to fix this. -Would you be up for sending a proper patch and testing it? 
We cannot call -xen_invalidate_map_cache_entry directly from pci.c though, it would need -to be one of the other functions like address_space_unmap for example. -Yes, I will look into this. -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -index e6b08e1..04f98b7 100644 ---- a/hw/pci/pci.c -+++ b/hw/pci/pci.c -@@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool -is_default_rom, - } - pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); -+ xen_invalidate_map_cache_entry(ptr); - } - static void pci_del_option_rom(PCIDevice *pdev) - -On Thu, 13 Apr 2017, Herongguang (Stephen) wrote: -> -On 2017/4/13 7:51, Stefano Stabellini wrote: -> -> On Wed, 12 Apr 2017, Herongguang (Stephen) wrote: -> -> > On 2017/4/12 6:32, Stefano Stabellini wrote: -> -> > > On Tue, 11 Apr 2017, hrg wrote: -> -> > > > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini -> -> > > > <address@hidden> wrote: -> -> > > > > On Mon, 10 Apr 2017, Stefano Stabellini wrote: -> -> > > > > > On Mon, 10 Apr 2017, hrg wrote: -> -> > > > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> -> -> > > > > > > wrote: -> -> > > > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> -> -> > > > > > > > wrote: -> -> > > > > > > > > Hi, -> -> > > > > > > > > -> -> > > > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in -> -> > > > > > > > > entry->next -> -> > > > > > > > > instead of first level entry (if map to rom other than guest -> -> > > > > > > > > memory -> -> > > > > > > > > comes first), while in xen_invalidate_map_cache(), when VM -> -> > > > > > > > > ballooned -> -> > > > > > > > > out memory, qemu did not invalidate cache entries in linked -> -> > > > > > > > > list(entry->next), so when VM balloon back in memory, gfns -> -> > > > > > > > > probably -> -> > > > > > > > > mapped to different mfns, thus if guest asks device to DMA -> -> > > > > > > > > to -> -> > > > > > > > > these -> -> > > > > > > > > GPA, qemu may DMA to stale MFNs. 
-> -> > > > > > > > > -> -> > > > > > > > > So I think in xen_invalidate_map_cache() linked lists should -> -> > > > > > > > > also be -> -> > > > > > > > > checked and invalidated. -> -> > > > > > > > > -> -> > > > > > > > > Whatâs your opinion? Is this a bug? Is my analyze correct? -> -> > > > > > Yes, you are right. We need to go through the list for each -> -> > > > > > element of -> -> > > > > > the array in xen_invalidate_map_cache. Can you come up with a -> -> > > > > > patch? -> -> > > > > I spoke too soon. In the regular case there should be no locked -> -> > > > > mappings -> -> > > > > when xen_invalidate_map_cache is called (see the DPRINTF warning at -> -> > > > > the -> -> > > > > beginning of the functions). Without locked mappings, there should -> -> > > > > never -> -> > > > > be more than one element in each list (see xen_map_cache_unlocked: -> -> > > > > entry->lock == true is a necessary condition to append a new entry -> -> > > > > to -> -> > > > > the list, otherwise it is just remapped). -> -> > > > > -> -> > > > > Can you confirm that what you are seeing are locked mappings -> -> > > > > when xen_invalidate_map_cache is called? To find out, enable the -> -> > > > > DPRINTK -> -> > > > > by turning it into a printf or by defininig MAPCACHE_DEBUG. -> -> > > > In fact, I think the DPRINTF above is incorrect too. In -> -> > > > pci_add_option_rom(), rtl8139 rom is locked mapped in -> -> > > > pci_add_option_rom->memory_region_get_ram_ptr (after -> -> > > > memory_region_init_ram). So actually I think we should remove the -> -> > > > DPRINTF warning as it is normal. -> -> > > Let me explain why the DPRINTF warning is there: emulated dma operations -> -> > > can involve locked mappings. Once a dma operation completes, the related -> -> > > mapping is unlocked and can be safely destroyed. But if we destroy a -> -> > > locked mapping in xen_invalidate_map_cache, while a dma is still -> -> > > ongoing, QEMU will crash. We cannot handle that case. 
-> -> > > -> -> > > However, the scenario you described is different. It has nothing to do -> -> > > with DMA. It looks like pci_add_option_rom calls -> -> > > memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a -> -> > > locked mapping and it is never unlocked or destroyed. -> -> > > -> -> > > It looks like "ptr" is not used after pci_add_option_rom returns. Does -> -> > > the append patch fix the problem you are seeing? For the proper fix, I -> -> > > think we probably need some sort of memory_region_unmap wrapper or maybe -> -> > > a call to address_space_unmap. -> -> > -> -> > Yes, I think so, maybe this is the proper way to fix this. -> -> -> -> Would you be up for sending a proper patch and testing it? We cannot call -> -> xen_invalidate_map_cache_entry directly from pci.c though, it would need -> -> to be one of the other functions like address_space_unmap for example. -> -> -> -> -> -Yes, I will look into this. -Any updates? - - -> -> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c -> -> > > index e6b08e1..04f98b7 100644 -> -> > > --- a/hw/pci/pci.c -> -> > > +++ b/hw/pci/pci.c -> -> > > @@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, -> -> > > bool -> -> > > is_default_rom, -> -> > > } -> -> > > pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom); -> -> > > + xen_invalidate_map_cache_entry(ptr); -> -> > > } -> -> > > static void pci_del_option_rom(PCIDevice *pdev) -> - diff --git a/results/classifier/02/other/55367348 b/results/classifier/02/other/55367348 deleted file mode 100644 index e8cf95103..000000000 --- a/results/classifier/02/other/55367348 +++ /dev/null @@ -1,533 +0,0 @@ -other: 0.626 -mistranslation: 0.615 -instruction: 0.572 -semantic: 0.555 -boot: 0.486 - -[Qemu-devel] [Bug] Docs build fails at interop.rst - -https://paste.fedoraproject.org/paste/kOPx4jhtUli---TmxSLrlw -running python3-sphinx-2.0.1-1.fc31.noarch on Fedora release 31 -(Rawhide) - -uname - a -Linux iouring 5.1.0-0.rc6.git3.1.fc31.x86_64 #1 SMP 
Thu Apr 25 14:25:32 -UTC 2019 x86_64 x86_64 x86_64 GNU/Linux - -Reverting commit 90edef80a0852cf8a3d2668898ee40e8970e431 -allows for the build to occur - -Regards -Aarushi Mehta - -On 5/20/19 7:30 AM, Aarushi Mehta wrote: -> -https://paste.fedoraproject.org/paste/kOPx4jhtUli---TmxSLrlw -> -running python3-sphinx-2.0.1-1.fc31.noarch on Fedora release 31 -> -(Rawhide) -> -> -uname - a -> -Linux iouring 5.1.0-0.rc6.git3.1.fc31.x86_64 #1 SMP Thu Apr 25 14:25:32 -> -UTC 2019 x86_64 x86_64 x86_64 GNU/Linux -> -> -Reverting commit 90edef80a0852cf8a3d2668898ee40e8970e431 -> -allows for the build to occur -> -> -Regards -> -Aarushi Mehta -> -> -Ah, dang. The blocks aren't strictly conforming json, but the version I -tested this under didn't seem to care. Your version is much newer. (I -was using 1.7 as provided by Fedora 29.)
-> -> -For now, try reverting 9e5b6cb87db66dfb606604fe6cf40e5ddf1ef0e7 instead, -> -which should at least turn off the "warnings as errors" option, but I -> -don't think that reverting -n will turn off this warning. -> -> -I'll try to get ahold of this newer version and see if I can't fix it -> -more appropriately. -> -> ---js -> -...Sigh, okay. - -So, I am still not actually sure what changed from pygments 2.2 and -sphinx 1.7 to pygments 2.4 and sphinx 2.0.1, but it appears as if Sphinx -by default always tries to do add a filter to the pygments lexer that -raises an error on highlighting failure, instead of the default behavior -which is to just highlight those errors in the output. There is no -option to Sphinx that I am aware of to retain this lexing behavior. -(Effectively, it's strict or nothing.) - -This approach, apparently, is broken in Sphinx 1.7/Pygments 2.2, so the -build works with our malformed json. - -There are a few options: - -1. Update conf.py to ignore these warnings (and all future lexing -errors), and settle for the fact that there will be no QMP highlighting -wherever we use the directionality indicators ('->', '<-'). - -2. Update bitmaps.rst to remove the directionality indicators. - -3. Update bitmaps.rst to format the QMP blocks as raw text instead of JSON. - -4. Update bitmaps.rst to remove the "json" specification from the code -block. This will cause sphinx to "guess" the formatting, and the -pygments guesser will decide it's Python3. - -This will parse well enough, but will mis-highlight 'true' and 'false' -which are not python keywords. This approach may break in the future if -the Python3 lexer is upgraded to be stricter (because '->' and '<-' are -still invalid), and leaves us at the mercy of both the guesser and the -lexer. - -I'm not actually sure what I dislike the least; I think I dislike #1 the -most. #4 gets us most of what we want but is perhaps porcelain. 
- -I suspect if we attempt to move more of our documentation to ReST and -Sphinx that we will need to answer for ourselves how we intend to -document QMP code flow examples. - ---js - -On Mon, May 20, 2019 at 05:25:28PM -0400, John Snow wrote: -> -> -> -On 5/20/19 12:37 PM, John Snow wrote: -> -> -> -> -> -> On 5/20/19 7:30 AM, Aarushi Mehta wrote: -> ->> -https://paste.fedoraproject.org/paste/kOPx4jhtUli---TmxSLrlw -> ->> running python3-sphinx-2.0.1-1.fc31.noarch on Fedora release 31 -> ->> (Rawhide) -> ->> -> ->> uname - a -> ->> Linux iouring 5.1.0-0.rc6.git3.1.fc31.x86_64 #1 SMP Thu Apr 25 14:25:32 -> ->> UTC 2019 x86_64 x86_64 x86_64 GNU/Linux -> ->> -> ->> Reverting commmit 90edef80a0852cf8a3d2668898ee40e8970e431 -> ->> allows for the build to occur -> ->> -> ->> Regards -> ->> Aarushi Mehta -> ->> -> ->> -> -> -> -> Ah, dang. The blocks aren't strictly conforming json, but the version I -> -> tested this under didn't seem to care. Your version is much newer. (I -> -> was using 1.7 as provided by Fedora 29.) -> -> -> -> For now, try reverting 9e5b6cb87db66dfb606604fe6cf40e5ddf1ef0e7 instead, -> -> which should at least turn off the "warnings as errors" option, but I -> -> don't think that reverting -n will turn off this warning. -> -> -> -> I'll try to get ahold of this newer version and see if I can't fix it -> -> more appropriately. -> -> -> -> --js -> -> -> -> -...Sigh, okay. -> -> -So, I am still not actually sure what changed from pygments 2.2 and -> -sphinx 1.7 to pygments 2.4 and sphinx 2.0.1, but it appears as if Sphinx -> -by default always tries to do add a filter to the pygments lexer that -> -raises an error on highlighting failure, instead of the default behavior -> -which is to just highlight those errors in the output. There is no -> -option to Sphinx that I am aware of to retain this lexing behavior. -> -(Effectively, it's strict or nothing.) 
-> -> -This approach, apparently, is broken in Sphinx 1.7/Pygments 2.2, so the -> -build works with our malformed json. -> -> -There are a few options: -> -> -1. Update conf.py to ignore these warnings (and all future lexing -> -errors), and settle for the fact that there will be no QMP highlighting -> -wherever we use the directionality indicators ('->', '<-'). -> -> -2. Update bitmaps.rst to remove the directionality indicators. -> -> -3. Update bitmaps.rst to format the QMP blocks as raw text instead of JSON. -> -> -4. Update bitmaps.rst to remove the "json" specification from the code -> -block. This will cause sphinx to "guess" the formatting, and the -> -pygments guesser will decide it's Python3. -> -> -This will parse well enough, but will mis-highlight 'true' and 'false' -> -which are not python keywords. This approach may break in the future if -> -the Python3 lexer is upgraded to be stricter (because '->' and '<-' are -> -still invalid), and leaves us at the mercy of both the guesser and the -> -lexer. -> -> -I'm not actually sure what I dislike the least; I think I dislike #1 the -> -most. #4 gets us most of what we want but is perhaps porcelain. -> -> -I suspect if we attempt to move more of our documentation to ReST and -> -Sphinx that we will need to answer for ourselves how we intend to -> -document QMP code flow examples. -Writing a custom lexer that handles "<-" and "->" was simple (see below). - -Now, is it possible to convince Sphinx to register and use a custom lexer? 
- -$ cat > /tmp/lexer.py <<EOF -from pygments.lexer import RegexLexer, DelegatingLexer -from pygments.lexers.data import JsonLexer -import re -from pygments.token import * - -class QMPExampleMarkersLexer(RegexLexer): - tokens = { - 'root': [ - (r' *-> *', Generic.Prompt), - (r' *<- *', Generic.Output), - ] - } - -class QMPExampleLexer(DelegatingLexer): - def __init__(self, **options): - super(QMPExampleLexer, self).__init__(JsonLexer, -QMPExampleMarkersLexer, Error, **options) -EOF -$ pygmentize -l /tmp/lexer.py:QMPExampleLexer -x -f html <<EOF - -> { - "execute": "drive-backup", - "arguments": { - "device": "drive0", - "bitmap": "bitmap0", - "target": "drive0.inc0.qcow2", - "format": "qcow2", - "sync": "incremental", - "mode": "existing" - } - } - - <- { "return": {} } -EOF -<div class="highlight"><pre><span></span><span class="gp"> -> -</span><span class="p">{</span> - <span class="nt">"execute"</span><span class="p">:</span> -<span class="s2">"drive-backup"</span><span class="p">,</span> - <span class="nt">"arguments"</span><span class="p">:</span> -<span class="p">{</span> - <span class="nt">"device"</span><span class="p">:</span> -<span class="s2">"drive0"</span><span class="p">,</span> - <span class="nt">"bitmap"</span><span class="p">:</span> -<span class="s2">"bitmap0"</span><span class="p">,</span> - <span class="nt">"target"</span><span class="p">:</span> -<span class="s2">"drive0.inc0.qcow2"</span><span class="p">,</span> - <span class="nt">"format"</span><span class="p">:</span> -<span class="s2">"qcow2"</span><span class="p">,</span> - <span class="nt">"sync"</span><span class="p">:</span> -<span class="s2">"incremental"</span><span class="p">,</span> - <span class="nt">"mode"</span><span class="p">:</span> -<span class="s2">"existing"</span> - <span class="p">}</span> - <span class="p">}</span> - -<span class="go"> <- </span><span class="p">{</span> <span -class="nt">"return"</span><span class="p">:</span> <span -class="p">{}</span> <span 
class="p">}</span> -</pre></div> -$ - - --- -Eduardo - -On 5/20/19 7:04 PM, Eduardo Habkost wrote: -> -On Mon, May 20, 2019 at 05:25:28PM -0400, John Snow wrote: -> -> -> -> -> -> On 5/20/19 12:37 PM, John Snow wrote: -> ->> -> ->> -> ->> On 5/20/19 7:30 AM, Aarushi Mehta wrote: -> ->>> -https://paste.fedoraproject.org/paste/kOPx4jhtUli---TmxSLrlw -> ->>> running python3-sphinx-2.0.1-1.fc31.noarch on Fedora release 31 -> ->>> (Rawhide) -> ->>> -> ->>> uname - a -> ->>> Linux iouring 5.1.0-0.rc6.git3.1.fc31.x86_64 #1 SMP Thu Apr 25 14:25:32 -> ->>> UTC 2019 x86_64 x86_64 x86_64 GNU/Linux -> ->>> -> ->>> Reverting commmit 90edef80a0852cf8a3d2668898ee40e8970e431 -> ->>> allows for the build to occur -> ->>> -> ->>> Regards -> ->>> Aarushi Mehta -> ->>> -> ->>> -> ->> -> ->> Ah, dang. The blocks aren't strictly conforming json, but the version I -> ->> tested this under didn't seem to care. Your version is much newer. (I -> ->> was using 1.7 as provided by Fedora 29.) -> ->> -> ->> For now, try reverting 9e5b6cb87db66dfb606604fe6cf40e5ddf1ef0e7 instead, -> ->> which should at least turn off the "warnings as errors" option, but I -> ->> don't think that reverting -n will turn off this warning. -> ->> -> ->> I'll try to get ahold of this newer version and see if I can't fix it -> ->> more appropriately. -> ->> -> ->> --js -> ->> -> -> -> -> ...Sigh, okay. -> -> -> -> So, I am still not actually sure what changed from pygments 2.2 and -> -> sphinx 1.7 to pygments 2.4 and sphinx 2.0.1, but it appears as if Sphinx -> -> by default always tries to do add a filter to the pygments lexer that -> -> raises an error on highlighting failure, instead of the default behavior -> -> which is to just highlight those errors in the output. There is no -> -> option to Sphinx that I am aware of to retain this lexing behavior. -> -> (Effectively, it's strict or nothing.) 
-> -> -> -> This approach, apparently, is broken in Sphinx 1.7/Pygments 2.2, so the -> -> build works with our malformed json. -> -> -> -> There are a few options: -> -> -> -> 1. Update conf.py to ignore these warnings (and all future lexing -> -> errors), and settle for the fact that there will be no QMP highlighting -> -> wherever we use the directionality indicators ('->', '<-'). -> -> -> -> 2. Update bitmaps.rst to remove the directionality indicators. -> -> -> -> 3. Update bitmaps.rst to format the QMP blocks as raw text instead of JSON. -> -> -> -> 4. Update bitmaps.rst to remove the "json" specification from the code -> -> block. This will cause sphinx to "guess" the formatting, and the -> -> pygments guesser will decide it's Python3. -> -> -> -> This will parse well enough, but will mis-highlight 'true' and 'false' -> -> which are not python keywords. This approach may break in the future if -> -> the Python3 lexer is upgraded to be stricter (because '->' and '<-' are -> -> still invalid), and leaves us at the mercy of both the guesser and the -> -> lexer. -> -> -> -> I'm not actually sure what I dislike the least; I think I dislike #1 the -> -> most. #4 gets us most of what we want but is perhaps porcelain. -> -> -> -> I suspect if we attempt to move more of our documentation to ReST and -> -> Sphinx that we will need to answer for ourselves how we intend to -> -> document QMP code flow examples. -> -> -Writing a custom lexer that handles "<-" and "->" was simple (see below). -> -> -Now, is it possible to convince Sphinx to register and use a custom lexer? -> -Spoilers, yes, and I've sent a patch to list. Thanks for your help! 
- diff --git a/results/classifier/02/other/55753058 b/results/classifier/02/other/55753058 deleted file mode 100644 index e572849ab..000000000 --- a/results/classifier/02/other/55753058 +++ /dev/null @@ -1,294 +0,0 @@ -other: 0.734 -mistranslation: 0.649 -instruction: 0.581 -semantic: 0.577 -boot: 0.478 - -[RESEND][BUG FIX HELP] QEMU main thread endlessly hangs in __ppoll() - -Hi Genius, -I am a user of QEMU v4.2.0 and stuck in an interesting bug, which may still -exist in the mainline. -Thanks in advance to heroes who can take a look and share understanding. - -The qemu main thread endlessly hangs in the handle of the qmp statement: -{'execute': 'human-monitor-command', 'arguments':{ 'command-line': -'drive_del replication0' } } -and we have the call trace looks like: -#0 0x00007f3c22045bf6 in __ppoll (fds=0x555611328410, nfds=1, -timeout=<optimized out>, timeout@entry=0x7ffc56c66db0, -sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44 -#1 0x000055561021f415 in ppoll (__ss=0x0, __timeout=0x7ffc56c66db0, -__nfds=<optimized out>, __fds=<optimized out>) -at /usr/include/x86_64-linux-gnu/bits/poll2.h:77 -#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, -timeout=<optimized out>) at util/qemu-timer.c:348 -#3 0x0000555610221430 in aio_poll (ctx=ctx@entry=0x5556113010f0, -blocking=blocking@entry=true) at util/aio-posix.c:669 -#4 0x000055561019268d in bdrv_do_drained_begin (poll=true, -ignore_bds_parents=false, parent=0x0, recursive=false, -bs=0x55561138b0a0) at block/io.c:430 -#5 bdrv_do_drained_begin (bs=0x55561138b0a0, recursive=<optimized out>, -parent=0x0, ignore_bds_parents=<optimized out>, -poll=<optimized out>) at block/io.c:396 -#6 0x000055561017b60b in quorum_del_child (bs=0x55561138b0a0, -child=0x7f36dc0ce380, errp=<optimized out>) -at block/quorum.c:1063 -#7 0x000055560ff5836b in qmp_x_blockdev_change (parent=0x555612373120 -"colo-disk0", has_child=<optimized out>, -child=0x5556112df3e0 "children.1", has_node=<optimized out>, 
node=0x0, -errp=0x7ffc56c66f98) at blockdev.c:4494 -#8 0x00005556100f8f57 in qmp_marshal_x_blockdev_change (args=<optimized -out>, ret=<optimized out>, errp=0x7ffc56c67018) -at qapi/qapi-commands-block-core.c:1538 -#9 0x00005556101d8290 in do_qmp_dispatch (errp=0x7ffc56c67010, -allow_oob=<optimized out>, request=<optimized out>, -cmds=0x5556109c69a0 <qmp_commands>) at qapi/qmp-dispatch.c:132 -#10 qmp_dispatch (cmds=0x5556109c69a0 <qmp_commands>, request=<optimized -out>, allow_oob=<optimized out>) -at qapi/qmp-dispatch.c:175 -#11 0x00005556100d4c4d in monitor_qmp_dispatch (mon=0x5556113a6f40, -req=<optimized out>) at monitor/qmp.c:145 -#12 0x00005556100d5437 in monitor_qmp_bh_dispatcher (data=<optimized out>) -at monitor/qmp.c:234 -#13 0x000055561021dbec in aio_bh_call (bh=0x5556112164b0) at -util/async.c:117 -#14 aio_bh_poll (ctx=ctx@entry=0x5556112151b0) at util/async.c:117 -#15 0x00005556102212c4 in aio_dispatch (ctx=0x5556112151b0) at -util/aio-posix.c:459 -#16 0x000055561021dab2 in aio_ctx_dispatch (source=<optimized out>, -callback=<optimized out>, user_data=<optimized out>) -at util/async.c:260 -#17 0x00007f3c22302fbd in g_main_context_dispatch () from -/lib/x86_64-linux-gnu/libglib-2.0.so.0 -#18 0x0000555610220358 in glib_pollfds_poll () at util/main-loop.c:219 -#19 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 -#20 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 -#21 0x000055560ff600fe in main_loop () at vl.c:1814 -#22 0x000055560fddbce9 in main (argc=<optimized out>, argv=<optimized out>, -envp=<optimized out>) at vl.c:4503 -We found that we're doing endless check in the line of -block/io.c:bdrv_do_drained_begin(): -BDRV_POLL_WHILE(bs, bdrv_drain_poll_top_level(bs, recursive, parent)); -and it turns out that the bdrv_drain_poll() always get true from: -- bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents) -- AND atomic_read(&bs->in_flight) - -I personally think this is a deadlock issue in
the a QEMU block layer -(as we know, we have some #FIXME comments in related codes, such as block -permisson update). -Any comments are welcome and appreciated. - ---- -thx,likexu - -On 2/28/21 9:39 PM, Like Xu wrote: -Hi Genius, -I am a user of QEMU v4.2.0 and stuck in an interesting bug, which may -still exist in the mainline. -Thanks in advance to heroes who can take a look and share understanding. -Do you have a test case that reproduces on 5.2? It'd be nice to know if -it was still a problem in the latest source tree or not. ---js -The qemu main thread endlessly hangs in the handle of the qmp statement: -{'execute': 'human-monitor-command', 'arguments':{ 'command-line': -'drive_del replication0' } } -and we have the call trace looks like: -#0 0x00007f3c22045bf6 in __ppoll (fds=0x555611328410, nfds=1, -timeout=<optimized out>, timeout@entry=0x7ffc56c66db0, -sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44 -#1 0x000055561021f415 in ppoll (__ss=0x0, __timeout=0x7ffc56c66db0, -__nfds=<optimized out>, __fds=<optimized out>) -at /usr/include/x86_64-linux-gnu/bits/poll2.h:77 -#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, -timeout=<optimized out>) at util/qemu-timer.c:348 -#3 0x0000555610221430 in aio_poll (ctx=ctx@entry=0x5556113010f0, -blocking=blocking@entry=true) at util/aio-posix.c:669 -#4 0x000055561019268d in bdrv_do_drained_begin (poll=true, -ignore_bds_parents=false, parent=0x0, recursive=false, -bs=0x55561138b0a0) at block/io.c:430 -#5 bdrv_do_drained_begin (bs=0x55561138b0a0, recursive=<optimized out>, -parent=0x0, ignore_bds_parents=<optimized out>, -poll=<optimized out>) at block/io.c:396 -#6 0x000055561017b60b in quorum_del_child (bs=0x55561138b0a0, -child=0x7f36dc0ce380, errp=<optimized out>) -at block/quorum.c:1063 -#7 0x000055560ff5836b in qmp_x_blockdev_change (parent=0x555612373120 -"colo-disk0", has_child=<optimized out>, -child=0x5556112df3e0 "children.1", has_node=<optimized out>, node=0x0, -errp=0x7ffc56c66f98) 
at blockdev.c:4494 -#8 0x00005556100f8f57 in qmp_marshal_x_blockdev_change (args=<optimized -out>, ret=<optimized out>, errp=0x7ffc56c67018) -at qapi/qapi-commands-block-core.c:1538 -#9 0x00005556101d8290 in do_qmp_dispatch (errp=0x7ffc56c67010, -allow_oob=<optimized out>, request=<optimized out>, -cmds=0x5556109c69a0 <qmp_commands>) at qapi/qmp-dispatch.c:132 -#10 qmp_dispatch (cmds=0x5556109c69a0 <qmp_commands>, request=<optimized -out>, allow_oob=<optimized out>) -at qapi/qmp-dispatch.c:175 -#11 0x00005556100d4c4d in monitor_qmp_dispatch (mon=0x5556113a6f40, -req=<optimized out>) at monitor/qmp.c:145 -#12 0x00005556100d5437 in monitor_qmp_bh_dispatcher (data=<optimized -out>) at monitor/qmp.c:234 -#13 0x000055561021dbec in aio_bh_call (bh=0x5556112164bGrateful0) at -util/async.c:117 -#14 aio_bh_poll (ctx=ctx@entry=0x5556112151b0) at util/async.c:117 -#15 0x00005556102212c4 in aio_dispatch (ctx=0x5556112151b0) at -util/aio-posix.c:459 -#16 0x000055561021dab2 in aio_ctx_dispatch (source=<optimized out>, -callback=<optimized out>, user_data=<optimized out>) -at util/async.c:260 -#17 0x00007f3c22302fbd in g_main_context_dispatch () from -/lib/x86_64-linux-gnu/libglib-2.0.so.0 -#18 0x0000555610220358 in glib_pollfds_poll () at util/main-loop.c:219 -#19 os_host_main_loop_wait (timeout=<optimized out>) at -util/main-loop.c:242 -#20 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 -#21 0x000055560ff600fe in main_loop () at vl.c:1814 -#22 0x000055560fddbce9 in main (argc=<optimized out>, argv=<optimized -out>, envp=<optimized out>) at vl.c:4503 -We found that we're doing endless check in the line of -block/io.c:bdrv_do_drained_begin(): -    BDRV_POLL_WHILE(bs, bdrv_drain_poll_top_level(bs, recursive, parent)); -and it turns out that the bdrv_drain_poll() always get true from: -- bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents) -- AND atomic_read(&bs->in_flight) - -I personally think this is a deadlock issue in the a QEMU block layer 
-(as we know, we have some #FIXME comments in related codes, such as -block permisson update). -Any comments are welcome and appreciated. - ---- -thx,likexu - -Hi John, - -Thanks for your comment. - -On 2021/3/5 7:53, John Snow wrote: -On 2/28/21 9:39 PM, Like Xu wrote: -Hi Genius, -I am a user of QEMU v4.2.0 and stuck in an interesting bug, which may -still exist in the mainline. -Thanks in advance to heroes who can take a look and share understanding. -Do you have a test case that reproduces on 5.2? It'd be nice to know if it -was still a problem in the latest source tree or not. -We narrowed down the source of the bug, which basically came from -the following qmp usage: -{'execute': 'human-monitor-command', 'arguments':{ 'command-line': -'drive_del replication0' } } -One of the test cases is the COLO usage (docs/colo-proxy.txt). - -This issue is sporadic,the probability may be 1/15 for a io-heavy guest. - -I believe it's reproducible on 5.2 and the latest tree. ---js -The qemu main thread endlessly hangs in the handle of the qmp statement: -{'execute': 'human-monitor-command', 'arguments':{ 'command-line': -'drive_del replication0' } } -and we have the call trace looks like: -#0 0x00007f3c22045bf6 in __ppoll (fds=0x555611328410, nfds=1, -timeout=<optimized out>, timeout@entry=0x7ffc56c66db0, -sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44 -#1 0x000055561021f415 in ppoll (__ss=0x0, __timeout=0x7ffc56c66db0, -__nfds=<optimized out>, __fds=<optimized out>) -at /usr/include/x86_64-linux-gnu/bits/poll2.h:77 -#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, -timeout=<optimized out>) at util/qemu-timer.c:348 -#3 0x0000555610221430 in aio_poll (ctx=ctx@entry=0x5556113010f0, -blocking=blocking@entry=true) at util/aio-posix.c:669 -#4 0x000055561019268d in bdrv_do_drained_begin (poll=true, -ignore_bds_parents=false, parent=0x0, recursive=false, -bs=0x55561138b0a0) at block/io.c:430 -#5 bdrv_do_drained_begin (bs=0x55561138b0a0, 
recursive=<optimized out>, -parent=0x0, ignore_bds_parents=<optimized out>, -poll=<optimized out>) at block/io.c:396 -#6 0x000055561017b60b in quorum_del_child (bs=0x55561138b0a0, -child=0x7f36dc0ce380, errp=<optimized out>) -at block/quorum.c:1063 -#7 0x000055560ff5836b in qmp_x_blockdev_change (parent=0x555612373120 -"colo-disk0", has_child=<optimized out>, -child=0x5556112df3e0 "children.1", has_node=<optimized out>, node=0x0, -errp=0x7ffc56c66f98) at blockdev.c:4494 -#8 0x00005556100f8f57 in qmp_marshal_x_blockdev_change (args=<optimized -out>, ret=<optimized out>, errp=0x7ffc56c67018) -at qapi/qapi-commands-block-core.c:1538 -#9 0x00005556101d8290 in do_qmp_dispatch (errp=0x7ffc56c67010, -allow_oob=<optimized out>, request=<optimized out>, -cmds=0x5556109c69a0 <qmp_commands>) at qapi/qmp-dispatch.c:132 -#10 qmp_dispatch (cmds=0x5556109c69a0 <qmp_commands>, request=<optimized -out>, allow_oob=<optimized out>) -at qapi/qmp-dispatch.c:175 -#11 0x00005556100d4c4d in monitor_qmp_dispatch (mon=0x5556113a6f40, -req=<optimized out>) at monitor/qmp.c:145 -#12 0x00005556100d5437 in monitor_qmp_bh_dispatcher (data=<optimized -out>) at monitor/qmp.c:234 -#13 0x000055561021dbec in aio_bh_call (bh=0x5556112164bGrateful0) at -util/async.c:117 -#14 aio_bh_poll (ctx=ctx@entry=0x5556112151b0) at util/async.c:117 -#15 0x00005556102212c4 in aio_dispatch (ctx=0x5556112151b0) at -util/aio-posix.c:459 -#16 0x000055561021dab2 in aio_ctx_dispatch (source=<optimized out>, -callback=<optimized out>, user_data=<optimized out>) -at util/async.c:260 -#17 0x00007f3c22302fbd in g_main_context_dispatch () from -/lib/x86_64-linux-gnu/libglib-2.0.so.0 -#18 0x0000555610220358 in glib_pollfds_poll () at util/main-loop.c:219 -#19 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 -#20 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 -#21 0x000055560ff600fe in main_loop () at vl.c:1814 -#22 0x000055560fddbce9 in main (argc=<optimized out>, argv=<optimized 
-out>, envp=<optimized out>) at vl.c:4503 -We found that we're doing endless check in the line of -block/io.c:bdrv_do_drained_begin(): -     BDRV_POLL_WHILE(bs, bdrv_drain_poll_top_level(bs, recursive, parent)); -and it turns out that the bdrv_drain_poll() always get true from: -- bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents) -- AND atomic_read(&bs->in_flight) - -I personally think this is a deadlock issue in the a QEMU block layer -(as we know, we have some #FIXME comments in related codes, such as block -permisson update). -Any comments are welcome and appreciated. - ---- -thx,likexu - -On 3/4/21 10:08 PM, Like Xu wrote: -Hi John, - -Thanks for your comment. - -On 2021/3/5 7:53, John Snow wrote: -On 2/28/21 9:39 PM, Like Xu wrote: -Hi Genius, -I am a user of QEMU v4.2.0 and stuck in an interesting bug, which may -still exist in the mainline. -Thanks in advance to heroes who can take a look and share understanding. -Do you have a test case that reproduces on 5.2? It'd be nice to know -if it was still a problem in the latest source tree or not. -We narrowed down the source of the bug, which basically came from -the following qmp usage: -{'execute': 'human-monitor-command', 'arguments':{ 'command-line': -'drive_del replication0' } } -One of the test cases is the COLO usage (docs/colo-proxy.txt). - -This issue is sporadic,the probability may be 1/15 for a io-heavy guest. - -I believe it's reproducible on 5.2 and the latest tree. -Can you please test and confirm that this is the case, and then file a -bug report on the LP: -https://launchpad.net/qemu -and include: -- The exact commit you used (current origin/master debug build would be -the most ideal.) -- Which QEMU binary you are using (qemu-system-x86_64?) 
-- The shortest command line you are aware of that reproduces the problem -- The host OS and kernel version -- An updated call trace -- Any relevant commands issued prior to the one that caused the hang; or -detailed reproduction steps if possible. -Thanks, ---js - diff --git a/results/classifier/02/other/56309929 b/results/classifier/02/other/56309929 deleted file mode 100644 index 94bfa27b4..000000000 --- a/results/classifier/02/other/56309929 +++ /dev/null @@ -1,181 +0,0 @@ -other: 0.690 -instruction: 0.581 -boot: 0.578 -mistranslation: 0.554 -semantic: 0.521 - -[Qemu-devel] [BUG 2.6] Broken CONFIG_TPM? - -A compilation test with clang -Weverything reported this problem: - -config-host.h:112:20: warning: '$' in identifier -[-Wdollar-in-identifier-extension] - -The line of code looks like this: - -#define CONFIG_TPM $(CONFIG_SOFTMMU) - -This is fine for Makefile code, but won't work as expected in C code. - -Am 28.04.2016 um 22:33 schrieb Stefan Weil: -> -A compilation test with clang -Weverything reported this problem: -> -> -config-host.h:112:20: warning: '$' in identifier -> -[-Wdollar-in-identifier-extension] -> -> -The line of code looks like this: -> -> -#define CONFIG_TPM $(CONFIG_SOFTMMU) -> -> -This is fine for Makefile code, but won't work as expected in C code. -> -A complete 64 bit build with clang -Weverything creates a log file of -1.7 GB. 
-Here are the uniq warnings sorted by their frequency: - - 1 -Wflexible-array-extensions - 1 -Wgnu-folding-constant - 1 -Wunknown-pragmas - 1 -Wunknown-warning-option - 1 -Wunreachable-code-loop-increment - 2 -Warray-bounds-pointer-arithmetic - 2 -Wdollar-in-identifier-extension - 3 -Woverlength-strings - 3 -Wweak-vtables - 4 -Wgnu-empty-struct - 4 -Wstring-conversion - 6 -Wclass-varargs - 7 -Wc99-extensions - 7 -Wc++-compat - 8 -Wfloat-equal - 11 -Wformat-nonliteral - 16 -Wshift-negative-value - 19 -Wglobal-constructors - 28 -Wc++11-long-long - 29 -Wembedded-directive - 38 -Wvla - 40 -Wcovered-switch-default - 40 -Wmissing-variable-declarations - 49 -Wold-style-cast - 53 -Wgnu-conditional-omitted-operand - 56 -Wformat-pedantic - 61 -Wvariadic-macros - 77 -Wc++11-extensions - 83 -Wgnu-flexible-array-initializer - 83 -Wzero-length-array - 96 -Wgnu-designator - 102 -Wmissing-noreturn - 103 -Wconditional-uninitialized - 107 -Wdisabled-macro-expansion - 115 -Wunreachable-code-return - 134 -Wunreachable-code - 243 -Wunreachable-code-break - 257 -Wfloat-conversion - 280 -Wswitch-enum - 291 -Wpointer-arith - 298 -Wshadow - 378 -Wassign-enum - 395 -Wused-but-marked-unused - 420 -Wreserved-id-macro - 493 -Wdocumentation - 510 -Wshift-sign-overflow - 565 -Wgnu-case-range - 566 -Wgnu-zero-variadic-macro-arguments - 650 -Wbad-function-cast - 705 -Wmissing-field-initializers - 817 -Wgnu-statement-expression - 968 -Wdocumentation-unknown-command - 1021 -Wextra-semi - 1112 -Wgnu-empty-initializer - 1138 -Wcast-qual - 1509 -Wcast-align - 1766 -Wextended-offsetof - 1937 -Wsign-compare - 2130 -Wpacked - 2404 -Wunused-macros - 3081 -Wpadded - 4182 -Wconversion - 5430 -Wlanguage-extension-token - 6655 -Wshorten-64-to-32 - 6995 -Wpedantic - 7354 -Wunused-parameter - 27659 -Wsign-conversion - -Stefan Weil <address@hidden> writes: - -> -A compilation test with clang -Weverything reported this problem: -> -> -config-host.h:112:20: warning: '$' in identifier -> 
-[-Wdollar-in-identifier-extension] -> -> -The line of code looks like this: -> -> -#define CONFIG_TPM $(CONFIG_SOFTMMU) -> -> -This is fine for Makefile code, but won't work as expected in C code. -Broken in commit 3b8acc1 "configure: fix TPM logic". Cc'ing Paolo. - -Impact: #ifdef CONFIG_TPM never disables code. There are no other uses -of CONFIG_TPM in C code. - -I had a quick peek at configure and create_config, but refrained from -attempting to fix this, since I don't understand when exactly CONFIG_TPM -should be defined. - -On 29 April 2016 at 08:42, Markus Armbruster <address@hidden> wrote: -> -Stefan Weil <address@hidden> writes: -> -> -> A compilation test with clang -Weverything reported this problem: -> -> -> -> config-host.h:112:20: warning: '$' in identifier -> -> [-Wdollar-in-identifier-extension] -> -> -> -> The line of code looks like this: -> -> -> -> #define CONFIG_TPM $(CONFIG_SOFTMMU) -> -> -> -> This is fine for Makefile code, but won't work as expected in C code. -> -> -Broken in commit 3b8acc1 "configure: fix TPM logic". Cc'ing Paolo. -> -> -Impact: #ifdef CONFIG_TPM never disables code. There are no other uses -> -of CONFIG_TPM in C code. -> -> -I had a quick peek at configure and create_config, but refrained from -> -attempting to fix this, since I don't understand when exactly CONFIG_TPM -> -should be defined. -Looking at 'git blame' suggests this has been wrong like this for -some years, so we don't need to scramble to fix it for 2.6. - -thanks --- PMM - diff --git a/results/classifier/02/other/56937788 b/results/classifier/02/other/56937788 deleted file mode 100644 index fc61ffe71..000000000 --- a/results/classifier/02/other/56937788 +++ /dev/null @@ -1,345 +0,0 @@ -other: 0.791 -mistranslation: 0.735 -semantic: 0.705 -instruction: 0.653 -boot: 0.636 - -[Qemu-devel] [Bug] virtio-blk: qemu will crash if hotplug virtio-blk device failed - -I found that hotplug virtio-blk device will lead to qemu crash. - -Re-production steps: - -1. 
Run VM named vm001 - -2. Create a virtio-blk.xml which contains wrong configurations: -<disk device="lun" rawio="yes" type="block"> - <driver cache="none" io="native" name="qemu" type="raw" /> - <source dev="/dev/mapper/11-dm" /> - <target bus="virtio" dev="vdx" /> -</disk> - -3. Run command : virsh attach-device vm001 vm001 - -Libvirt will return err msg: - -error: Failed to attach device from blk-scsi.xml - -error: internal error: unable to execute QEMU command 'device_add': Please set -scsi=off for virtio-blk devices in order to use virtio 1.0 - -it means hotplug virtio-blk device failed. - -4. Suspend or shutdown VM will lead to qemu crash - - - -from gdb: - - -(gdb) bt -#0 object_get_class (address@hidden) at qom/object.c:750 -#1 0x00007f9a72582e01 in virtio_vmstate_change (opaque=0x7f9a73d10960, -running=0, state=<optimized out>) at -/mnt/sdb/lzc/code/open/qemu/hw/virtio/virtio.c:2203 -#2 0x00007f9a7261ef52 in vm_state_notify (address@hidden, address@hidden) at -vl.c:1685 -#3 0x00007f9a7252603a in do_vm_stop (state=RUN_STATE_PAUSED) at -/mnt/sdb/lzc/code/open/qemu/cpus.c:941 -#4 vm_stop (address@hidden) at /mnt/sdb/lzc/code/open/qemu/cpus.c:1807 -#5 0x00007f9a7262eb1b in qmp_stop (address@hidden) at qmp.c:102 -#6 0x00007f9a7262c70a in qmp_marshal_stop (args=<optimized out>, -ret=<optimized out>, errp=0x7ffe63e255d8) at qmp-marshal.c:5854 -#7 0x00007f9a72897e79 in do_qmp_dispatch (errp=0x7ffe63e255d0, -request=0x7f9a76510120, cmds=0x7f9a72ee7980 <qmp_commands>) at -qapi/qmp-dispatch.c:104 -#8 qmp_dispatch (cmds=0x7f9a72ee7980 <qmp_commands>, address@hidden) at -qapi/qmp-dispatch.c:131 -#9 0x00007f9a725288d5 in handle_qmp_command (parser=<optimized out>, -tokens=<optimized out>) at /mnt/sdb/lzc/code/open/qemu/monitor.c:3852 -#10 0x00007f9a7289d514 in json_message_process_token (lexer=0x7f9a73ce4498, -input=0x7f9a73cc6880, type=JSON_RCURLY, x=36, y=17) at -qobject/json-streamer.c:105 -#11 0x00007f9a728bb69b in json_lexer_feed_char (address@hidden, ch=125 '}',
-address@hidden) at qobject/json-lexer.c:323 -#12 0x00007f9a728bb75e in json_lexer_feed (lexer=0x7f9a73ce4498, -buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:373 -#13 0x00007f9a7289d5d9 in json_message_parser_feed (parser=<optimized out>, -buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124 -#14 0x00007f9a7252722e in monitor_qmp_read (opaque=<optimized out>, -buf=<optimized out>, size=<optimized out>) at -/mnt/sdb/lzc/code/open/qemu/monitor.c:3894 -#15 0x00007f9a7284ee1b in tcp_chr_read (chan=<optimized out>, cond=<optimized -out>, opaque=<optimized out>) at chardev/char-socket.c:441 -#16 0x00007f9a6e03e99a in g_main_context_dispatch () from -/usr/lib64/libglib-2.0.so.0 -#17 0x00007f9a728a342c in glib_pollfds_poll () at util/main-loop.c:214 -#18 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 -#19 main_loop_wait (address@hidden) at util/main-loop.c:515 -#20 0x00007f9a724e7547 in main_loop () at vl.c:1999 -#21 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at -vl.c:4877 - -Problem happens in virtio_vmstate_change which is called by vm_state_notify, -static void virtio_vmstate_change(void *opaque, int running, RunState state) -{ - VirtIODevice *vdev = opaque; - BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); - VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); - bool backend_run = running && (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK); - vdev->vm_running = running; - - if (backend_run) { - virtio_set_status(vdev, vdev->status); - } - - if (k->vmstate_change) { - k->vmstate_change(qbus->parent, backend_run); - } - - if (!backend_run) { - virtio_set_status(vdev, vdev->status); - } -} - -Vdev's parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash. 
-virtio_vmstate_change is added to the list vm_change_state_head at
-virtio_blk_device_realize(virtio_init),
-but after the virtio-blk hotplug failed, virtio_vmstate_change is not removed
-from vm_change_state_head.
-
-
-I apply a patch as follows:
-
-
-diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
-index 5884ce3..ea532dc 100644
---- a/hw/virtio/virtio.c
-+++ b/hw/virtio/virtio.c
-@@ -2491,6 +2491,7 @@ static void virtio_device_realize(DeviceState *dev, Error
-**errp)
- virtio_bus_device_plugged(vdev, &err);
- if (err != NULL) {
- error_propagate(errp, err);
-+ vdc->unrealize(dev, NULL);
- return;
- }
-
-On Tue, Oct 31, 2017 at 05:19:08AM +0000, linzhecheng wrote:
->
-I found that hotplug virtio-blk device will lead to qemu crash.
-The author posted a patch in a separate email thread. Please see
-"[PATCH] fix: unrealize virtio device if we fail to hotplug it".
-
->
-Re-production steps:
->
->
-1. Run VM named vm001
->
->
-2. Create a virtio-blk.xml which contains wrong configurations:
->
-<disk device="lun" rawio="yes" type="block">
->
-<driver cache="none" io="native" name="qemu" type="raw" />
->
-<source dev="/dev/mapper/11-dm" />
->
-<target bus="virtio" dev="vdx" />
->
-</disk>
->
->
-3. Run command : virsh attach-device vm001 vm001
->
->
-Libvirt will return err msg:
->
->
-error: Failed to attach device from blk-scsi.xml
->
->
-error: internal error: unable to execute QEMU command 'device_add': Please
->
-set scsi=off for virtio-blk devices in order to use virtio 1.0
->
->
-it means hotplug virtio-blk device failed.
->
->
-4.
Suspend or shutdown VM will leads to qemu crash -> -> -> -> -from gdb: -> -> -> -(gdb) bt -> -#0 object_get_class (address@hidden) at qom/object.c:750 -> -#1 0x00007f9a72582e01 in virtio_vmstate_change (opaque=0x7f9a73d10960, -> -running=0, state=<optimized out>) at -> -/mnt/sdb/lzc/code/open/qemu/hw/virtio/virtio.c:2203 -> -#2 0x00007f9a7261ef52 in vm_state_notify (address@hidden, address@hidden) at -> -vl.c:1685 -> -#3 0x00007f9a7252603a in do_vm_stop (state=RUN_STATE_PAUSED) at -> -/mnt/sdb/lzc/code/open/qemu/cpus.c:941 -> -#4 vm_stop (address@hidden) at /mnt/sdb/lzc/code/open/qemu/cpus.c:1807 -> -#5 0x00007f9a7262eb1b in qmp_stop (address@hidden) at qmp.c:102 -> -#6 0x00007f9a7262c70a in qmp_marshal_stop (args=<optimized out>, -> -ret=<optimized out>, errp=0x7ffe63e255d8) at qmp-marshal.c:5854 -> -#7 0x00007f9a72897e79 in do_qmp_dispatch (errp=0x7ffe63e255d0, -> -request=0x7f9a76510120, cmds=0x7f9a72ee7980 <qmp_commands>) at -> -qapi/qmp-dispatch.c:104 -> -#8 qmp_dispatch (cmds=0x7f9a72ee7980 <qmp_commands>, address@hidden) at -> -qapi/qmp-dispatch.c:131 -> -#9 0x00007f9a725288d5 in handle_qmp_command (parser=<optimized out>, -> -tokens=<optimized out>) at /mnt/sdb/lzc/code/open/qemu/monitor.c:3852 -> -#10 0x00007f9a7289d514 in json_message_process_token (lexer=0x7f9a73ce4498, -> -input=0x7f9a73cc6880, type=JSON_RCURLY, x=36, y=17) at -> -qobject/json-streamer.c:105 -> -#11 0x00007f9a728bb69b in json_lexer_feed_char (address@hidden, ch=125 '}', -> -address@hidden) at qobject/json-lexer.c:323 -> -#12 0x00007f9a728bb75e in json_lexer_feed (lexer=0x7f9a73ce4498, -> -buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:373 -> -#13 0x00007f9a7289d5d9 in json_message_parser_feed (parser=<optimized out>, -> -buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124 -> -#14 0x00007f9a7252722e in monitor_qmp_read (opaque=<optimized out>, -> -buf=<optimized out>, size=<optimized out>) at -> -/mnt/sdb/lzc/code/open/qemu/monitor.c:3894 
-> -#15 0x00007f9a7284ee1b in tcp_chr_read (chan=<optimized out>, cond=<optimized -> -out>, opaque=<optimized out>) at chardev/char-socket.c:441 -> -#16 0x00007f9a6e03e99a in g_main_context_dispatch () from -> -/usr/lib64/libglib-2.0.so.0 -> -#17 0x00007f9a728a342c in glib_pollfds_poll () at util/main-loop.c:214 -> -#18 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 -> -#19 main_loop_wait (address@hidden) at util/main-loop.c:515 -> -#20 0x00007f9a724e7547 in main_loop () at vl.c:1999 -> -#21 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) -> -at vl.c:4877 -> -> -Problem happens in virtio_vmstate_change which is called by vm_state_notify, -> -static void virtio_vmstate_change(void *opaque, int running, RunState state) -> -{ -> -VirtIODevice *vdev = opaque; -> -BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); -> -VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); -> -bool backend_run = running && (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK); -> -vdev->vm_running = running; -> -> -if (backend_run) { -> -virtio_set_status(vdev, vdev->status); -> -} -> -> -if (k->vmstate_change) { -> -k->vmstate_change(qbus->parent, backend_run); -> -} -> -> -if (!backend_run) { -> -virtio_set_status(vdev, vdev->status); -> -} -> -} -> -> -Vdev's parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash. -> -virtio_vmstate_change is added to the list vm_change_state_head at -> -virtio_blk_device_realize(virtio_init), -> -but after hotplug virtio-blk failed, virtio_vmstate_change will not be -> -removed from vm_change_state_head. 
->
->
->
-I apply a patch as follows:
->
->
-diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
->
-index 5884ce3..ea532dc 100644
->
---- a/hw/virtio/virtio.c
->
-+++ b/hw/virtio/virtio.c
->
-@@ -2491,6 +2491,7 @@ static void virtio_device_realize(DeviceState *dev,
->
-Error **errp)
->
-virtio_bus_device_plugged(vdev, &err);
->
-if (err != NULL) {
->
-error_propagate(errp, err);
->
-+ vdc->unrealize(dev, NULL);
->
-return;
->
-}
-signature.asc
-Description:
-PGP signature
-
diff --git a/results/classifier/02/other/57195159 b/results/classifier/02/other/57195159
deleted file mode 100644
index e1d853240..000000000
--- a/results/classifier/02/other/57195159
+++ /dev/null
@@ -1,316 +0,0 @@
-other: 0.868
-instruction: 0.833
-semantic: 0.794
-boot: 0.781
-mistranslation: 0.665
-
-[BUG Report] Got a use-after-free error while starting an arm64 VM with lots of pci controllers
-
-Hi,
-
-We got a use-after-free report in our Euler Robot Test; it can be reproduced
-quite easily by starting a VM with lots of PCI controllers and virtio-scsi
-devices.
-You can find the full qemu log in the attachment.
-We have analyzed the log and worked out the rough process of how it happened, but don't
-know how to fix it.
-
-Could anyone help to fix it ?
-
-The key message is shown below:
-char device redirected to /dev/pts/1 (label charserial0)
-==1517174==WARNING: ASan doesn't fully support makecontext/swapcontext
-functions and may produce false positives in some cases!
-================================================================= -==1517174==ERROR: AddressSanitizer: heap-use-after-free on address -0xfffc31a002a0 at pc 0xaaad73e1f668 bp 0xfffc319fddb0 sp 0xfffc319fddd0 -READ of size 8 at 0xfffc31a002a0 thread T1 - #0 0xaaad73e1f667 in memory_region_unref /home/qemu/memory.c:1771 - #1 0xaaad73e1f667 in flatview_destroy /home/qemu/memory.c:291 - #2 0xaaad74adc85b in call_rcu_thread util/rcu.c:283 - #3 0xaaad74ab31db in qemu_thread_start util/qemu-thread-posix.c:519 - #4 0xfffc3a1678bb (/lib64/libpthread.so.0+0x78bb) - #5 0xfffc3a0a616b (/lib64/libc.so.6+0xd616b) - -0xfffc31a002a0 is located 544 bytes inside of 1440-byte region -[0xfffc31a00080,0xfffc31a00620) -freed by thread T37 (CPU 0/KVM) here: - #0 0xfffc3c102e23 in free (/lib64/libasan.so.4+0xd2e23) - #1 0xfffc3bbc729f in g_free (/lib64/libglib-2.0.so.0+0x5729f) - #2 0xaaad745cce03 in pci_bridge_update_mappings hw/pci/pci_bridge.c:245 - #3 0xaaad745ccf33 in pci_bridge_write_config hw/pci/pci_bridge.c:271 - #4 0xaaad745ba867 in pci_bridge_dev_write_config -hw/pci-bridge/pci_bridge_dev.c:153 - #5 0xaaad745d6013 in pci_host_config_write_common hw/pci/pci_host.c:81 - #6 0xaaad73e2346f in memory_region_write_accessor /home/qemu/memory.c:483 - #7 0xaaad73e1d9ff in access_with_adjusted_size /home/qemu/memory.c:544 - #8 0xaaad73e28d1f in memory_region_dispatch_write /home/qemu/memory.c:1482 - #9 0xaaad73d7274f in flatview_write_continue /home/qemu/exec.c:3167 - #10 0xaaad73d72a53 in flatview_write /home/qemu/exec.c:3207 - #11 0xaaad73d7c8c3 in address_space_write /home/qemu/exec.c:3297 - #12 0xaaad73e5059b in kvm_cpu_exec /home/qemu/accel/kvm/kvm-all.c:2386 - #13 0xaaad73e07ac7 in qemu_kvm_cpu_thread_fn /home/qemu/cpus.c:1246 - #14 0xaaad74ab31db in qemu_thread_start util/qemu-thread-posix.c:519 - #15 0xfffc3a1678bb (/lib64/libpthread.so.0+0x78bb) - #16 0xfffc3a0a616b (/lib64/libc.so.6+0xd616b) - -previously allocated by thread T0 here: - #0 0xfffc3c1031cb in __interceptor_malloc 
(/lib64/libasan.so.4+0xd31cb) - #1 0xfffc3bbc7163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) - #2 0xaaad745ccb57 in pci_bridge_region_init hw/pci/pci_bridge.c:188 - #3 0xaaad745cd8cb in pci_bridge_initfn hw/pci/pci_bridge.c:385 - #4 0xaaad745baaf3 in pci_bridge_dev_realize -hw/pci-bridge/pci_bridge_dev.c:64 - #5 0xaaad745cacd7 in pci_qdev_realize hw/pci/pci.c:2095 - #6 0xaaad7439d9f7 in device_set_realized hw/core/qdev.c:865 - #7 0xaaad7485ed23 in property_set_bool qom/object.c:2102 - #8 0xaaad74868f4b in object_property_set_qobject qom/qom-qobject.c:26 - #9 0xaaad74863a43 in object_property_set_bool qom/object.c:1360 - #10 0xaaad742a53b7 in qdev_device_add /home/qemu/qdev-monitor.c:675 - #11 0xaaad742a9c7b in device_init_func /home/qemu/vl.c:2074 - #12 0xaaad74ad4d33 in qemu_opts_foreach util/qemu-option.c:1170 - #13 0xaaad73d60c17 in main /home/qemu/vl.c:4313 - #14 0xfffc39ff0b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) - #15 0xaaad73d6db33 -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) - -Thread T1 created by T0 here: - #0 0xfffc3c068f6f in __interceptor_pthread_create -(/lib64/libasan.so.4+0x38f6f) - #1 0xaaad74ab54ab in qemu_thread_create util/qemu-thread-posix.c:556 - #2 0xaaad74adc6a7 in rcu_init_complete util/rcu.c:326 - #3 0xaaad74bab2a7 in __libc_csu_init -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x17cb2a7) - #4 0xfffc39ff0b47 in __libc_start_main (/lib64/libc.so.6+0x20b47) - #5 0xaaad73d6db33 (/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) - -Thread T37 (CPU 0/KVM) created by T0 here: - #0 0xfffc3c068f6f in __interceptor_pthread_create -(/lib64/libasan.so.4+0x38f6f) - #1 0xaaad74ab54ab in qemu_thread_create util/qemu-thread-posix.c:556 - #2 0xaaad73e09b0f in qemu_dummy_start_vcpu /home/qemu/cpus.c:2045 - #3 0xaaad73e09b0f in qemu_init_vcpu /home/qemu/cpus.c:2077 - #4 0xaaad740d36b7 in arm_cpu_realizefn /home/qemu/target/arm/cpu.c:1712 - #5 0xaaad7439d9f7 in device_set_realized hw/core/qdev.c:865 - #6 0xaaad7485ed23 
in property_set_bool qom/object.c:2102 - #7 0xaaad74868f4b in object_property_set_qobject qom/qom-qobject.c:26 - #8 0xaaad74863a43 in object_property_set_bool qom/object.c:1360 - #9 0xaaad73fe3e67 in machvirt_init /home/qemu/hw/arm/virt.c:1682 - #10 0xaaad743acfc7 in machine_run_board_init hw/core/machine.c:1077 - #11 0xaaad73d60b73 in main /home/qemu/vl.c:4292 - #12 0xfffc39ff0b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) - #13 0xaaad73d6db33 -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) - -SUMMARY: AddressSanitizer: heap-use-after-free /home/qemu/memory.c:1771 in -memory_region_unref - -Thanks -use-after-free-qemu.log -Description: -Text document - -Cc: address@hidden - -On 1/17/2020 4:18 PM, Pan Nengyuan wrote: -> -Hi, -> -> -We got a use-after-free report in our Euler Robot Test, it is can be -> -reproduced quite easily, -> -It can be reproduced by start VM with lots of pci controller and virtio-scsi -> -devices. -> -You can find the full qemu log from attachment. -> -We have analyzed the log and got the rough process how it happened, but don't -> -know how to fix it. -> -> -Could anyone help to fix it ? -> -> -The key message shows bellow: -> -har device redirected to /dev/pts/1 (label charserial0) -> -==1517174==WARNING: ASan doesn't fully support makecontext/swapcontext -> -functions and may produce false positives in some cases! 
-> -================================================================= -> -==1517174==ERROR: AddressSanitizer: heap-use-after-free on address -> -0xfffc31a002a0 at pc 0xaaad73e1f668 bp 0xfffc319fddb0 sp 0xfffc319fddd0 -> -READ of size 8 at 0xfffc31a002a0 thread T1 -> -#0 0xaaad73e1f667 in memory_region_unref /home/qemu/memory.c:1771 -> -#1 0xaaad73e1f667 in flatview_destroy /home/qemu/memory.c:291 -> -#2 0xaaad74adc85b in call_rcu_thread util/rcu.c:283 -> -#3 0xaaad74ab31db in qemu_thread_start util/qemu-thread-posix.c:519 -> -#4 0xfffc3a1678bb (/lib64/libpthread.so.0+0x78bb) -> -#5 0xfffc3a0a616b (/lib64/libc.so.6+0xd616b) -> -> -0xfffc31a002a0 is located 544 bytes inside of 1440-byte region -> -[0xfffc31a00080,0xfffc31a00620) -> -freed by thread T37 (CPU 0/KVM) here: -> -#0 0xfffc3c102e23 in free (/lib64/libasan.so.4+0xd2e23) -> -#1 0xfffc3bbc729f in g_free (/lib64/libglib-2.0.so.0+0x5729f) -> -#2 0xaaad745cce03 in pci_bridge_update_mappings hw/pci/pci_bridge.c:245 -> -#3 0xaaad745ccf33 in pci_bridge_write_config hw/pci/pci_bridge.c:271 -> -#4 0xaaad745ba867 in pci_bridge_dev_write_config -> -hw/pci-bridge/pci_bridge_dev.c:153 -> -#5 0xaaad745d6013 in pci_host_config_write_common hw/pci/pci_host.c:81 -> -#6 0xaaad73e2346f in memory_region_write_accessor /home/qemu/memory.c:483 -> -#7 0xaaad73e1d9ff in access_with_adjusted_size /home/qemu/memory.c:544 -> -#8 0xaaad73e28d1f in memory_region_dispatch_write /home/qemu/memory.c:1482 -> -#9 0xaaad73d7274f in flatview_write_continue /home/qemu/exec.c:3167 -> -#10 0xaaad73d72a53 in flatview_write /home/qemu/exec.c:3207 -> -#11 0xaaad73d7c8c3 in address_space_write /home/qemu/exec.c:3297 -> -#12 0xaaad73e5059b in kvm_cpu_exec /home/qemu/accel/kvm/kvm-all.c:2386 -> -#13 0xaaad73e07ac7 in qemu_kvm_cpu_thread_fn /home/qemu/cpus.c:1246 -> -#14 0xaaad74ab31db in qemu_thread_start util/qemu-thread-posix.c:519 -> -#15 0xfffc3a1678bb (/lib64/libpthread.so.0+0x78bb) -> -#16 0xfffc3a0a616b (/lib64/libc.so.6+0xd616b) -> -> 
-previously allocated by thread T0 here: -> -#0 0xfffc3c1031cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb) -> -#1 0xfffc3bbc7163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) -> -#2 0xaaad745ccb57 in pci_bridge_region_init hw/pci/pci_bridge.c:188 -> -#3 0xaaad745cd8cb in pci_bridge_initfn hw/pci/pci_bridge.c:385 -> -#4 0xaaad745baaf3 in pci_bridge_dev_realize -> -hw/pci-bridge/pci_bridge_dev.c:64 -> -#5 0xaaad745cacd7 in pci_qdev_realize hw/pci/pci.c:2095 -> -#6 0xaaad7439d9f7 in device_set_realized hw/core/qdev.c:865 -> -#7 0xaaad7485ed23 in property_set_bool qom/object.c:2102 -> -#8 0xaaad74868f4b in object_property_set_qobject qom/qom-qobject.c:26 -> -#9 0xaaad74863a43 in object_property_set_bool qom/object.c:1360 -> -#10 0xaaad742a53b7 in qdev_device_add /home/qemu/qdev-monitor.c:675 -> -#11 0xaaad742a9c7b in device_init_func /home/qemu/vl.c:2074 -> -#12 0xaaad74ad4d33 in qemu_opts_foreach util/qemu-option.c:1170 -> -#13 0xaaad73d60c17 in main /home/qemu/vl.c:4313 -> -#14 0xfffc39ff0b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) -> -#15 0xaaad73d6db33 -> -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) -> -> -Thread T1 created by T0 here: -> -#0 0xfffc3c068f6f in __interceptor_pthread_create -> -(/lib64/libasan.so.4+0x38f6f) -> -#1 0xaaad74ab54ab in qemu_thread_create util/qemu-thread-posix.c:556 -> -#2 0xaaad74adc6a7 in rcu_init_complete util/rcu.c:326 -> -#3 0xaaad74bab2a7 in __libc_csu_init -> -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x17cb2a7) -> -#4 0xfffc39ff0b47 in __libc_start_main (/lib64/libc.so.6+0x20b47) -> -#5 0xaaad73d6db33 -> -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) -> -> -Thread T37 (CPU 0/KVM) created by T0 here: -> -#0 0xfffc3c068f6f in __interceptor_pthread_create -> -(/lib64/libasan.so.4+0x38f6f) -> -#1 0xaaad74ab54ab in qemu_thread_create util/qemu-thread-posix.c:556 -> -#2 0xaaad73e09b0f in qemu_dummy_start_vcpu /home/qemu/cpus.c:2045 -> -#3 0xaaad73e09b0f in qemu_init_vcpu 
/home/qemu/cpus.c:2077 -> -#4 0xaaad740d36b7 in arm_cpu_realizefn /home/qemu/target/arm/cpu.c:1712 -> -#5 0xaaad7439d9f7 in device_set_realized hw/core/qdev.c:865 -> -#6 0xaaad7485ed23 in property_set_bool qom/object.c:2102 -> -#7 0xaaad74868f4b in object_property_set_qobject qom/qom-qobject.c:26 -> -#8 0xaaad74863a43 in object_property_set_bool qom/object.c:1360 -> -#9 0xaaad73fe3e67 in machvirt_init /home/qemu/hw/arm/virt.c:1682 -> -#10 0xaaad743acfc7 in machine_run_board_init hw/core/machine.c:1077 -> -#11 0xaaad73d60b73 in main /home/qemu/vl.c:4292 -> -#12 0xfffc39ff0b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) -> -#13 0xaaad73d6db33 -> -(/home/qemu/aarch64-softmmu/qemu-system-aarch64+0x98db33) -> -> -SUMMARY: AddressSanitizer: heap-use-after-free /home/qemu/memory.c:1771 in -> -memory_region_unref -> -> -Thanks -> -use-after-free-qemu.log -Description: -Text document - diff --git a/results/classifier/02/other/57231878 b/results/classifier/02/other/57231878 deleted file mode 100644 index 095b63fca..000000000 --- a/results/classifier/02/other/57231878 +++ /dev/null @@ -1,243 +0,0 @@ -other: 0.788 -semantic: 0.774 -mistranslation: 0.719 -instruction: 0.661 -boot: 0.609 - -[Qemu-devel] [BUG] qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed. - -Hello all, -I wanted to submit a bug report in the tracker, but it seem to require -an Ubuntu One account, which I'm having trouble with, so I'll just -give it here and hopefully somebody can make use of it. The issue -seems to be in an experimental format, so it's likely not very -consequential anyway. - -For the sake of anyone else simply googling for a workaround, I'll -just paste in the (cleaned up) brief IRC conversation about my issue -from the official channel: -<quy> I'm using QEMU version 2.12.0 on an x86_64 host (Arch Linux, -Kernel v4.17.2), and I'm trying to create an x86_64 virtual machine -(FreeBSD-11.1). 
The VM always aborts at the same point in the -installation (downloading 'ports.tgz') with the following error -message: -"qemu-system-x86_64: /build/qemu/src/qemu-2.12.0/block/qed.c:1197: -qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed. -zsh: abort (core dumped) qemu-system-x86_64 -smp 2 -m 4096 --enable-kvm -hda freebsd/freebsd.qed -devic" -The commands I ran to create the machine are as follows: -"qemu-img create -f qed freebsd/freebsd.qed 16G" -"qemu-system-x86_64 -smp 2 -m 4096 -enable-kvm -hda -freebsd/freebsd.qed -device e1000,netdev=net0 -netdev user,id=net0 --cdrom FreeBSD-11.1-RELEASE-amd64-bootonly.iso -boot order=d" -I tried adding logging options with the -d flag, but I didn't get -anything that seemed relevant, since I'm not sure what to look for. -<stsquad> ohh what's a qed device? -<stsquad> quy: it might be a workaround to use a qcow2 image for now -<stsquad> ahh the wiki has a statement "It is not recommended to use -QED for any new images. " -<danpb> 'qed' was an experimental disk image format created by IBM -before qcow2 v3 came along -<danpb> honestly nothing should ever use QED these days -<danpb> the good ideas from QED became qcow2v3 -<stsquad> danpb: sounds like we should put a warning on the option to -remind users of that fact -<danpb> quy: sounds like qed driver is simply broken - please do file -a bug against qemu bug tracker -<danpb> quy: but you should also really switch to qcow2 -<quy> I see; some people need to update their wikis then. I don't -remember where which guide I read when I first learned what little -QEMU I know, but I remember it specifically remember it saying QED was -the newest and most optimal format. -<stsquad> quy: we can only be responsible for our own wiki I'm afraid... -<danpb> if you remember where you saw that please let us know so we -can try to get it fixed -<quy> Thank you very much for the info; I will switch to QCOW. 
-Unfortunately, I'm not sure if I will be able to file any bug reports -in the tracker as I can't seem to log Launchpad, which it seems to -require. -<danpb> quy: an email to the mailing list would suffice too if you -can't deal with launchpad -<danpb> kwolf: ^^^ in case you're interested in possible QED -assertions from 2.12 - -If any more info is needed, feel free to email me; I'm not actually -subscribed to this list though. -Thank you, -Quytelda Kahja - -CC Qemu Block; looks like QED is a bit busted. - -On 06/27/2018 10:25 AM, Quytelda Kahja wrote: -> -Hello all, -> -I wanted to submit a bug report in the tracker, but it seem to require -> -an Ubuntu One account, which I'm having trouble with, so I'll just -> -give it here and hopefully somebody can make use of it. The issue -> -seems to be in an experimental format, so it's likely not very -> -consequential anyway. -> -> -For the sake of anyone else simply googling for a workaround, I'll -> -just paste in the (cleaned up) brief IRC conversation about my issue -> -from the official channel: -> -<quy> I'm using QEMU version 2.12.0 on an x86_64 host (Arch Linux, -> -Kernel v4.17.2), and I'm trying to create an x86_64 virtual machine -> -(FreeBSD-11.1). The VM always aborts at the same point in the -> -installation (downloading 'ports.tgz') with the following error -> -message: -> -"qemu-system-x86_64: /build/qemu/src/qemu-2.12.0/block/qed.c:1197: -> -qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed. 
-> -zsh: abort (core dumped) qemu-system-x86_64 -smp 2 -m 4096 -> --enable-kvm -hda freebsd/freebsd.qed -devic" -> -The commands I ran to create the machine are as follows: -> -"qemu-img create -f qed freebsd/freebsd.qed 16G" -> -"qemu-system-x86_64 -smp 2 -m 4096 -enable-kvm -hda -> -freebsd/freebsd.qed -device e1000,netdev=net0 -netdev user,id=net0 -> --cdrom FreeBSD-11.1-RELEASE-amd64-bootonly.iso -boot order=d" -> -I tried adding logging options with the -d flag, but I didn't get -> -anything that seemed relevant, since I'm not sure what to look for. -> -<stsquad> ohh what's a qed device? -> -<stsquad> quy: it might be a workaround to use a qcow2 image for now -> -<stsquad> ahh the wiki has a statement "It is not recommended to use -> -QED for any new images. " -> -<danpb> 'qed' was an experimental disk image format created by IBM -> -before qcow2 v3 came along -> -<danpb> honestly nothing should ever use QED these days -> -<danpb> the good ideas from QED became qcow2v3 -> -<stsquad> danpb: sounds like we should put a warning on the option to -> -remind users of that fact -> -<danpb> quy: sounds like qed driver is simply broken - please do file -> -a bug against qemu bug tracker -> -<danpb> quy: but you should also really switch to qcow2 -> -<quy> I see; some people need to update their wikis then. I don't -> -remember where which guide I read when I first learned what little -> -QEMU I know, but I remember it specifically remember it saying QED was -> -the newest and most optimal format. -> -<stsquad> quy: we can only be responsible for our own wiki I'm afraid... -> -<danpb> if you remember where you saw that please let us know so we -> -can try to get it fixed -> -<quy> Thank you very much for the info; I will switch to QCOW. -> -Unfortunately, I'm not sure if I will be able to file any bug reports -> -in the tracker as I can't seem to log Launchpad, which it seems to -> -require. 
-> -<danpb> quy: an email to the mailing list would suffice too if you -> -can't deal with launchpad -> -<danpb> kwolf: ^^^ in case you're interested in possible QED -> -assertions from 2.12 -> -> -If any more info is needed, feel free to email me; I'm not actually -> -subscribed to this list though. -> -Thank you, -> -Quytelda Kahja -> - -On 06/29/2018 03:07 PM, John Snow wrote: -CC Qemu Block; looks like QED is a bit busted. - -On 06/27/2018 10:25 AM, Quytelda Kahja wrote: -Hello all, -I wanted to submit a bug report in the tracker, but it seem to require -an Ubuntu One account, which I'm having trouble with, so I'll just -give it here and hopefully somebody can make use of it. The issue -seems to be in an experimental format, so it's likely not very -consequential anyway. -Analysis in another thread may be relevant: -https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08963.html --- -Eric Blake, Principal Software Engineer -Red Hat, Inc. +1-919-301-3266 -Virtualization: qemu.org | libvirt.org - -Am 29.06.2018 um 22:16 hat Eric Blake geschrieben: -> -On 06/29/2018 03:07 PM, John Snow wrote: -> -> CC Qemu Block; looks like QED is a bit busted. -> -> -> -> On 06/27/2018 10:25 AM, Quytelda Kahja wrote: -> -> > Hello all, -> -> > I wanted to submit a bug report in the tracker, but it seem to require -> -> > an Ubuntu One account, which I'm having trouble with, so I'll just -> -> > give it here and hopefully somebody can make use of it. The issue -> -> > seems to be in an experimental format, so it's likely not very -> -> > consequential anyway. -> -> -Analysis in another thread may be relevant: -> -> -https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08963.html -The assertion there was: - -qemu-system-x86_64: block.c:3434: bdrv_replace_node: Assertion -`!atomic_read(&to->in_flight)' failed. - -Which quite clearly pointed to a drain bug. This one, however, doesn't -seem to be related to drain, so I think it's probably a different bug. 
-
-Kevin
-
diff --git a/results/classifier/02/other/57756589 b/results/classifier/02/other/57756589
deleted file mode 100644
index 05e47fd45..000000000
--- a/results/classifier/02/other/57756589
+++ /dev/null
@@ -1,1422 +0,0 @@
-other: 0.899
-mistranslation: 0.861
-instruction: 0.854
-semantic: 0.835
-boot: 0.827
-
-[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-Almost like the wiki setup, but it panics on the Primary Node.
-
-Steps:
-
-1 Primary Node.
-x86_64-softmmu/qemu-system-x86_64 -enable-kvm -boot c -m 2048 -smp 2 -qmp stdio
--vnc :7 -name primary -cpu qemu64,+kvmclock -device piix3-usb-uhci -usb
--usbdevice tablet \
--drive if=virtio,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=qcow2 -S \
--netdev tap,id=hn1,vhost=off,script=/etc/qemu-ifup2,downscript=/etc/qemu-ifdown2 \
--device e1000,id=e1,netdev=hn1,mac=52:a4:00:12:78:67 \
--netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
--device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 \
--chardev socket,id=mirror0,host=9.61.1.8,port=9003,server,nowait \
--chardev socket,id=compare1,host=9.61.1.8,port=9004,server,nowait \
--chardev socket,id=compare0,host=9.61.1.8,port=9001,server,nowait \
--chardev socket,id=compare0-0,host=9.61.1.8,port=9001 \
--chardev socket,id=compare_out,host=9.61.1.8,port=9005,server,nowait \
--chardev socket,id=compare_out0,host=9.61.1.8,port=9005 \
--object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
--object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
--object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
--object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
-
-2 Second node:
-x86_64-softmmu/qemu-system-x86_64 -boot c -m 2048 -smp 2 -qmp stdio -vnc :7
--name secondary -enable-kvm -cpu
qemu64,+kvmclock -device piix3-usb-uhci -usb
--usbdevice tablet \
--drive if=none,id=colo-disk0,file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,driver=qcow2,node-name=node0 \
--drive if=virtio,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,top-id=active-disk0,file.file.filename=/mnt/ramfstest/active_disk.img,file.backing.driver=qcow2,file.backing.file.filename=/mnt/ramfstest/hidden_disk.img,file.backing.backing=colo-disk0 \
--netdev tap,id=hn1,vhost=off,script=/etc/qemu-ifup2,downscript=/etc/qemu-ifdown2 \
--device e1000,id=e1,netdev=hn1,mac=52:a4:00:12:78:67 \
--netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
--device e1000,netdev=hn0,mac=52:a4:00:12:78:66 \
--chardev socket,id=red0,host=9.61.1.8,port=9003 \
--chardev socket,id=red1,host=9.61.1.8,port=9004 \
--object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
--object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
--object filter-rewriter,id=rew0,netdev=hn0,queue=all -incoming tcp:0:8888
-
-3 Secondary node:
-
-{'execute':'qmp_capabilities'}
-
-{ 'execute': 'nbd-server-start',
-  'arguments': {'addr': {'type': 'inet', 'data': {'host': '9.61.1.7', 'port': '8889'} } }
-}
-
-{'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk0', 'writable': true } }
-
-4: Primary Node:
-
-{'execute':'qmp_capabilities'}
-
-{ 'execute': 'human-monitor-command',
-  'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=9.61.1.7,file.port=8889,file.export=colo-disk0,node-name=node0'}}
-
-{ 'execute':'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 'node0' } }
-
-{ 'execute': 'migrate-set-capabilities',
-  'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
-
-{ 'execute': 'migrate', 'arguments': {'uri': 'tcp:9.61.1.7:8888' } }
-
-then you can see two running VMs; whenever you make changes
to PVM, SVM will be
-synced.
-
-5: Primary Node:
-
-echo c > /proc/sysrq-trigger
-
-6: Secondary node:
-
-{ 'execute': 'nbd-server-stop' }
-
-{ "execute": "x-colo-lost-heartbeat" }
-
-then you can see the Secondary node hang at recvmsg.
-
-
-Original mail
-
-From: address@hidden
-To: Wang Guang 10165992 address@hidden
-Cc: address@hidden address@hidden
-Date: 2017-03-21 16:27
-Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-Hi,
-
-On 2017/3/21 16:10, address@hidden wrote:
-> Thank you.
->
-> I have tested already.
->
-> When the Primary Node panics, the Secondary Node qemu hangs at the same place.
->
-> According to http://wiki.qemu-project.org/Features/COLO, killing the Primary Node qemu
-> will not produce the problem, but a Primary Node panic does.
->
-> I think this is due to the channel not supporting
-> QIO_CHANNEL_FEATURE_SHUTDOWN.
->
-
-Yes, you are right, when we do failover for primary/secondary VM, we will shutdown the related
-fd in case it is stuck in the read/write fd.
-
-It seems that you didn't follow the above introduction exactly to do the test. Could you
-share your test procedures ? Especially the commands used in the test.
-
-Thanks,
-Hailiang
-
-> when failover, channel_shutdown could not shut down the channel.
->
-> so the colo_process_incoming_thread will hang at recvmsg.
->
-> I test a patch:
->
-> diff --git a/migration/socket.c b/migration/socket.c
-> index 13966f1..d65a0ea 100644
-> --- a/migration/socket.c
-> +++ b/migration/socket.c
-> @@ -147,8 +147,9 @@ static gboolean
-> socket_accept_incoming_migration(QIOChannel *ioc,
->     }
->
->     trace_migration_socket_incoming_accepted();
->
->     qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
-> +   qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
->     migration_channel_process_incoming(migrate_get_current(),
->         QIO_CHANNEL(sioc));
->     object_unref(OBJECT(sioc));
->
-> My test will not hang any more.
->
->
-> Original mail
->
-> From: address@hidden
-> To: Wang Guang 10165992 address@hidden
-> Cc: address@hidden address@hidden
-> Date: 2017-03-21 15:58
-> Subject: Re: [Qemu-devel] Reply: Re: [BUG]COLO failover hang
->
-> Hi, Wang.
->
-> You can test this branch:
-> https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
->
-> and please follow the wiki to make sure your own configuration is correct:
-> http://wiki.qemu-project.org/Features/COLO
->
-> Thanks
->
-> Zhang Chen
->
-> On 03/21/2017 03:27 PM, address@hidden wrote:
-> > hi.
-> > I tested the git qemu master and hit the same problem.
-ï¼ ï¼ -ï¼ ï¼ (gdb) bt -ï¼ ï¼ -ï¼ ï¼ #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, -ï¼ ï¼ niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461 -ï¼ ï¼ -ï¼ ï¼ #1 0x00007f658e4aa0c2 in qio_channel_read -ï¼ ï¼ (address@hidden, address@hidden "", -ï¼ ï¼ address@hidden, address@hidden) at io/channel.c:114 -ï¼ ï¼ -ï¼ ï¼ #2 0x00007f658e3ea990 in channel_get_buffer (opaque=ï¼optimized outï¼, -ï¼ ï¼ buf=0x7f65907cb838 "", pos=ï¼optimized outï¼, size=32768) at -ï¼ ï¼ migration/qemu-file-channel.c:78 -ï¼ ï¼ -ï¼ ï¼ #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at -ï¼ ï¼ migration/qemu-file.c:295 -ï¼ ï¼ -ï¼ ï¼ #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, -ï¼ ï¼ address@hidden) at migration/qemu-file.c:555 -ï¼ ï¼ -ï¼ ï¼ #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at -ï¼ ï¼ migration/qemu-file.c:568 -ï¼ ï¼ -ï¼ ï¼ #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at -ï¼ ï¼ migration/qemu-file.c:648 -ï¼ ï¼ -ï¼ ï¼ #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, -ï¼ ï¼ address@hidden) at migration/colo.c:244 -ï¼ ï¼ -ï¼ ï¼ #8 0x00007f658e3e681e in colo_receive_check_message (f=ï¼optimized -ï¼ ï¼ outï¼, address@hidden, -ï¼ ï¼ address@hidden) -ï¼ ï¼ -ï¼ ï¼ at migration/colo.c:264 -ï¼ ï¼ -ï¼ ï¼ #9 0x00007f658e3e740e in colo_process_incoming_thread -ï¼ ï¼ (opaque=0x7f658eb30360 ï¼mis_current.31286ï¼) at migration/colo.c:577 -ï¼ ï¼ -ï¼ ï¼ #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0 -ï¼ ï¼ -ï¼ ï¼ #11 0x00007f65881983ed in clone () from /lib64/libc.so.6 -ï¼ ï¼ -ï¼ ï¼ (gdb) p ioc-ï¼name -ï¼ ï¼ -ï¼ ï¼ $2 = 0x7f658ff7d5c0 "migration-socket-incoming" -ï¼ ï¼ -ï¼ ï¼ (gdb) p ioc-ï¼features Do not support QIO_CHANNEL_FEATURE_SHUTDOWN -ï¼ ï¼ -ï¼ ï¼ $3 = 0 -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ (gdb) bt -ï¼ ï¼ -ï¼ ï¼ #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90, -ï¼ ï¼ condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137 -ï¼ ï¼ -ï¼ ï¼ #1 0x00007fdcc6966350 in g_main_dispatch 
(context=ï¼optimized outï¼) at -ï¼ ï¼ gmain.c:3054 -ï¼ ï¼ -ï¼ ï¼ #2 g_main_context_dispatch (context=ï¼optimized outï¼, -ï¼ ï¼ address@hidden) at gmain.c:3630 -ï¼ ï¼ -ï¼ ï¼ #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213 -ï¼ ï¼ -ï¼ ï¼ #4 os_host_main_loop_wait (timeout=ï¼optimized outï¼) at -ï¼ ï¼ util/main-loop.c:258 -ï¼ ï¼ -ï¼ ï¼ #5 main_loop_wait (address@hidden) at -ï¼ ï¼ util/main-loop.c:506 -ï¼ ï¼ -ï¼ ï¼ #6 0x00007fdccb526187 in main_loop () at vl.c:1898 -ï¼ ï¼ -ï¼ ï¼ #7 main (argc=ï¼optimized outï¼, argv=ï¼optimized outï¼, envp=ï¼optimized -ï¼ ï¼ outï¼) at vl.c:4709 -ï¼ ï¼ -ï¼ ï¼ (gdb) p ioc-ï¼features -ï¼ ï¼ -ï¼ ï¼ $1 = 6 -ï¼ ï¼ -ï¼ ï¼ (gdb) p ioc-ï¼name -ï¼ ï¼ -ï¼ ï¼ $2 = 0x7fdcce1b1ab0 "migration-socket-listener" -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ May be socket_accept_incoming_migration should -ï¼ ï¼ call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?? -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ thank you. -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ åå§é®ä»¶ -ï¼ ï¼ address@hidden -ï¼ ï¼ address@hidden -ï¼ ï¼ address@hidden@huawei.comï¼ -ï¼ ï¼ *æ¥ æ ï¼*2017å¹´03æ16æ¥ 14:46 -ï¼ ï¼ *主 é¢ ï¼**Re: [Qemu-devel] COLO failover hang* -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ On 03/15/2017 05:06 PM, wangguang wrote: -ï¼ ï¼ ï¼ am testing QEMU COLO feature described here [QEMU -ï¼ ï¼ ï¼ Wiki]( -http://wiki.qemu-project.org/Features/COLO -). -ï¼ ï¼ ï¼ -ï¼ ï¼ ï¼ When the Primary Node panic,the Secondary Node qemu hang. -ï¼ ï¼ ï¼ hang at recvmsg in qio_channel_socket_readv. -ï¼ ï¼ ï¼ And I run { 'execute': 'nbd-server-stop' } and { "execute": -ï¼ ï¼ ï¼ "x-colo-lost-heartbeat" } in Secondary VM's -ï¼ ï¼ ï¼ monitor,the Secondary Node qemu still hang at recvmsg . -ï¼ ï¼ ï¼ -ï¼ ï¼ ï¼ I found that the colo in qemu is not complete yet. -ï¼ ï¼ ï¼ Do the colo have any plan for development? -ï¼ ï¼ -ï¼ ï¼ Yes, We are developing. You can see some of patch we pushing. -ï¼ ï¼ -ï¼ ï¼ ï¼ Has anyone ever run it successfully? Any help is appreciated! 
-ï¼ ï¼ -ï¼ ï¼ In our internal version can run it successfully, -ï¼ ï¼ The failover detail you can ask Zhanghailiang for help. -ï¼ ï¼ Next time if you have some question about COLO, -ï¼ ï¼ please cc me and zhanghailiang address@hidden -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ Thanks -ï¼ ï¼ Zhang Chen -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ ï¼ -ï¼ ï¼ ï¼ -ï¼ ï¼ ï¼ -ï¼ ï¼ ï¼ centos7.2+qemu2.7.50 -ï¼ ï¼ ï¼ (gdb) bt -ï¼ ï¼ ï¼ #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0 -ï¼ ï¼ ï¼ #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=ï¼optimized outï¼, -ï¼ ï¼ ï¼ iov=ï¼optimized outï¼, niov=ï¼optimized outï¼, fds=0x0, nfds=0x0, errp=0x0) at -ï¼ ï¼ ï¼ io/channel-socket.c:497 -ï¼ ï¼ ï¼ #2 0x00007f3e03329472 in qio_channel_read (address@hidden, -ï¼ ï¼ ï¼ address@hidden "", address@hidden, -ï¼ ï¼ ï¼ address@hidden) at io/channel.c:97 -ï¼ ï¼ ï¼ #3 0x00007f3e032750e0 in channel_get_buffer (opaque=ï¼optimized outï¼, -ï¼ ï¼ ï¼ buf=0x7f3e05910f38 "", pos=ï¼optimized outï¼, size=32768) at -ï¼ ï¼ ï¼ migration/qemu-file-channel.c:78 -ï¼ ï¼ ï¼ #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at -ï¼ ï¼ ï¼ migration/qemu-file.c:257 -ï¼ ï¼ ï¼ #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, -ï¼ ï¼ ï¼ address@hidden) at migration/qemu-file.c:510 -ï¼ ï¼ ï¼ #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at -ï¼ ï¼ ï¼ migration/qemu-file.c:523 -ï¼ ï¼ ï¼ #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at -ï¼ ï¼ ï¼ migration/qemu-file.c:603 -ï¼ ï¼ ï¼ #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, -ï¼ ï¼ ï¼ address@hidden) at migration/colo..c:215 -ï¼ ï¼ ï¼ #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48, -ï¼ ï¼ ï¼ checkpoint_request=ï¼synthetic pointerï¼, f=ï¼optimized outï¼) at -ï¼ ï¼ ï¼ migration/colo.c:546 -ï¼ ï¼ ï¼ #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at -ï¼ ï¼ ï¼ migration/colo.c:649 -ï¼ ï¼ ï¼ #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0 -ï¼ ï¼ ï¼ #12 0x00007f3dfc9c03ed in clone () from 
-/lib64/libc.so.6
-> > >
-> > > --
-> > > View this message in context: http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> > > Sent from the Developer mailing list archive at Nabble.com.
-> >
-> > --
-> > Thanks
-> > Zhang Chen
-
-Is this patch ok?
-
-I have tested it. The test does not hang any more.
-
-
-Original mail
-
-From: address@hidden
-To: address@hidden address@hidden
-Cc: address@hidden address@hidden address@hidden
-Date: 2017-03-22 09:11
-Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang
-
-On 2017/3/21 19:56, Dr. David Alan Gilbert wrote:
-> * Hailiang Zhang (address@hidden) wrote:
->> Hi,
->>
->> Thanks for reporting this; I confirmed it in my test, and it is a bug.
->>
->> Though we tried to call qemu_file_shutdown() to shut down the related fd,
->> in case the COLO thread/incoming thread is stuck in read/write() while
->> doing failover, it didn't take effect, because all the fds used by COLO
->> (also migration) have been wrapped by qio channels, and the shutdown API
->> will not be called if we didn't qio_channel_set_feature(QIO_CHANNEL(sioc),
->> QIO_CHANNEL_FEATURE_SHUTDOWN).
->>
->> Cc: Dr. David Alan Gilbert address@hidden
->>
->> I suspect migration cancel has the same problem: it may be stuck in
->> write() if we try to cancel migration.
->>
->> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp)
->> {
->>     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing");
->>     migration_channel_connect(s, ioc, NULL);
->>     ... ...
->> We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc),
->> QIO_CHANNEL_FEATURE_SHUTDOWN) above, and then in
->> migrate_fd_cancel()
->> {
->>     ... ...
->>     if (s->state == MIGRATION_STATUS_CANCELLING && f) {
->>         qemu_file_shutdown(f);   --> This will not take effect. No?
->>     }
->> }
->
-> (cc'd in Daniel Berrange).
-> I see that we call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)
-> at the top of qio_channel_socket_new, so I think that's safe, isn't it?
-
-Hmm, you are right, this problem only exists for the migration incoming fd,
-thanks.
-
-> Dave
->
-> --
-> Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-Hi,
-
-On 2017/3/22 9:42, address@hidden wrote:
-> Is this patch ok?
-
-Yes, I think this works, but a better way may be to call
-qio_channel_set_feature() in qio_channel_socket_accept(); we didn't set the
-SHUTDOWN feature for the socket accept fd. Or fix it like this:
-
-diff --git a/io/channel-socket.c b/io/channel-socket.c
-index f546c68..ce6894c 100644
---- a/io/channel-socket.c
-+++ b/io/channel-socket.c
-@@ -330,9 +330,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
-                          Error **errp)
- {
-     QIOChannelSocket *cioc;
-
--    cioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET));
--    cioc->fd = -1;
-+    cioc = qio_channel_socket_new();
-     cioc->remoteAddrLen = sizeof(ioc->remoteAddr);
-     cioc->localAddrLen = sizeof(ioc->localAddr);
-
-Thanks,
-Hailiang
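The hang discussed in this thread reduces to basic socket semantics: a thread blocked in recvmsg() is only reliably woken when shutdown() is called on the same fd, and the qio layer only issues that shutdown() for channels that advertise QIO_CHANNEL_FEATURE_SHUTDOWN. A minimal plain-socket sketch of the mechanism (illustrative Python, not QEMU code):

```python
import socket
import threading

results = []

def reader(sock):
    # Blocks in recv(), just as colo_process_incoming_thread blocks
    # in recvmsg() while waiting for the next COLO message.
    results.append(sock.recv(4096))

a, b = socket.socketpair()
t = threading.Thread(target=reader, args=(a,))
t.start()

# shutdown() forces the blocked recv() to return immediately with EOF
# (b''); without it the reader waits forever, which is the reported hang.
a.shutdown(socket.SHUT_RDWR)
t.join(timeout=5)
print(results)
```

With the shutdown() call the reader returns [b''] at once; without it the thread stays blocked, which is exactly why the accepted channel needs the SHUTDOWN feature flag set before qemu_file_shutdown() can help.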
-
diff --git a/results/classifier/02/other/59540920 b/results/classifier/02/other/59540920
deleted file mode 100644
index fb94e844d..000000000
--- a/results/classifier/02/other/59540920
+++ /dev/null
@@ -1,377 +0,0 @@
-other: 0.989
-instruction: 0.986
-semantic: 0.985
-boot: 0.980
-mistranslation: 0.978
-
-[BUG] No irqchip created after commit 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an accelerator property")
-
-I apologize if this was already reported,
-
-I just noticed that with the latest updates QEMU doesn't start with the
-following configuration:
-
-qemu-system-x86_64 -name guest=win10 -machine pc,accel=kvm -cpu host,hv_vpindex,hv_synic ...
-
-qemu-system-x86_64: failed to turn on HyperV SynIC in KVM: Invalid argument
-qemu-system-x86_64: kvm_init_vcpu failed: Invalid argument
-
-If I add 'kernel-irqchip=split' or ',kernel-irqchip=on' it starts as
-usual. I bisected this to the following commit:
-
-commit 11bc4a13d1f4b07dafbd1dda4d4bf0fdd7ad65f2 (HEAD, refs/bisect/bad)
-Author: Paolo Bonzini <address@hidden>
-Date: Wed Nov 13 10:56:53 2019 +0100
-
- kvm: convert "-machine kernel_irqchip" to an accelerator property
-
-so apparently we now default to 'kernel_irqchip=off'. Is this the desired
-behavior?
-
---
-Vitaly
-
-No, absolutely not. I was sure I had tested it, but I will take a look.
-Paolo
-On Fri, 20 Dec 2019 at 15:11, Vitaly Kuznetsov <
-address@hidden
-> wrote:
-I apologize if this was already reported,
-I just noticed that with the latest updates QEMU doesn't start with the
-following configuration:
-qemu-system-x86_64 -name guest=win10 -machine pc,accel=kvm -cpu host,hv_vpindex,hv_synic ...
-qemu-system-x86_64: failed to turn on HyperV SynIC in KVM: Invalid argument
-qemu-system-x86_64: kvm_init_vcpu failed: Invalid argument
-If I add 'kernel-irqchip=split' or ',kernel-irqchip=on' it starts as
-usual. I bisected this to the following commit:
-commit 11bc4a13d1f4b07dafbd1dda4d4bf0fdd7ad65f2 (HEAD, refs/bisect/bad)
-Author: Paolo Bonzini <
-address@hidden
->
-Date:  Wed Nov 13 10:56:53 2019 +0100
-  kvm: convert "-machine kernel_irqchip" to an accelerator property
-so apparently we now default to 'kernel_irqchip=off'. Is this the desired
-behavior?
---
-Vitaly
-
-Commit 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an
-accelerator property") moves kernel_irqchip property from "-machine" to
-"-accel kvm", but it forgets to set the default value of
-kernel_irqchip_allowed and kernel_irqchip_split.
-
-Also cleaning up the three useless members (kernel_irqchip_allowed,
-kernel_irqchip_required, kernel_irqchip_split) in struct MachineState.
-
-Fixes: 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an accelerator
-property")
-Signed-off-by: Xiaoyao Li <address@hidden>
----
- accel/kvm/kvm-all.c | 3 +++
- include/hw/boards.h | 3 ---
- 2 files changed, 3 insertions(+), 3 deletions(-)
-
-diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
-index b2f1a5bcb5ef..40f74094f8d3 100644
---- a/accel/kvm/kvm-all.c
-+++ b/accel/kvm/kvm-all.c
-@@ -3044,8 +3044,11 @@ bool kvm_kernel_irqchip_split(void)
- static void kvm_accel_instance_init(Object *obj)
- {
- KVMState *s = KVM_STATE(obj);
-+ MachineClass *mc = MACHINE_GET_CLASS(current_machine);
-
- s->kvm_shadow_mem = -1;
-+ s->kernel_irqchip_allowed = true;
-+ s->kernel_irqchip_split = mc->default_kernel_irqchip_split;
- }
-
- static void kvm_accel_class_init(ObjectClass *oc, void *data)
-diff --git a/include/hw/boards.h b/include/hw/boards.h
-index 61f8bb8e5a42..fb1b43d5b972 100644
---- a/include/hw/boards.h
-+++ b/include/hw/boards.h
-@@ -271,9 +271,6 @@ struct MachineState {
-
- /*< public >*/
-
-- bool kernel_irqchip_allowed;
-- bool kernel_irqchip_required;
-- bool kernel_irqchip_split;
- char *dtb;
- char *dumpdtb;
- int phandle_start;
---
-2.19.1
-
-On Sat, 28 Dec 2019 at 09:48, Xiaoyao Li <
-address@hidden
-> wrote:
-Commit 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an
-accelerator property") moves kernel_irqchip property from "-machine" to
-"-accel kvm", but it forgets to set the default value of
-kernel_irqchip_allowed and kernel_irqchip_split.
-Also cleaning up the three useless members (kernel_irqchip_allowed,
-kernel_irqchip_required, kernel_irqchip_split) in struct MachineState.
-Fixes: 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an accelerator property")
-Signed-off-by: Xiaoyao Li <
-address@hidden
->
-Please also add a Reported-by line for Vitaly Kuznetsov.
---- - accel/kvm/kvm-all.c | 3 +++ - include/hw/boards.h | 3 --- - 2 files changed, 3 insertions(+), 3 deletions(-) -diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c -index b2f1a5bcb5ef..40f74094f8d3 100644 ---- a/accel/kvm/kvm-all.c -+++ b/accel/kvm/kvm-all.c -@@ -3044,8 +3044,11 @@ bool kvm_kernel_irqchip_split(void) - static void kvm_accel_instance_init(Object *obj) - { -   KVMState *s = KVM_STATE(obj); -+  MachineClass *mc = MACHINE_GET_CLASS(current_machine); -   s->kvm_shadow_mem = -1; -+  s->kernel_irqchip_allowed = true; -+  s->kernel_irqchip_split = mc->default_kernel_irqchip_split; -Can you initialize this from the init_machine method instead of assuming that current_machine has been initialized earlier? -Thanks for the quick fix! -Paolo - } - static void kvm_accel_class_init(ObjectClass *oc, void *data) -diff --git a/include/hw/boards.h b/include/hw/boards.h -index 61f8bb8e5a42..fb1b43d5b972 100644 ---- a/include/hw/boards.h -+++ b/include/hw/boards.h -@@ -271,9 +271,6 @@ struct MachineState { -   /*< public >*/ --  bool kernel_irqchip_allowed; --  bool kernel_irqchip_required; --  bool kernel_irqchip_split; -   char *dtb; -   char *dumpdtb; -   int phandle_start; --- -2.19.1 - -On Sat, 2019-12-28 at 10:02 +0000, Paolo Bonzini wrote: -> -> -> -Il sab 28 dic 2019, 09:48 Xiaoyao Li <address@hidden> ha scritto: -> -> Commit 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an -> -> accelerator property") moves kernel_irqchip property from "-machine" to -> -> "-accel kvm", but it forgets to set the default value of -> -> kernel_irqchip_allowed and kernel_irqchip_split. -> -> -> -> Also cleaning up the three useless members (kernel_irqchip_allowed, -> -> kernel_irqchip_required, kernel_irqchip_split) in struct MachineState. -> -> -> -> Fixes: 11bc4a13d1f4 ("kvm: convert "-machine kernel_irqchip" to an -> -> accelerator property") -> -> Signed-off-by: Xiaoyao Li <address@hidden> -> -> -Please also add a Reported-by line for Vitaly Kuznetsov. 
-Sure. - -> -> --- -> -> accel/kvm/kvm-all.c | 3 +++ -> -> include/hw/boards.h | 3 --- -> -> 2 files changed, 3 insertions(+), 3 deletions(-) -> -> -> -> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c -> -> index b2f1a5bcb5ef..40f74094f8d3 100644 -> -> --- a/accel/kvm/kvm-all.c -> -> +++ b/accel/kvm/kvm-all.c -> -> @@ -3044,8 +3044,11 @@ bool kvm_kernel_irqchip_split(void) -> -> static void kvm_accel_instance_init(Object *obj) -> -> { -> -> KVMState *s = KVM_STATE(obj); -> -> + MachineClass *mc = MACHINE_GET_CLASS(current_machine); -> -> -> -> s->kvm_shadow_mem = -1; -> -> + s->kernel_irqchip_allowed = true; -> -> + s->kernel_irqchip_split = mc->default_kernel_irqchip_split; -> -> -Can you initialize this from the init_machine method instead of assuming that -> -current_machine has been initialized earlier? -OK, will do it in v2. - -> -Thanks for the quick fix! -BTW, it seems that this patch makes kernel_irqchip default on to workaround the -bug. -However, when explicitly configuring kernel_irqchip=off, guest still fails -booting due to "KVM: failed to send PV IPI: -95" with a latest upstream kernel -ubuntu guest. Any idea about this? - -> -Paolo -> -> } -> -> -> -> static void kvm_accel_class_init(ObjectClass *oc, void *data) -> -> diff --git a/include/hw/boards.h b/include/hw/boards.h -> -> index 61f8bb8e5a42..fb1b43d5b972 100644 -> -> --- a/include/hw/boards.h -> -> +++ b/include/hw/boards.h -> -> @@ -271,9 +271,6 @@ struct MachineState { -> -> -> -> /*< public >*/ -> -> -> -> - bool kernel_irqchip_allowed; -> -> - bool kernel_irqchip_required; -> -> - bool kernel_irqchip_split; -> -> char *dtb; -> -> char *dumpdtb; -> -> int phandle_start; - -Il sab 28 dic 2019, 10:24 Xiaoyao Li < -address@hidden -> ha scritto: -BTW, it seems that this patch makes kernel_irqchip default on to workaround the -bug. 
-However, when explicitly configuring kernel_irqchip=off, guest still fails -booting due to "KVM: failed to send PV IPI: -95" with a latest upstream kernel -ubuntu guest. Any idea about this? -We need to clear the PV IPI feature for userspace irqchip. Are you using -cpu host by chance? -Paolo -> Paolo -> > } -> > -> > static void kvm_accel_class_init(ObjectClass *oc, void *data) -> > diff --git a/include/hw/boards.h b/include/hw/boards.h -> > index 61f8bb8e5a42..fb1b43d5b972 100644 -> > --- a/include/hw/boards.h -> > +++ b/include/hw/boards.h -> > @@ -271,9 +271,6 @@ struct MachineState { -> > -> >   /*< public >*/ -> > -> > -  bool kernel_irqchip_allowed; -> > -  bool kernel_irqchip_required; -> > -  bool kernel_irqchip_split; -> >   char *dtb; -> >   char *dumpdtb; -> >   int phandle_start; - -On Sat, 2019-12-28 at 10:57 +0000, Paolo Bonzini wrote: -> -> -> -Il sab 28 dic 2019, 10:24 Xiaoyao Li <address@hidden> ha scritto: -> -> BTW, it seems that this patch makes kernel_irqchip default on to workaround -> -> the -> -> bug. -> -> However, when explicitly configuring kernel_irqchip=off, guest still fails -> -> booting due to "KVM: failed to send PV IPI: -95" with a latest upstream -> -> kernel -> -> ubuntu guest. Any idea about this? -> -> -We need to clear the PV IPI feature for userspace irqchip. Are you using -cpu -> -host by chance? -Yes, I used -cpu host. - -After using "-cpu host,-kvm-pv-ipi" with kernel_irqchip=off, it can boot -successfully. 
- -> -Paolo -> -> -> > Paolo -> -> > > } -> -> > > -> -> > > static void kvm_accel_class_init(ObjectClass *oc, void *data) -> -> > > diff --git a/include/hw/boards.h b/include/hw/boards.h -> -> > > index 61f8bb8e5a42..fb1b43d5b972 100644 -> -> > > --- a/include/hw/boards.h -> -> > > +++ b/include/hw/boards.h -> -> > > @@ -271,9 +271,6 @@ struct MachineState { -> -> > > -> -> > > /*< public >*/ -> -> > > -> -> > > - bool kernel_irqchip_allowed; -> -> > > - bool kernel_irqchip_required; -> -> > > - bool kernel_irqchip_split; -> -> > > char *dtb; -> -> > > char *dumpdtb; -> -> > > int phandle_start; -> -> - diff --git a/results/classifier/02/other/64571620 b/results/classifier/02/other/64571620 deleted file mode 100644 index c3d370815..000000000 --- a/results/classifier/02/other/64571620 +++ /dev/null @@ -1,786 +0,0 @@ -other: 0.922 -mistranslation: 0.917 -semantic: 0.903 -instruction: 0.894 -boot: 0.879 - -[BUG] Migration hv_time rollback - -Hi, - -We are experiencing timestamp rollbacks during live-migration of -Windows 10 guests with the following qemu configuration (linux 5.4.46 -and qemu master): -``` -$ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -``` - -I have tracked the bug to the fact that `kvmclock` is not exposed and -disabled from qemu PoV but is in fact used by `hv-time` (in KVM). - -I think we should enable the `kvmclock` (qemu device) if `hv-time` is -present and add Hyper-V support for the `kvmclock_current_nsec` -function. - -I'm asking for advice because I am unsure this is the _right_ approach -and how to keep migration compatibility between qemu versions. - -Thank you all, - --- -Antoine 'xdbob' Damhet -signature.asc -Description: -PGP signature - -cc'ing in Vitaly who knows about the hv stuff. 
- -* Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> -Hi, -> -> -We are experiencing timestamp rollbacks during live-migration of -> -Windows 10 guests with the following qemu configuration (linux 5.4.46 -> -and qemu master): -> -``` -> -$ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -> -``` -How big a jump are you seeing, and how did you notice it in the guest? - -Dave - -> -I have tracked the bug to the fact that `kvmclock` is not exposed and -> -disabled from qemu PoV but is in fact used by `hv-time` (in KVM). -> -> -I think we should enable the `kvmclock` (qemu device) if `hv-time` is -> -present and add Hyper-V support for the `kvmclock_current_nsec` -> -function. -> -> -I'm asking for advice because I am unsure this is the _right_ approach -> -and how to keep migration compatibility between qemu versions. -> -> -Thank you all, -> -> --- -> -Antoine 'xdbob' Damhet --- -Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK - -"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: - -> -cc'ing in Vitaly who knows about the hv stuff. -> -cc'ing Marcelo who knows about clocksources :-) - -> -* Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> -> Hi, -> -> -> -> We are experiencing timestamp rollbacks during live-migration of -> -> Windows 10 guests -Are you migrating to the same hardware (with the same TSC frequency)? Is -TSC used as the clocksource on the host? - -> -> with the following qemu configuration (linux 5.4.46 -> -> and qemu master): -> -> ``` -> -> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -> -> ``` -Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is -not going to check for KVM identification anyway so we pretend we're -Hyper-V. - -Also, have you tried adding more Hyper-V enlightenments? - -> -> -How big a jump are you seeing, and how did you notice it in the guest? 
-> -> -Dave -> -> -> I have tracked the bug to the fact that `kvmclock` is not exposed and -> -> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). -> -> -> -> I think we should enable the `kvmclock` (qemu device) if `hv-time` is -> -> present and add Hyper-V support for the `kvmclock_current_nsec` -> -> function. -AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by -the guest: - - if (!(env->system_time_msr & 1ULL)) { - /* KVM clock not active */ - return 0; - } - -and this is (and way) always false for Windows guests. - -> -> -> -> I'm asking for advice because I am unsure this is the _right_ approach -> -> and how to keep migration compatibility between qemu versions. -> -> -> -> Thank you all, -> -> -> -> -- -> -> Antoine 'xdbob' Damhet --- -Vitaly - -On Wed, Sep 16, 2020 at 01:59:43PM +0200, Vitaly Kuznetsov wrote: -> -"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: -> -> -> cc'ing in Vitaly who knows about the hv stuff. -> -> -> -> -cc'ing Marcelo who knows about clocksources :-) -> -> -> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> ->> Hi, -> ->> -> ->> We are experiencing timestamp rollbacks during live-migration of -> ->> Windows 10 guests -> -> -Are you migrating to the same hardware (with the same TSC frequency)? Is -> -TSC used as the clocksource on the host? -Yes we are migrating to the exact same hardware. And yes TSC is used as -a clocksource in the host (but the bug is still happening with `hpet` as -a clocksource). - -> -> ->> with the following qemu configuration (linux 5.4.46 -> ->> and qemu master): -> ->> ``` -> ->> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -> ->> ``` -> -> -Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is -> -not going to check for KVM identification anyway so we pretend we're -> -Hyper-V. 
-Some softwares explicitly checks for the presence of KVM and then crash -if they find it in CPUID :/ - -> -> -Also, have you tried adding more Hyper-V enlightenments? -Yes, I published a stripped-down command-line for a minimal reproducer -but even `hv-frequencies` and `hv-reenlightenment` don't help. - -> -> -> -> -> How big a jump are you seeing, and how did you notice it in the guest? -> -> -> -> Dave -> -> -> ->> I have tracked the bug to the fact that `kvmclock` is not exposed and -> ->> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). -> ->> -> ->> I think we should enable the `kvmclock` (qemu device) if `hv-time` is -> ->> present and add Hyper-V support for the `kvmclock_current_nsec` -> ->> function. -> -> -AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by -> -the guest: -> -> -if (!(env->system_time_msr & 1ULL)) { -> -/* KVM clock not active */ -> -return 0; -> -} -> -> -and this is (and way) always false for Windows guests. -Hooo, I missed this piece. When is `clock_is_reliable` expected to be -false ? Because if it is I still think we should be able to query at -least `HV_X64_MSR_REFERENCE_TSC` - -> -> ->> -> ->> I'm asking for advice because I am unsure this is the _right_ approach -> ->> and how to keep migration compatibility between qemu versions. -> ->> -> ->> Thank you all, -> ->> -> ->> -- -> ->> Antoine 'xdbob' Damhet -> -> --- -> -Vitaly -> --- -Antoine 'xdbob' Damhet -signature.asc -Description: -PGP signature - -On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: -> -cc'ing in Vitaly who knows about the hv stuff. -Thanks - -> -> -* Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> -> Hi, -> -> -> -> We are experiencing timestamp rollbacks during live-migration of -> -> Windows 10 guests with the following qemu configuration (linux 5.4.46 -> -> and qemu master): -> -> ``` -> -> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] 
-> -> ``` -> -> -How big a jump are you seeing, and how did you notice it in the guest? -I'm seeing jumps of about the guest uptime (indicating a reset of the -counter). It's expected because we won't call `KVM_SET_CLOCK` to -restore any value. - -We first noticed it because after some migrations `dwm.exe` crashes with -the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in -the past." error code. - -I can also confirm the following hack makes the behavior disappear: - -``` -diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c -index 64283358f9..f334bdf35f 100644 ---- a/hw/i386/kvm/clock.c -+++ b/hw/i386/kvm/clock.c -@@ -332,11 +332,7 @@ void kvmclock_create(void) - { - X86CPU *cpu = X86_CPU(first_cpu); - -- if (kvm_enabled() && -- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { -- sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -- } -+ sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); - } - - static void kvmclock_register_types(void) -diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c -index 32b1453e6a..11d980ba85 100644 ---- a/hw/i386/pc_piix.c -+++ b/hw/i386/pc_piix.c -@@ -158,9 +158,7 @@ static void pc_init1(MachineState *machine, - - x86_cpus_init(x86ms, pcmc->default_cpu_version); - -- if (kvm_enabled() && pcmc->kvmclock_enabled) { -- kvmclock_create(); -- } -+ kvmclock_create(); - - if (pcmc->pci_enabled) { - pci_memory = g_new(MemoryRegion, 1); -``` - -> -> -Dave -> -> -> I have tracked the bug to the fact that `kvmclock` is not exposed and -> -> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). -> -> -> -> I think we should enable the `kvmclock` (qemu device) if `hv-time` is -> -> present and add Hyper-V support for the `kvmclock_current_nsec` -> -> function. -> -> -> -> I'm asking for advice because I am unsure this is the _right_ approach -> -> and how to keep migration compatibility between qemu versions. 
-> -> -> -> Thank you all, -> -> -> -> -- -> -> Antoine 'xdbob' Damhet -> -> -> --- -> -Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK -> --- -Antoine 'xdbob' Damhet -signature.asc -Description: -PGP signature - -Antoine Damhet <antoine.damhet@blade-group.com> writes: - -> -On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: -> -> cc'ing in Vitaly who knows about the hv stuff. -> -> -Thanks -> -> -> -> -> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> -> > Hi, -> -> > -> -> > We are experiencing timestamp rollbacks during live-migration of -> -> > Windows 10 guests with the following qemu configuration (linux 5.4.46 -> -> > and qemu master): -> -> > ``` -> -> > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -> -> > ``` -> -> -> -> How big a jump are you seeing, and how did you notice it in the guest? -> -> -I'm seeing jumps of about the guest uptime (indicating a reset of the -> -counter). It's expected because we won't call `KVM_SET_CLOCK` to -> -restore any value. -> -> -We first noticed it because after some migrations `dwm.exe` crashes with -> -the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in -> -the past." error code. -> -> -I can also confirm the following hack makes the behavior disappear: -> -> -``` -> -diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c -> -index 64283358f9..f334bdf35f 100644 -> ---- a/hw/i386/kvm/clock.c -> -+++ b/hw/i386/kvm/clock.c -> -@@ -332,11 +332,7 @@ void kvmclock_create(void) -> -{ -> -X86CPU *cpu = X86_CPU(first_cpu); -> -> -- if (kvm_enabled() && -> -- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -> -- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { -> -- sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -> -- } -> -+ sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -> -} -> -Oh, I think I see what's going on. 
When you add 'kvm=off' -cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so -kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on -migration. - -In case we really want to support 'kvm=off' I think we can add Hyper-V -features check here along with KVM, this should do the job. - --- -Vitaly - -Vitaly Kuznetsov <vkuznets@redhat.com> writes: - -> -Antoine Damhet <antoine.damhet@blade-group.com> writes: -> -> -> On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: -> ->> cc'ing in Vitaly who knows about the hv stuff. -> -> -> -> Thanks -> -> -> ->> -> ->> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: -> ->> > Hi, -> ->> > -> ->> > We are experiencing timestamp rollbacks during live-migration of -> ->> > Windows 10 guests with the following qemu configuration (linux 5.4.46 -> ->> > and qemu master): -> ->> > ``` -> ->> > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] -> ->> > ``` -> ->> -> ->> How big a jump are you seeing, and how did you notice it in the guest? -> -> -> -> I'm seeing jumps of about the guest uptime (indicating a reset of the -> -> counter). It's expected because we won't call `KVM_SET_CLOCK` to -> -> restore any value. -> -> -> -> We first noticed it because after some migrations `dwm.exe` crashes with -> -> the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in -> -> the past." error code. 
-> -> -> -> I can also confirm the following hack makes the behavior disappear: -> -> -> -> ``` -> -> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c -> -> index 64283358f9..f334bdf35f 100644 -> -> --- a/hw/i386/kvm/clock.c -> -> +++ b/hw/i386/kvm/clock.c -> -> @@ -332,11 +332,7 @@ void kvmclock_create(void) -> -> { -> -> X86CPU *cpu = X86_CPU(first_cpu); -> -> -> -> - if (kvm_enabled() && -> -> - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -> -> - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) -> -> { -> -> - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -> -> - } -> -> + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -> -> } -> -> -> -> -> -Oh, I think I see what's going on. When you add 'kvm=off' -> -cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so -> -kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on -> -migration. -> -> -In case we really want to support 'kvm=off' I think we can add Hyper-V -> -features check here along with KVM, this should do the job. -Does the untested - -diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c -index 64283358f91d..e03b2ca6d8f6 100644 ---- a/hw/i386/kvm/clock.c -+++ b/hw/i386/kvm/clock.c -@@ -333,8 +333,9 @@ void kvmclock_create(void) - X86CPU *cpu = X86_CPU(first_cpu); - - if (kvm_enabled() && -- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { -+ ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -+ (1ULL << KVM_FEATURE_CLOCKSOURCE2))) -|| -+ (cpu->env.features[FEAT_HYPERV_EAX] & HV_TIME_REF_COUNT_AVAILABLE))) { - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); - } - } - -help? - -(I don't think we need to remove all 'if (kvm_enabled())' checks from -machine types as 'kvm=off' should not be related). - --- -Vitaly - -On Wed, Sep 16, 2020 at 02:50:56PM +0200, Vitaly Kuznetsov wrote: -[...] - -> ->> -> -> -> -> -> -> Oh, I think I see what's going on. 
When you add 'kvm=off' -> -> cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so -> -> kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on -> -> migration. -> -> -> -> In case we really want to support 'kvm=off' I think we can add Hyper-V -> -> features check here along with KVM, this should do the job. -> -> -Does the untested -> -> -diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c -> -index 64283358f91d..e03b2ca6d8f6 100644 -> ---- a/hw/i386/kvm/clock.c -> -+++ b/hw/i386/kvm/clock.c -> -@@ -333,8 +333,9 @@ void kvmclock_create(void) -> -X86CPU *cpu = X86_CPU(first_cpu); -> -> -if (kvm_enabled() && -> -- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -> -- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { -> -+ ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | -> -+ (1ULL << -> -KVM_FEATURE_CLOCKSOURCE2))) || -> -+ (cpu->env.features[FEAT_HYPERV_EAX] & -> -HV_TIME_REF_COUNT_AVAILABLE))) { -> -sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); -> -} -> -} -> -> -help? -It appears to work :) - -> -> -(I don't think we need to remove all 'if (kvm_enabled())' checks from -> -machine types as 'kvm=off' should not be related). -Indeed (I didn't look at the macro, it was just quick & dirty). - -> -> --- -> -Vitaly -> -> --- -Antoine 'xdbob' Damhet -signature.asc -Description: -PGP signature - -On 16/09/20 13:29, Dr. David Alan Gilbert wrote: -> -> I have tracked the bug to the fact that `kvmclock` is not exposed and -> -> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). -> -> -> -> I think we should enable the `kvmclock` (qemu device) if `hv-time` is -> -> present and add Hyper-V support for the `kvmclock_current_nsec` -> -> function. -Yes, this seems correct. I would have to check but it may even be -better to _always_ send kvmclock data in the live migration stream. - -Paolo - -Paolo Bonzini <pbonzini@redhat.com> writes: - -> -On 16/09/20 13:29, Dr. 
David Alan Gilbert wrote:
->>> I have tracked the bug to the fact that `kvmclock` is not exposed and
->>> disabled from qemu PoV but is in fact used by `hv-time` (in KVM).
->>>
->>> I think we should enable the `kvmclock` (qemu device) if `hv-time` is
->>> present and add Hyper-V support for the `kvmclock_current_nsec`
->>> function.
->
-> Yes, this seems correct. I would have to check but it may even be
-> better to _always_ send kvmclock data in the live migration stream.
->
-The question I have is: with 'kvm=off', do we actually restore TSC
-reading on migration? (and I guess the answer is 'no' or Hyper-V TSC
-page would 'just work' I guess). So yea, maybe dropping the
-'cpu->env.features[FEAT_KVM]' check is the right fix.
-
---
-Vitaly
-
diff --git a/results/classifier/02/other/65781993 b/results/classifier/02/other/65781993
deleted file mode 100644
index 5f9b600da..000000000
--- a/results/classifier/02/other/65781993
+++ /dev/null
@@ -1,2794 +0,0 @@
-other: 0.727
-instruction: 0.670
-semantic: 0.665
-mistranslation: 0.650
-boot: 0.635
-
-[Qemu-devel] 答复: Re: 答复: Re: [BUG]COLO failover hang
-
-Thank you.
-
-I have tested already.
-
-When the Primary Node panics, the Secondary Node qemu hangs at the same place.
-
-According to http://wiki.qemu-project.org/Features/COLO , killing the Primary
-Node qemu will not produce the problem, but a Primary Node panic can.
-
-I think it is because the channel does not support the
-QIO_CHANNEL_FEATURE_SHUTDOWN feature.
-
-When failover happens, channel_shutdown cannot shut down the channel,
-
-so colo_process_incoming_thread will hang at recvmsg.
-
-I test a patch:
-
-diff --git a/migration/socket.c b/migration/socket.c
-
-index 13966f1..d65a0ea 100644
-
---- a/migration/socket.c
-
-+++ b/migration/socket.c
-
-@@ -147,8 +147,9 @@ static gboolean socket_accept_incoming_migration(QIOChannel
-*ioc,
-
- }
-
- trace_migration_socket_incoming_accepted();
-
- qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming");
-+ qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
-
- migration_channel_process_incoming(migrate_get_current(),
-
- QIO_CHANNEL(sioc));
-
- object_unref(OBJECT(sioc));
-
-
-My test will not hang any more.
-
-
-
-
-
-Original Mail
-
-From: address@hidden
-To: 王广10165992 address@hidden
-Cc: address@hidden address@hidden
-Date: 2017-03-21 15:58
-Subject: Re: [Qemu-devel] 答复: Re: [BUG]COLO failover hang
-
-
-
-
-Hi,Wang.
-
-You can test this branch:
-https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
-and please follow the wiki to make sure your configuration is correct.
-http://wiki.qemu-project.org/Features/COLO
-Thanks
-
-Zhang Chen
-
-
-On 03/21/2017 03:27 PM, address@hidden wrote:
->
-> hi.
->
-> I tested the git qemu master and have the same problem.
->
-> (gdb) bt
->
-> #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880,
-> niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
->
-> #1 0x00007f658e4aa0c2 in qio_channel_read
-> (address@hidden, address@hidden "",
-> address@hidden, address@hidden) at io/channel.c:114
->
-> #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>,
-> buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at
-> migration/qemu-file-channel.c:78
->
-> #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at
-> migration/qemu-file.c:295
->
-> #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden,
-> address@hidden) at migration/qemu-file.c:555
->
-> #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at
-> migration/qemu-file.c:568
->
-> #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at
-> migration/qemu-file.c:648
->
-> #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800,
-> address@hidden) at migration/colo.c:244
->
-> #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized
-> out>, address@hidden,
-> address@hidden)
->
-> at migration/colo.c:264
->
-> #9 0x00007f658e3e740e in colo_process_incoming_thread
-> (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577
->
-> #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
->
-> #11 0x00007f65881983ed in clone () from /lib64/libc.so.6
->
-> (gdb) p ioc->name
->
-> $2 = 0x7f658ff7d5c0 "migration-socket-incoming"
->
-> (gdb) p ioc->features    (does not support QIO_CHANNEL_FEATURE_SHUTDOWN)
->
-> $3 = 0
->
->
-> (gdb) bt
->
-> #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90,
-> condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
->
-> #1 0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at
-> gmain.c:3054
->
-> #2 g_main_context_dispatch (context=<optimized out>,
-> address@hidden) at gmain.c:3630
->
-> #3 
0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
->
-> #4 os_host_main_loop_wait (timeout=<optimized out>) at
-> util/main-loop.c:258
->
-> #5 main_loop_wait (address@hidden) at
-> util/main-loop.c:506
->
-> #6 0x00007fdccb526187 in main_loop () at vl.c:1898
->
-> #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
-> out>) at vl.c:4709
->
-> (gdb) p ioc->features
->
-> $1 = 6
->
-> (gdb) p ioc->name
->
-> $2 = 0x7fdcce1b1ab0 "migration-socket-listener"
->
->
-> Maybe socket_accept_incoming_migration should
-> call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?
->
->
-> thank you.
->
->
->
->
->
-> Original Mail
-> address@hidden
-> address@hidden
-> address@hidden@huawei.com>
-> *Date:* 2017-03-16 14:46
-> *Subject:* *Re: [Qemu-devel] COLO failover hang*
->
->
->
->
-> On 03/15/2017 05:06 PM, wangguang wrote:
-> > am testing QEMU COLO feature described here [QEMU
-> > Wiki](http://wiki.qemu-project.org/Features/COLO).
-> >
-> > When the Primary Node panics, the Secondary Node qemu hangs.
-> > hang at recvmsg in qio_channel_socket_readv.
-> > And I run { 'execute': 'nbd-server-stop' } and { "execute":
-> > "x-colo-lost-heartbeat" } in Secondary VM's
-> > monitor, the Secondary Node qemu still hangs at recvmsg .
-> >
-> > I found that the colo in qemu is not complete yet.
-> > Do the colo have any plan for development?
->
-> Yes, We are developing. You can see some of patch we pushing.
->
-> > Has anyone ever run it successfully? Any help is appreciated!
->
-> In our internal version can run it successfully,
-> The failover detail you can ask Zhanghailiang for help.
-ï¼ Next time if you have some question about COLO, -ï¼ please cc me and zhanghailiang address@hidden -ï¼ -ï¼ -ï¼ Thanks -ï¼ Zhang Chen -ï¼ -ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ centos7.2+qemu2.7.50 -ï¼ ï¼ (gdb) bt -ï¼ ï¼ #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0 -ï¼ ï¼ #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=ï¼optimized outï¼, -ï¼ ï¼ iov=ï¼optimized outï¼, niov=ï¼optimized outï¼, fds=0x0, nfds=0x0, errp=0x0) at -ï¼ ï¼ io/channel-socket.c:497 -ï¼ ï¼ #2 0x00007f3e03329472 in qio_channel_read (address@hidden, -ï¼ ï¼ address@hidden "", address@hidden, -ï¼ ï¼ address@hidden) at io/channel.c:97 -ï¼ ï¼ #3 0x00007f3e032750e0 in channel_get_buffer (opaque=ï¼optimized outï¼, -ï¼ ï¼ buf=0x7f3e05910f38 "", pos=ï¼optimized outï¼, size=32768) at -ï¼ ï¼ migration/qemu-file-channel.c:78 -ï¼ ï¼ #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at -ï¼ ï¼ migration/qemu-file.c:257 -ï¼ ï¼ #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, -ï¼ ï¼ address@hidden) at migration/qemu-file.c:510 -ï¼ ï¼ #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at -ï¼ ï¼ migration/qemu-file.c:523 -ï¼ ï¼ #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at -ï¼ ï¼ migration/qemu-file.c:603 -ï¼ ï¼ #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, -ï¼ ï¼ address@hidden) at migration/colo.c:215 -ï¼ ï¼ #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48, -ï¼ ï¼ checkpoint_request=ï¼synthetic pointerï¼, f=ï¼optimized outï¼) at -ï¼ ï¼ migration/colo.c:546 -ï¼ ï¼ #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at -ï¼ ï¼ migration/colo.c:649 -ï¼ ï¼ #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0 -ï¼ ï¼ #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6 -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -- -ï¼ ï¼ View this message in context: -http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html -ï¼ ï¼ Sent from the Developer mailing list archive at Nabble.com. 
-ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ -ï¼ -- -ï¼ Thanks -ï¼ Zhang Chen -ï¼ -ï¼ -ï¼ -ï¼ -ï¼ - --- -Thanks -Zhang Chen - -Hi, - -On 2017/3/21 16:10, address@hidden wrote: -Thank youã - -I have test areadyã - -When the Primary Node panic,the Secondary Node qemu hang at the same placeã - -Incorrding -http://wiki.qemu-project.org/Features/COLO -ï¼kill Primary Node qemu -will not produce the problem,but Primary Node panic canã - -I think due to the feature of channel does not support -QIO_CHANNEL_FEATURE_SHUTDOWN. -Yes, you are right, when we do failover for primary/secondary VM, we will -shutdown the related -fd in case it is stuck in the read/write fd. - -It seems that you didn't follow the above introduction exactly to do the test. -Could you -share your test procedures ? Especially the commands used in the test. - -Thanks, -Hailiang -when failover,channel_shutdown could not shut down the channel. - - -so the colo_process_incoming_thread will hang at recvmsg. - - -I test a patch: - - -diff --git a/migration/socket.c b/migration/socket.c - - -index 13966f1..d65a0ea 100644 - - ---- a/migration/socket.c - - -+++ b/migration/socket.c - - -@@ -147,8 +147,9 @@ static gboolean socket_accept_incoming_migration(QIOChannel -*ioc, - - - } - - - - - - trace_migration_socket_incoming_accepted() - - - - - - qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming") - - -+ qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN) - - - migration_channel_process_incoming(migrate_get_current(), - - - QIO_CHANNEL(sioc)) - - - object_unref(OBJECT(sioc)) - - - - -My test will not hang any more. - - - - - - - - - - - - - - - - - -åå§é®ä»¶ - - - -åä»¶äººï¼ address@hidden -æ¶ä»¶äººï¼ç广10165992 address@hidden -æéäººï¼ address@hidden address@hidden -æ¥ æ ï¼2017å¹´03æ21æ¥ 15:58 -主 é¢ ï¼Re: [Qemu-devel] çå¤: Re: [BUG]COLO failover hang - - - - - -Hi,Wang. 
-
-You can test this branch:
-https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk
-and please follow the wiki to make sure your own configuration is correct:
-http://wiki.qemu-project.org/Features/COLO
-
-Thanks
-Zhang Chen
-
-On 03/21/2017 03:27 PM, address@hidden wrote:
-> hi.
->
-> I tested current git qemu master and it has the same problem.
->
-> [...snip: same gdb backtraces and quoted mails as above...]
-> [...]
->
-> --
-> Thanks
-> Zhang Chen
-
-Hi,
-
-Thanks for reporting this; I confirmed it in my test, and it is a bug.
-
-Though we tried to call qemu_file_shutdown() to shut down the related fd,
-in case the COLO thread/incoming thread is stuck in read/write() while we
-do failover, it didn't take effect, because all the fds used by COLO (also
-migration) have been wrapped by qio channels, and the shutdown API will not
-be called if we didn't qio_channel_set_feature(QIO_CHANNEL(sioc),
-QIO_CHANNEL_FEATURE_SHUTDOWN).
-
-Cc: Dr. David Alan Gilbert <address@hidden>
-
-I suspect migration cancel has the same problem: it may be stuck in write()
-if we try to cancel migration.
-
-void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
-                                 Error **errp)
-{
-    qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing");
-    migration_channel_connect(s, ioc, NULL);
-    ... ...
-
-We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc),
-QIO_CHANNEL_FEATURE_SHUTDOWN) above, and then:
-
-migrate_fd_cancel()
-{
-    ... ...
-    if (s->state == MIGRATION_STATUS_CANCELLING && f) {
-        qemu_file_shutdown(f);  /* --> This will not take effect. No? */
-    }
-}
-
-Thanks,
-Hailiang
-
-On 2017/3/21 16:10, address@hidden wrote:
-> [...snip: same report and patch as quoted above...]
-ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ ï¼ -ï¼ -ï¼ -- -ï¼ Thanks -ï¼ Zhang Chen -ï¼ -ï¼ -ï¼ -ï¼ -ï¼ - -* Hailiang Zhang (address@hidden) wrote: -> -Hi, -> -> -Thanks for reporting this, and i confirmed it in my test, and it is a bug. -> -> -Though we tried to call qemu_file_shutdown() to shutdown the related fd, in -> -case COLO thread/incoming thread is stuck in read/write() while do failover, -> -but it didn't take effect, because all the fd used by COLO (also migration) -> -has been wrapped by qio channel, and it will not call the shutdown API if -> -we didn't qio_channel_set_feature(QIO_CHANNEL(sioc), -> -QIO_CHANNEL_FEATURE_SHUTDOWN). -> -> -Cc: Dr. David Alan Gilbert <address@hidden> -> -> -I doubted migration cancel has the same problem, it may be stuck in write() -> -if we tried to cancel migration. -> -> -void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error -> -**errp) -> -{ -> -qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing"); -> -migration_channel_connect(s, ioc, NULL); -> -... ... -> -We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc), -> -QIO_CHANNEL_FEATURE_SHUTDOWN) above, -> -and the -> -migrate_fd_cancel() -> -{ -> -... ... -> -if (s->state == MIGRATION_STATUS_CANCELLING && f) { -> -qemu_file_shutdown(f); --> This will not take effect. No ? -> -} -> -} -(cc'd in Daniel Berrange). -I see that we call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); -at the -top of qio_channel_socket_new; so I think that's safe isn't it? - -Dave - -> -Thanks, -> -Hailiang -> -> -On 2017/3/21 16:10, address@hidden wrote: -> -> Thank youã -> -> -> -> I have test areadyã -> -> -> -> When the Primary Node panic,the Secondary Node qemu hang at the same placeã -> -> -> -> Incorrding -http://wiki.qemu-project.org/Features/COLO -ï¼kill Primary Node -> -> qemu will not produce the problem,but Primary Node panic canã -> -> -> -> I think due to the feature of channel does not support -> -> QIO_CHANNEL_FEATURE_SHUTDOWN. 
-> -> -> -> -> -> when failover,channel_shutdown could not shut down the channel. -> -> -> -> -> -> so the colo_process_incoming_thread will hang at recvmsg. -> -> -> -> -> -> I test a patch: -> -> -> -> -> -> diff --git a/migration/socket.c b/migration/socket.c -> -> -> -> -> -> index 13966f1..d65a0ea 100644 -> -> -> -> -> -> --- a/migration/socket.c -> -> -> -> -> -> +++ b/migration/socket.c -> -> -> -> -> -> @@ -147,8 +147,9 @@ static gboolean -> -> socket_accept_incoming_migration(QIOChannel *ioc, -> -> -> -> -> -> } -> -> -> -> -> -> -> -> -> -> -> -> trace_migration_socket_incoming_accepted() -> -> -> -> -> -> -> -> -> -> -> -> qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming") -> -> -> -> -> -> + qio_channel_set_feature(QIO_CHANNEL(sioc), -> -> QIO_CHANNEL_FEATURE_SHUTDOWN) -> -> -> -> -> -> migration_channel_process_incoming(migrate_get_current(), -> -> -> -> -> -> QIO_CHANNEL(sioc)) -> -> -> -> -> -> object_unref(OBJECT(sioc)) -> -> -> -> -> -> -> -> -> -> My test will not hang any more. -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> -> åå§é®ä»¶ -> -> -> -> -> -> -> -> åä»¶äººï¼ address@hidden -> -> æ¶ä»¶äººï¼ç广10165992 address@hidden -> -> æéäººï¼ address@hidden address@hidden -> -> æ¥ æ ï¼2017å¹´03æ21æ¥ 15:58 -> -> 主 é¢ ï¼Re: [Qemu-devel] çå¤: Re: [BUG]COLO failover hang -> -> -> -> -> -> -> -> -> -> -> -> Hi,Wang. -> -> -> -> You can test this branch: -> -> -> -> -https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk -> -> -> -> and please follow wiki ensure your own configuration correctly. -> -> -> -> -http://wiki.qemu-project.org/Features/COLO -> -> -> -> -> -> Thanks -> -> -> -> Zhang Chen -> -> -> -> -> -> On 03/21/2017 03:27 PM, address@hidden wrote: -> -> ï¼ -> -> ï¼ hi. -> -> ï¼ -> -> ï¼ I test the git qemu master have the same problem. 
-> -> ï¼ -> -> ï¼ (gdb) bt -> -> ï¼ -> -> ï¼ #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, -> -> ï¼ niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461 -> -> ï¼ -> -> ï¼ #1 0x00007f658e4aa0c2 in qio_channel_read -> -> ï¼ (address@hidden, address@hidden "", -> -> ï¼ address@hidden, address@hidden) at io/channel.c:114 -> -> ï¼ -> -> ï¼ #2 0x00007f658e3ea990 in channel_get_buffer (opaque=ï¼optimized outï¼, -> -> ï¼ buf=0x7f65907cb838 "", pos=ï¼optimized outï¼, size=32768) at -> -> ï¼ migration/qemu-file-channel.c:78 -> -> ï¼ -> -> ï¼ #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at -> -> ï¼ migration/qemu-file.c:295 -> -> ï¼ -> -> ï¼ #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, -> -> ï¼ address@hidden) at migration/qemu-file.c:555 -> -> ï¼ -> -> ï¼ #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at -> -> ï¼ migration/qemu-file.c:568 -> -> ï¼ -> -> ï¼ #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at -> -> ï¼ migration/qemu-file.c:648 -> -> ï¼ -> -> ï¼ #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, -> -> ï¼ address@hidden) at migration/colo.c:244 -> -> ï¼ -> -> ï¼ #8 0x00007f658e3e681e in colo_receive_check_message (f=ï¼optimized -> -> ï¼ outï¼, address@hidden, -> -> ï¼ address@hidden) -> -> ï¼ -> -> ï¼ at migration/colo.c:264 -> -> ï¼ -> -> ï¼ #9 0x00007f658e3e740e in colo_process_incoming_thread -> -> ï¼ (opaque=0x7f658eb30360 ï¼mis_current.31286ï¼) at migration/colo.c:577 -> -> ï¼ -> -> ï¼ #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0 -> -> ï¼ -> -> ï¼ #11 0x00007f65881983ed in clone () from /lib64/libc.so.6 -> -> ï¼ -> -> ï¼ (gdb) p ioc-ï¼name -> -> ï¼ -> -> ï¼ $2 = 0x7f658ff7d5c0 "migration-socket-incoming" -> -> ï¼ -> -> ï¼ (gdb) p ioc-ï¼features Do not support QIO_CHANNEL_FEATURE_SHUTDOWN -> -> ï¼ -> -> ï¼ $3 = 0 -> -> ï¼ -> -> ï¼ -> -> ï¼ (gdb) bt -> -> ï¼ -> -> ï¼ #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90, -> -> ï¼ 
condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137 -> -> ï¼ -> -> ï¼ #1 0x00007fdcc6966350 in g_main_dispatch (context=ï¼optimized outï¼) at -> -> ï¼ gmain.c:3054 -> -> ï¼ -> -> ï¼ #2 g_main_context_dispatch (context=ï¼optimized outï¼, -> -> ï¼ address@hidden) at gmain.c:3630 -> -> ï¼ -> -> ï¼ #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213 -> -> ï¼ -> -> ï¼ #4 os_host_main_loop_wait (timeout=ï¼optimized outï¼) at -> -> ï¼ util/main-loop.c:258 -> -> ï¼ -> -> ï¼ #5 main_loop_wait (address@hidden) at -> -> ï¼ util/main-loop.c:506 -> -> ï¼ -> -> ï¼ #6 0x00007fdccb526187 in main_loop () at vl.c:1898 -> -> ï¼ -> -> ï¼ #7 main (argc=ï¼optimized outï¼, argv=ï¼optimized outï¼, envp=ï¼optimized -> -> ï¼ outï¼) at vl.c:4709 -> -> ï¼ -> -> ï¼ (gdb) p ioc-ï¼features -> -> ï¼ -> -> ï¼ $1 = 6 -> -> ï¼ -> -> ï¼ (gdb) p ioc-ï¼name -> -> ï¼ -> -> ï¼ $2 = 0x7fdcce1b1ab0 "migration-socket-listener" -> -> ï¼ -> -> ï¼ -> -> ï¼ May be socket_accept_incoming_migration should -> -> ï¼ call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?? -> -> ï¼ -> -> ï¼ -> -> ï¼ thank you. -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ åå§é®ä»¶ -> -> ï¼ address@hidden -> -> ï¼ address@hidden -> -> ï¼ address@hidden@huawei.comï¼ -> -> ï¼ *æ¥ æ ï¼*2017å¹´03æ16æ¥ 14:46 -> -> ï¼ *主 é¢ ï¼**Re: [Qemu-devel] COLO failover hang* -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ On 03/15/2017 05:06 PM, wangguang wrote: -> -> ï¼ ï¼ am testing QEMU COLO feature described here [QEMU -> -> ï¼ ï¼ Wiki]( -http://wiki.qemu-project.org/Features/COLO -). -> -> ï¼ ï¼ -> -> ï¼ ï¼ When the Primary Node panic,the Secondary Node qemu hang. -> -> ï¼ ï¼ hang at recvmsg in qio_channel_socket_readv. -> -> ï¼ ï¼ And I run { 'execute': 'nbd-server-stop' } and { "execute": -> -> ï¼ ï¼ "x-colo-lost-heartbeat" } in Secondary VM's -> -> ï¼ ï¼ monitor,the Secondary Node qemu still hang at recvmsg . -> -> ï¼ ï¼ -> -> ï¼ ï¼ I found that the colo in qemu is not complete yet. 
-> -> ï¼ ï¼ Do the colo have any plan for development? -> -> ï¼ -> -> ï¼ Yes, We are developing. You can see some of patch we pushing. -> -> ï¼ -> -> ï¼ ï¼ Has anyone ever run it successfully? Any help is appreciated! -> -> ï¼ -> -> ï¼ In our internal version can run it successfully, -> -> ï¼ The failover detail you can ask Zhanghailiang for help. -> -> ï¼ Next time if you have some question about COLO, -> -> ï¼ please cc me and zhanghailiang address@hidden -> -> ï¼ -> -> ï¼ -> -> ï¼ Thanks -> -> ï¼ Zhang Chen -> -> ï¼ -> -> ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ centos7.2+qemu2.7.50 -> -> ï¼ ï¼ (gdb) bt -> -> ï¼ ï¼ #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0 -> -> ï¼ ï¼ #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=ï¼optimized outï¼, -> -> ï¼ ï¼ iov=ï¼optimized outï¼, niov=ï¼optimized outï¼, fds=0x0, nfds=0x0, errp=0x0) -> -> at -> -> ï¼ ï¼ io/channel-socket.c:497 -> -> ï¼ ï¼ #2 0x00007f3e03329472 in qio_channel_read (address@hidden, -> -> ï¼ ï¼ address@hidden "", address@hidden, -> -> ï¼ ï¼ address@hidden) at io/channel.c:97 -> -> ï¼ ï¼ #3 0x00007f3e032750e0 in channel_get_buffer (opaque=ï¼optimized outï¼, -> -> ï¼ ï¼ buf=0x7f3e05910f38 "", pos=ï¼optimized outï¼, size=32768) at -> -> ï¼ ï¼ migration/qemu-file-channel.c:78 -> -> ï¼ ï¼ #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at -> -> ï¼ ï¼ migration/qemu-file.c:257 -> -> ï¼ ï¼ #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, -> -> ï¼ ï¼ address@hidden) at migration/qemu-file.c:510 -> -> ï¼ ï¼ #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at -> -> ï¼ ï¼ migration/qemu-file.c:523 -> -> ï¼ ï¼ #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at -> -> ï¼ ï¼ migration/qemu-file.c:603 -> -> ï¼ ï¼ #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, -> -> ï¼ ï¼ address@hidden) at migration/colo.c:215 -> -> ï¼ ï¼ #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48, -> -> ï¼ ï¼ checkpoint_request=ï¼synthetic 
pointerï¼, f=ï¼optimized outï¼) at -> -> ï¼ ï¼ migration/colo.c:546 -> -> ï¼ ï¼ #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at -> -> ï¼ ï¼ migration/colo.c:649 -> -> ï¼ ï¼ #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0 -> -> ï¼ ï¼ #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6 -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -- -> -> ï¼ ï¼ View this message in context: -> -> -http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html -> -> ï¼ ï¼ Sent from the Developer mailing list archive at Nabble.com. -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ ï¼ -> -> ï¼ -> -> ï¼ -- -> -> ï¼ Thanks -> -> ï¼ Zhang Chen -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> ï¼ -> -> -> --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On 2017/3/21 19:56, Dr. David Alan Gilbert wrote: -* Hailiang Zhang (address@hidden) wrote: -Hi, - -Thanks for reporting this, and i confirmed it in my test, and it is a bug. - -Though we tried to call qemu_file_shutdown() to shutdown the related fd, in -case COLO thread/incoming thread is stuck in read/write() while do failover, -but it didn't take effect, because all the fd used by COLO (also migration) -has been wrapped by qio channel, and it will not call the shutdown API if -we didn't qio_channel_set_feature(QIO_CHANNEL(sioc), -QIO_CHANNEL_FEATURE_SHUTDOWN). - -Cc: Dr. David Alan Gilbert <address@hidden> - -I doubted migration cancel has the same problem, it may be stuck in write() -if we tried to cancel migration. - -void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error -**errp) -{ - qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing"); - migration_channel_connect(s, ioc, NULL); - ... ... -We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc), -QIO_CHANNEL_FEATURE_SHUTDOWN) above, -and the -migrate_fd_cancel() -{ - ... ... - if (s->state == MIGRATION_STATUS_CANCELLING && f) { - qemu_file_shutdown(f); --> This will not take effect. 
No ? - } -} -(cc'd in Daniel Berrange). -I see that we call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); -at the -top of qio_channel_socket_new; so I think that's safe isn't it? -Hmm, you are right, this problem is only exist for the migration incoming fd, -thanks. -Dave -Thanks, -Hailiang - -On 2017/3/21 16:10, address@hidden wrote: -Thank youã - -I have test areadyã - -When the Primary Node panic,the Secondary Node qemu hang at the same placeã - -Incorrding -http://wiki.qemu-project.org/Features/COLO -ï¼kill Primary Node qemu -will not produce the problem,but Primary Node panic canã - -I think due to the feature of channel does not support -QIO_CHANNEL_FEATURE_SHUTDOWN. - - -when failover,channel_shutdown could not shut down the channel. - - -so the colo_process_incoming_thread will hang at recvmsg. - - -I test a patch: - - -diff --git a/migration/socket.c b/migration/socket.c - - -index 13966f1..d65a0ea 100644 - - ---- a/migration/socket.c - - -+++ b/migration/socket.c - - -@@ -147,8 +147,9 @@ static gboolean socket_accept_incoming_migration(QIOChannel -*ioc, - - - } - - - - - - trace_migration_socket_incoming_accepted() - - - - - - qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming") - - -+ qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN) - - - migration_channel_process_incoming(migrate_get_current(), - - - QIO_CHANNEL(sioc)) - - - object_unref(OBJECT(sioc)) - - - - -My test will not hang any more. - - - - - - - - - - - - - - - - - -åå§é®ä»¶ - - - -åä»¶äººï¼ address@hidden -æ¶ä»¶äººï¼ç广10165992 address@hidden -æéäººï¼ address@hidden address@hidden -æ¥ æ ï¼2017å¹´03æ21æ¥ 15:58 -主 é¢ ï¼Re: [Qemu-devel] çå¤: Re: [BUG]COLO failover hang - - - - - -Hi,Wang. - -You can test this branch: -https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk -and please follow wiki ensure your own configuration correctly. 
-http://wiki.qemu-project.org/Features/COLO
-
-Thanks
-
-Zhang Chen
-
-On 03/21/2017 03:27 PM, address@hidden wrote:
->
-> hi.
->
-> I tested the git qemu master and it has the same problem.
->
-> (gdb) bt
-> #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880,
-> niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461
-> #1 0x00007f658e4aa0c2 in qio_channel_read
-> (address@hidden, address@hidden "",
-> address@hidden, address@hidden) at io/channel.c:114
-> #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>,
-> buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at
-> migration/qemu-file-channel.c:78
-> #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at
-> migration/qemu-file.c:295
-> #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden,
-> address@hidden) at migration/qemu-file.c:555
-> #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at
-> migration/qemu-file.c:568
-> #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at
-> migration/qemu-file.c:648
-> #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800,
-> address@hidden) at migration/colo.c:244
-> #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized out>,
-> address@hidden, address@hidden)
-> at migration/colo.c:264
-> #9 0x00007f658e3e740e in colo_process_incoming_thread
-> (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577
-> #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0
-> #11 0x00007f65881983ed in clone () from /lib64/libc.so.6
->
-> (gdb) p ioc->name
-> $2 = 0x7f658ff7d5c0 "migration-socket-incoming"
-> (gdb) p ioc->features    <-- does not support QIO_CHANNEL_FEATURE_SHUTDOWN
-> $3 = 0
->
-> (gdb) bt
-> #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90,
-> condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137
-> #1 0x00007fdcc6966350 in
g_main_dispatch (context=<optimized out>) at
-> gmain.c:3054
-> #2 g_main_context_dispatch (context=<optimized out>,
-> address@hidden) at gmain.c:3630
-> #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213
-> #4 os_host_main_loop_wait (timeout=<optimized out>) at
-> util/main-loop.c:258
-> #5 main_loop_wait (address@hidden) at
-> util/main-loop.c:506
-> #6 0x00007fdccb526187 in main_loop () at vl.c:1898
-> #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
-> out>) at vl.c:4709
->
-> (gdb) p ioc->features
-> $1 = 6
-> (gdb) p ioc->name
-> $2 = 0x7fdcce1b1ab0 "migration-socket-listener"
->
-> Maybe socket_accept_incoming_migration should
-> call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)??
->
-> thank you.
->
-> ----- Original Message -----
-> From: address@hidden
-> To: address@hidden
-> Cc: address@hidden@huawei.com
-> Date: 2017-03-16 14:46
-> Subject: Re: [Qemu-devel] COLO failover hang
->
-> On 03/15/2017 05:06 PM, wangguang wrote:
-> > I am testing the QEMU COLO feature described here [QEMU
-> > Wiki](http://wiki.qemu-project.org/Features/COLO).
-> >
-> > When the Primary Node panics, the Secondary Node qemu hangs,
-> > hangs at recvmsg in qio_channel_socket_readv.
-> > And I ran { 'execute': 'nbd-server-stop' } and { "execute":
-> > "x-colo-lost-heartbeat" } in the Secondary VM's
-> > monitor; the Secondary Node qemu still hangs at recvmsg.
-> >
-> > I found that COLO in qemu is not complete yet.
-> > Does COLO have any development plan?
->
-> Yes, we are developing. You can see some of the patches we are pushing.
->
-> > Has anyone ever run it successfully? Any help is appreciated!
->
-> Our internal version can run it successfully.
-> For the failover details you can ask Zhang Hailiang for help.
-> Next time if you have any questions about COLO,
-> please cc me and zhanghailiang address@hidden
->
-> Thanks
-> Zhang Chen
->
-> >
-> > centos7.2+qemu2.7.50
-> > (gdb) bt
-> > #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0
-> > #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized out>,
-> > iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, errp=0x0) at
-> > io/channel-socket.c:497
-> > #2 0x00007f3e03329472 in qio_channel_read (address@hidden,
-> > address@hidden "", address@hidden,
-> > address@hidden) at io/channel.c:97
-> > #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>,
-> > buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at
-> > migration/qemu-file-channel.c:78
-> > #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at
-> > migration/qemu-file.c:257
-> > #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden,
-> > address@hidden) at migration/qemu-file.c:510
-> > #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at
-> > migration/qemu-file.c:523
-> > #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at
-> > migration/qemu-file.c:603
-> > #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00,
-> > address@hidden) at migration/colo.c:215
-> > #9 0x00007f3e0327250d in colo_wait_handle_message (errp=0x7f3d62bfaa48,
-> > checkpoint_request=<synthetic pointer>, f=<optimized out>) at
-> > migration/colo.c:546
-> > #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at
-> > migration/colo.c:649
-> > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0
-> > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6
-> >
-> > --
-> > View this message in context:
-> > http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> > Sent from the Developer mailing list archive at Nabble.com.
-> >
->
-> --
-> Thanks
-> Zhang Chen
-
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-* Hailiang Zhang (address@hidden) wrote:
-> On 2017/3/21 19:56, Dr. David Alan Gilbert wrote:
-> > * Hailiang Zhang (address@hidden) wrote:
-> > > Hi,
-> > >
-> > > Thanks for reporting this; I confirmed it in my test, and it is a bug.
-> > >
-> > > Though we tried to call qemu_file_shutdown() to shut down the related fd,
-> > > in case the COLO thread/incoming thread is stuck in read/write() while
-> > > doing failover, it didn't take effect, because all the fds used by COLO
-> > > (also migration) have been wrapped by a QIO channel, and it will not call
-> > > the shutdown API if we didn't call qio_channel_set_feature(QIO_CHANNEL(sioc),
-> > > QIO_CHANNEL_FEATURE_SHUTDOWN).
-> > >
-> > > Cc: Dr. David Alan Gilbert <address@hidden>
-> > >
-> > > I suspect migration cancel has the same problem; it may be stuck in
-> > > write() if we try to cancel migration.
-> > >
-> > > void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
-> > >                                  Error **errp)
-> > > {
-> > >     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing");
-> > >     migration_channel_connect(s, ioc, NULL);
-> > >     ...
-> > >
-> > > We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc),
-> > > QIO_CHANNEL_FEATURE_SHUTDOWN) above, and then in
-> > >
-> > > migrate_fd_cancel()
-> > > {
-> > >     ...
-> > >     if (s->state == MIGRATION_STATUS_CANCELLING && f) {
-> > >         qemu_file_shutdown(f); --> This will not take effect. No ?
-> > >     }
-> > > }
-> >
-> > (cc'd in Daniel Berrange).
-> > I see that we call qio_channel_set_feature(ioc,
-> > QIO_CHANNEL_FEATURE_SHUTDOWN); at the
-> > top of qio_channel_socket_new; so I think that's safe, isn't it?
->
-> Hmm, you are right, this problem only exists for the migration incoming fd,
-> thanks.
-Yes, and I don't think we normally do a cancel on the incoming side of a
-migration.
-
-Dave
-
-> > Dave
-> >
-> > > Thanks,
-> > > Hailiang
-> > >
-> > > On 2017/3/21 16:10, address@hidden wrote:
-> > > > Thank you.
-> > > >
-> > > > I have tested already.
-> > > >
-> > > > When the Primary Node panics, the Secondary Node qemu hangs at the
-> > > > same place.
-> > > >
-> > > > According to http://wiki.qemu-project.org/Features/COLO, killing the
-> > > > Primary Node qemu will not reproduce the problem, but a Primary Node
-> > > > panic can.
-> > > >
-> > > > I think it is because the channel does not support
-> > > > QIO_CHANNEL_FEATURE_SHUTDOWN.
-> > > >
-> > > > When doing failover, channel_shutdown could not shut down the channel,
-> > > > so colo_process_incoming_thread will hang at recvmsg.
-> > > >
-> > > > I tested a patch:
-> > > > [...quoted patch trimmed; identical to the patch above...]
-> > > >
-> > > > My test will not hang any more.
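The failure mode this thread keeps circling back to - qemu_file_shutdown() silently doing nothing when the channel lacks the shutdown feature - can be modeled with a small sketch. The types and names below are simplified stand-ins for illustration, not the real QIO channel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the QIO channel feature logic
 * discussed above -- not QEMU's actual structures. */
enum { FEATURE_SHUTDOWN = 1 << 0 };

typedef struct {
    unsigned features;  /* bitmask of supported channel features */
    bool shut_down;     /* whether shutdown actually took effect */
} Channel;

static void channel_set_feature(Channel *c, unsigned f)
{
    c->features |= f;
}

/* Mirrors the guard described in the thread: without the shutdown
 * feature bit set, the request is refused, and a thread blocked in
 * recvmsg() on that channel stays blocked forever. */
static int channel_shutdown(Channel *c)
{
    if (!(c->features & FEATURE_SHUTDOWN)) {
        return -1;  /* no effect -- this is the COLO failover hang */
    }
    c->shut_down = true;
    return 0;
}
```

This is why the one-line fix above (setting QIO_CHANNEL_FEATURE_SHUTDOWN on the accepted incoming socket) is enough to unblock colo_process_incoming_thread.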
-> -> > > [...remainder of the quoted thread trimmed; it repeats the
-> -> > > migration/socket.c patch, the "Original Message" headers, and the
-> -> > > gdb backtraces already quoted in full above...]
migration/colo.c:649
-> -> > > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0
-> -> > > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6
-> -> > >
-> -> > > --
-> -> > > View this message in context:
-> -> > > http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html
-> -> > > Sent from the Developer mailing list archive at Nabble.com.
-> ->
-> -> --
-> -> Thanks
-> -> Zhang Chen
->
-> --
-> Dr. David Alan Gilbert / address@hidden / Manchester, UK
->
-> .
-
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
diff --git a/results/classifier/02/other/66743673 b/results/classifier/02/other/66743673
deleted file mode 100644
index dffa1fc10..000000000
--- a/results/classifier/02/other/66743673
+++ /dev/null
@@ -1,365 +0,0 @@
-other: 0.967
-semantic: 0.951
-boot: 0.938
-instruction: 0.930
-mistranslation: 0.855
-
-[Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT / CMP_LEG bits
-
-Hi Community,
-
-This email contains 3 bugs that appear to share the same root cause.
-
-[1] We ran into the following warnings when running QEMU v10.0.0 in TCG mode:
-
-qemu-system-x86_64 \
-  -machine q35 \
-  -m 4G -smp 4 \
-  -kernel ./arch/x86/boot/bzImage \
-  -bios /usr/share/ovmf/OVMF.fd \
-  -drive file=~/kernel/rootfs.ext4,index=0,format=raw,media=disk \
-  -drive file=~/kernel/swap.img,index=1,format=raw,media=disk \
-  -nographic \
-  -append 'root=/dev/sda rw resume=/dev/sdb console=ttyS0 nokaslr'
-
-qemu-system-x86_64: warning: TCG doesn't support requested feature:
-CPUID.01H:EDX.ht [bit 28]
-qemu-system-x86_64: warning: TCG doesn't support requested feature:
-CPUID.80000001H:ECX.cmp-legacy [bit 1]
-(repeats 4 times, once per vCPU)
-
-Tracing the history shows that commit c6bd2dd63420 "i386/cpu: Set up CPUID_HT in
-x86_cpu_expand_features() instead of cpu_x86_cpuid()" is what introduced the
-warnings.
-
-Since that commit, TCG unconditionally advertises HTT (CPUID 1 EDX[28]) and
-CMP_LEG (CPUID 8000_0001 ECX[1]). Because TCG itself has no SMT support, these
-bits trigger the warnings above.
-
-[2] Also, Zhao pointed me to a similar report on GitLab:
-https://gitlab.com/qemu-project/qemu/-/issues/2894
-The symptoms there look identical to what we're seeing.
-
-By convention we file one issue per email, but these two appear to share the
-same root cause, so I'm describing them together here.
-
-[3] My colleague Alan noticed what appears to be a related problem: if we launch
-a guest with '-cpu <model>,-ht --enable-kvm', which means explicitly removing
-the ht flag, the guest still reports HT (cat /proc/cpuinfo in a Linux guest) as
-enabled. In other words, under KVM the ht bit seems to be forced on even when
-the user tries to disable it.
-
-Best regards,
-Ewan
-
-On 4/29/25 11:02 AM, Ewan Hai wrote:
-Hi Community,
-
-This email contains 3 bugs that appear to share the same root cause.
-[...quoted bug report and command line trimmed; identical to the message
-above...]
-
-XiaoYao reminded me that issue [3] stems from a different patch. Please
-ignore it for now; I'll start a separate thread to discuss that one
-independently.
-
-Best regards,
-Ewan
-
-On 4/29/2025 11:02 AM, Ewan Hai wrote:
-Hi Community,
-
-This email contains 3 bugs that appear to share the same root cause.
-[...quoted bug report trimmed; identical to the message above...]
-
-It was caused by my two patches. I think the fix can be as follows.
-If no objection from the community, I can submit the formal patch.
-
-diff --git a/target/i386/cpu.c b/target/i386/cpu.c
-index 1f970aa4daa6..fb95aadd6161 100644
---- a/target/i386/cpu.c
-+++ b/target/i386/cpu.c
-@@ -776,11 +776,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
-           CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC | CPUID_SEP | \
-           CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV | CPUID_PAT | \
-           CPUID_PSE36 | CPUID_CLFLUSH | CPUID_ACPI | CPUID_MMX | \
--          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE)
-+          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE | \
-+          CPUID_HT)
-           /* partly implemented:
-           CPUID_MTRR, CPUID_MCA, CPUID_CLFLUSH (needed for Win64) */
-           /* missing:
--          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_HT, CPUID_TM, CPUID_PBE */
-+          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_TM, CPUID_PBE */
-
-@@ -848,7 +849,8 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
-
- #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
-           CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A | \
--          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES)
-+          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES | \
-+          CPUID_EXT3_CMP_LEG)
-
- #define TCG_EXT4_FEATURES 0
-
-> [3] My colleague Alan noticed what appears to be a related problem: if
-> we launch a guest with '-cpu <model>,-ht --enable-kvm', which means
-> explicitly removing the ht flag, the guest still reports HT (cat
-> /proc/cpuinfo in a Linux guest) as enabled. In other words, under KVM the
-> ht bit seems to be forced on even when the user tries to disable it.
-
-This has been the behavior of QEMU for many years, not some regression
-introduced by my patches. We can discuss how to address it separately.
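The warning path the patch above silences is a simple set difference: the bits that warn are the requested feature bits not present in the accelerator's supported mask, so adding CPUID_HT to TCG_FEATURES (and CPUID_EXT3_CMP_LEG to TCG_EXT3_FEATURES) makes the difference empty. A minimal model, with bit positions taken from the CPUID definitions quoted in this thread and the filtering helper being a simplification of QEMU's feature-filtering step (not its real signature):

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_HT           (1u << 28)  /* CPUID.01H:EDX.ht */
#define CPUID_EXT3_CMP_LEG (1u << 1)   /* CPUID.80000001H:ECX.cmp-legacy */

/* Returns the bits that would each produce one
 * "TCG doesn't support requested feature" warning:
 * requested features that are absent from the supported mask. */
static uint32_t unsupported_bits(uint32_t requested, uint32_t supported)
{
    return requested & ~supported;
}
```

With the old TCG_FEATURES mask (no CPUID_HT), unsupported_bits() is nonzero once topology setup forces HT on, which is exactly the per-vCPU warning the report shows; with the extended mask the delta is zero and the warning disappears.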
-Best regards,
-Ewan
-
-On Tue, Apr 29, 2025 at 01:55:59PM +0800, Xiaoyao Li wrote:
-> Date: Tue, 29 Apr 2025 13:55:59 +0800
-> From: Xiaoyao Li <xiaoyao.li@intel.com>
-> Subject: Re: [Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT /
-> CMP_LEG bits
->
-> [...quoted bug report trimmed; identical to the message above...]
->
-> It was caused by my two patches. I think the fix can be as follows.
-> If no objection from the community, I can submit the formal patch.
->
-> [...quoted fix patch trimmed; identical to the patch above...]
-
-This fix is fine for me...at least from the SDM, HTT depends on topology and
-it should exist when the user sets "-smp 4".
-
-> > [3] My colleague Alan noticed what appears to be a related problem: if
-> > we launch a guest with '-cpu <model>,-ht --enable-kvm', which means
-> > explicitly removing the ht flag, the guest still reports HT (cat
-> > /proc/cpuinfo in a Linux guest) as enabled. In other words, under KVM
-> > the ht bit seems to be forced on even when the user tries to disable it.
->
-> XiaoYao reminded me that issue [3] stems from a different patch.
Please
-> ignore it for now; I'll start a separate thread to discuss that one
-> independently.
-
-I haven't found any other thread :-).
-
-By the way, just curious, in what cases do you need to disable the HT
-flag? "-smp 4" means 4 cores with 1 thread per core; is that not
-enough?
-
-As for the "-ht" behavior, I'm also unsure whether this should be fixed
-or not - one possible consideration is whether "-ht" would be useful.
-
-On 5/8/25 5:04 PM, Zhao Liu wrote:
-> > [3] My colleague Alan noticed what appears to be a related problem
-> > [...trimmed; identical to the message above...]
-> > XiaoYao reminded me that issue [3] stems from a different patch. Please
-> > ignore it for now; I'll start a separate thread to discuss that one
-> > independently.
->
-> I haven't found any other thread :-).
-
-Please refer to
-https://lore.kernel.org/all/db6ae3bb-f4e5-4719-9beb-623fcff56af2@zhaoxin.com/ .
-
-> By the way, just curious, in what cases do you need to disable the HT
-> flag? "-smp 4" means 4 cores with 1 thread per core; is that not
-> enough?
->
-> As for the "-ht" behavior, I'm also unsure whether this should be fixed
-> or not - one possible consideration is whether "-ht" would be useful.
-
-I wasn't trying to target any specific use case; using "-ht" was simply a way to
-check how the ht feature behaves under both KVM and TCG. There's no special
-workload behind it; I just wanted to confirm that the flag is respected (or not)
-in each mode.
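Issue [3] ("-ht" being ignored under KVM) boils down to an ordering hazard: a bit the user subtracts can be ORed back in by a later, unconditional topology step. The sketch below is a toy model of that ordering only; the function and field names are illustrative and are not QEMU's actual feature-expansion code:

```c
#include <assert.h>
#include <stdint.h>

#define FEAT_HT (1u << 28)  /* CPUID.01H:EDX.ht */

/* Illustrative only: first honor the user's "-ht" subtraction, then run
 * a later topology step that forces HT whenever more than one core is
 * configured -- silently undoing the user's choice. */
static uint32_t expand_features(uint32_t base, uint32_t user_minus,
                                int nr_cores)
{
    uint32_t feats = base & ~user_minus;  /* "-ht" is honored here ... */
    if (nr_cores > 1) {
        feats |= FEAT_HT;                 /* ... and re-forced here */
    }
    return feats;
}
```

If this is indeed the shape of the problem, fixing it means either warning when a forced bit overrides an explicit "-feature", or letting the explicit subtraction win.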
-
diff --git a/results/classifier/02/other/68897003 b/results/classifier/02/other/68897003
deleted file mode 100644
index 889461974..000000000
--- a/results/classifier/02/other/68897003
+++ /dev/null
@@ -1,717 +0,0 @@
-other: 0.714
-semantic: 0.671
-instruction: 0.641
-boot: 0.569
-mistranslation: 0.535
-
-[Qemu-devel] [BUG] VM abort after migration
-
-Hi guys,
-
-We found a qemu core in our testing environment: the assertion
-'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered, and
-bus->irq_count[i] was '-1'.
-
-Through analysis, it happened after VM migration and we think
-it was caused by the following sequence:
-
-*Migration Source*
-1. save bus pci.0 state, including irq_count[x] ( =0 , old )
-2. save E1000:
-    e1000_pre_save
-     e1000_mit_timer
-      set_interrupt_cause
-       pci_set_irq --> update pci_dev->irq_state to 1 and
-                       update bus->irq_count[x] to 1 ( new )
-    the irq_state is sent to the dest.
-
-*Migration Dest*
-1. Receive the irq_count[x] of pci.0 as 0, but the irq_state of e1000 as 1.
-2. If the e1000 needs to change its irq line, it calls pci_irq_handler();
-   the irq_state may change to 0 and bus->irq_count[x] will become
-   -1 in this situation.
-3. On VM reboot the assertion will then be triggered.
-
-We also found some people who faced a similar problem:
-[1] https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html
-[2] https://bugs.launchpad.net/qemu/+bug/1702621
-
-Are there any patches to fix this problem?
-Can we save the pcibus state after all the pci devices are saved?
-
-Thanks,
-Longpeng(Mike)
-
-* longpeng (address@hidden) wrote:
-> Hi guys,
->
-> We found a qemu core in our testing environment: the assertion
-> 'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered, and
-> bus->irq_count[i] was '-1'.
->
-> Through analysis, it happened after VM migration and we think
-> it was caused by the following sequence:
->
-> *Migration Source*
-> 1. save bus pci.0 state, including irq_count[x] ( =0 , old )
-> 2.
save E1000: -> -e1000_pre_save -> -e1000_mit_timer -> -set_interrupt_cause -> -pci_set_irq --> update pci_dev->irq_state to 1 and -> -update bus->irq_count[x] to 1 ( new ) -> -the irq_state sent to dest. -> -> -*Migration Dest* -> -1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is 1. -> -2. If the e1000 need change irqline , it would call to pci_irq_handler(), -> -the irq_state maybe change to 0 and bus->irq_count[x] will become -> --1 in this situation. -> -3. do VM reboot then the assertion will be triggered. -> -> -We also found some guys faced the similar problem: -> -[1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -> -[2] -https://bugs.launchpad.net/qemu/+bug/1702621 -> -> -Is there some patches to fix this problem ? -I don't remember any. - -> -Can we save pcibus state after all the pci devs are saved ? -Does this problem only happen with e1000? I think so. -If it's only e1000 I think we should fix it - I think once the VM is -stopped for doing the device migration it shouldn't be raising -interrupts. - -Dave - -> -Thanks, -> -Longpeng(Mike) --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On 2019/7/8 ä¸å5:47, Dr. David Alan Gilbert wrote: -* longpeng (address@hidden) wrote: -Hi guys, - -We found a qemu core in our testing environment, the assertion -'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -the bus->irq_count[i] is '-1'. - -Through analysis, it was happened after VM migration and we think -it was caused by the following sequence: - -*Migration Source* -1. save bus pci.0 state, including irq_count[x] ( =0 , old ) -2. save E1000: - e1000_pre_save - e1000_mit_timer - set_interrupt_cause - pci_set_irq --> update pci_dev->irq_state to 1 and - update bus->irq_count[x] to 1 ( new ) - the irq_state sent to dest. - -*Migration Dest* -1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is 1. -2. 
If the e1000 need change irqline , it would call to pci_irq_handler(), - the irq_state maybe change to 0 and bus->irq_count[x] will become - -1 in this situation. -3. do VM reboot then the assertion will be triggered. - -We also found some guys faced the similar problem: -[1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -[2] -https://bugs.launchpad.net/qemu/+bug/1702621 -Is there some patches to fix this problem ? -I don't remember any. -Can we save pcibus state after all the pci devs are saved ? -Does this problem only happen with e1000? I think so. -If it's only e1000 I think we should fix it - I think once the VM is -stopped for doing the device migration it shouldn't be raising -interrupts. -I wonder maybe we can simply fix this by no setting ICS on pre_save() -but scheduling mit timer unconditionally in post_load(). -Thanks -Dave -Thanks, -Longpeng(Mike) --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -å¨ 2019/7/10 11:25, Jason Wang åé: -> -> -On 2019/7/8 ä¸å5:47, Dr. David Alan Gilbert wrote: -> -> * longpeng (address@hidden) wrote: -> ->> Hi guys, -> ->> -> ->> We found a qemu core in our testing environment, the assertion -> ->> 'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -> ->> the bus->irq_count[i] is '-1'. -> ->> -> ->> Through analysis, it was happened after VM migration and we think -> ->> it was caused by the following sequence: -> ->> -> ->> *Migration Source* -> ->> 1. save bus pci.0 state, including irq_count[x] ( =0 , old ) -> ->> 2. save E1000: -> ->>    e1000_pre_save -> ->>     e1000_mit_timer -> ->>      set_interrupt_cause -> ->>       pci_set_irq --> update pci_dev->irq_state to 1 and -> ->>                   update bus->irq_count[x] to 1 ( new ) -> ->>     the irq_state sent to dest. -> ->> -> ->> *Migration Dest* -> ->> 1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is 1. -> ->> 2. 
If the e1000 need change irqline , it would call to pci_irq_handler(), -> ->>   the irq_state maybe change to 0 and bus->irq_count[x] will become -> ->>   -1 in this situation. -> ->> 3. do VM reboot then the assertion will be triggered. -> ->> -> ->> We also found some guys faced the similar problem: -> ->> [1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -> ->> [2] -https://bugs.launchpad.net/qemu/+bug/1702621 -> ->> -> ->> Is there some patches to fix this problem ? -> -> I don't remember any. -> -> -> ->> Can we save pcibus state after all the pci devs are saved ? -> -> Does this problem only happen with e1000? I think so. -> -> If it's only e1000 I think we should fix it - I think once the VM is -> -> stopped for doing the device migration it shouldn't be raising -> -> interrupts. -> -> -> -I wonder maybe we can simply fix this by no setting ICS on pre_save() but -> -scheduling mit timer unconditionally in post_load(). -> -I also think this is a bug of e1000 because we find more cores with the same -frame thease days. - -I'm not familiar with e1000 so hope someone could fix it, thanks. :) - -> -Thanks -> -> -> -> -> -> Dave -> -> -> ->> Thanks, -> ->> Longpeng(Mike) -> -> -- -> -> Dr. David Alan Gilbert / address@hidden / Manchester, UK -> -> -. -> --- -Regards, -Longpeng(Mike) - -On 2019/7/10 ä¸å11:36, Longpeng (Mike) wrote: -å¨ 2019/7/10 11:25, Jason Wang åé: -On 2019/7/8 ä¸å5:47, Dr. David Alan Gilbert wrote: -* longpeng (address@hidden) wrote: -Hi guys, - -We found a qemu core in our testing environment, the assertion -'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -the bus->irq_count[i] is '-1'. - -Through analysis, it was happened after VM migration and we think -it was caused by the following sequence: - -*Migration Source* -1. save bus pci.0 state, including irq_count[x] ( =0 , old ) -2. 
save E1000: -    e1000_pre_save -     e1000_mit_timer -      set_interrupt_cause -       pci_set_irq --> update pci_dev->irq_state to 1 and -                   update bus->irq_count[x] to 1 ( new ) -     the irq_state sent to dest. - -*Migration Dest* -1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is 1. -2. If the e1000 need change irqline , it would call to pci_irq_handler(), -   the irq_state maybe change to 0 and bus->irq_count[x] will become -   -1 in this situation. -3. do VM reboot then the assertion will be triggered. - -We also found some guys faced the similar problem: -[1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -[2] -https://bugs.launchpad.net/qemu/+bug/1702621 -Is there some patches to fix this problem ? -I don't remember any. -Can we save pcibus state after all the pci devs are saved ? -Does this problem only happen with e1000? I think so. -If it's only e1000 I think we should fix it - I think once the VM is -stopped for doing the device migration it shouldn't be raising -interrupts. -I wonder maybe we can simply fix this by no setting ICS on pre_save() but -scheduling mit timer unconditionally in post_load(). -I also think this is a bug of e1000 because we find more cores with the same -frame thease days. - -I'm not familiar with e1000 so hope someone could fix it, thanks. :) -Draft a path in attachment, please test. - -Thanks -Thanks -Dave -Thanks, -Longpeng(Mike) --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK -. -0001-e1000-don-t-raise-interrupt-in-pre_save.patch -Description: -Text Data - -å¨ 2019/7/10 11:57, Jason Wang åé: -> -> -On 2019/7/10 ä¸å11:36, Longpeng (Mike) wrote: -> -> å¨ 2019/7/10 11:25, Jason Wang åé: -> ->> On 2019/7/8 ä¸å5:47, Dr. 
David Alan Gilbert wrote: -> ->>> * longpeng (address@hidden) wrote: -> ->>>> Hi guys, -> ->>>> -> ->>>> We found a qemu core in our testing environment, the assertion -> ->>>> 'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -> ->>>> the bus->irq_count[i] is '-1'. -> ->>>> -> ->>>> Through analysis, it was happened after VM migration and we think -> ->>>> it was caused by the following sequence: -> ->>>> -> ->>>> *Migration Source* -> ->>>> 1. save bus pci.0 state, including irq_count[x] ( =0 , old ) -> ->>>> 2. save E1000: -> ->>>>     e1000_pre_save -> ->>>>      e1000_mit_timer -> ->>>>       set_interrupt_cause -> ->>>>        pci_set_irq --> update pci_dev->irq_state to 1 and -> ->>>>                    update bus->irq_count[x] to 1 ( new ) -> ->>>>      the irq_state sent to dest. -> ->>>> -> ->>>> *Migration Dest* -> ->>>> 1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is -> ->>>> 1. -> ->>>> 2. If the e1000 need change irqline , it would call to pci_irq_handler(), -> ->>>>    the irq_state maybe change to 0 and bus->irq_count[x] will become -> ->>>>    -1 in this situation. -> ->>>> 3. do VM reboot then the assertion will be triggered. -> ->>>> -> ->>>> We also found some guys faced the similar problem: -> ->>>> [1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -> ->>>> [2] -https://bugs.launchpad.net/qemu/+bug/1702621 -> ->>>> -> ->>>> Is there some patches to fix this problem ? -> ->>> I don't remember any. -> ->>> -> ->>>> Can we save pcibus state after all the pci devs are saved ? -> ->>> Does this problem only happen with e1000? I think so. -> ->>> If it's only e1000 I think we should fix it - I think once the VM is -> ->>> stopped for doing the device migration it shouldn't be raising -> ->>> interrupts. -> ->> -> ->> I wonder maybe we can simply fix this by no setting ICS on pre_save() but -> ->> scheduling mit timer unconditionally in post_load(). 
-> ->> -> -> I also think this is a bug of e1000 because we find more cores with the same -> -> frame thease days. -> -> -> -> I'm not familiar with e1000 so hope someone could fix it, thanks. :) -> -> -> -> -Draft a path in attachment, please test. -> -Thanks. We'll test it for a few weeks and then give you the feedback. :) - -> -Thanks -> -> -> ->> Thanks -> ->> -> ->> -> ->>> Dave -> ->>> -> ->>>> Thanks, -> ->>>> Longpeng(Mike) -> ->>> -- -> ->>> Dr. David Alan Gilbert / address@hidden / Manchester, UK -> ->> . -> ->> --- -Regards, -Longpeng(Mike) - -å¨ 2019/7/10 11:57, Jason Wang åé: -> -> -On 2019/7/10 ä¸å11:36, Longpeng (Mike) wrote: -> -> å¨ 2019/7/10 11:25, Jason Wang åé: -> ->> On 2019/7/8 ä¸å5:47, Dr. David Alan Gilbert wrote: -> ->>> * longpeng (address@hidden) wrote: -> ->>>> Hi guys, -> ->>>> -> ->>>> We found a qemu core in our testing environment, the assertion -> ->>>> 'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -> ->>>> the bus->irq_count[i] is '-1'. -> ->>>> -> ->>>> Through analysis, it was happened after VM migration and we think -> ->>>> it was caused by the following sequence: -> ->>>> -> ->>>> *Migration Source* -> ->>>> 1. save bus pci.0 state, including irq_count[x] ( =0 , old ) -> ->>>> 2. save E1000: -> ->>>>     e1000_pre_save -> ->>>>      e1000_mit_timer -> ->>>>       set_interrupt_cause -> ->>>>        pci_set_irq --> update pci_dev->irq_state to 1 and -> ->>>>                    update bus->irq_count[x] to 1 ( new ) -> ->>>>      the irq_state sent to dest. -> ->>>> -> ->>>> *Migration Dest* -> ->>>> 1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is -> ->>>> 1. -> ->>>> 2. If the e1000 need change irqline , it would call to pci_irq_handler(), -> ->>>>    the irq_state maybe change to 0 and bus->irq_count[x] will become -> ->>>>    -1 in this situation. -> ->>>> 3. do VM reboot then the assertion will be triggered. 
-> ->>>> -> ->>>> We also found some guys faced the similar problem: -> ->>>> [1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -> ->>>> [2] -https://bugs.launchpad.net/qemu/+bug/1702621 -> ->>>> -> ->>>> Is there some patches to fix this problem ? -> ->>> I don't remember any. -> ->>> -> ->>>> Can we save pcibus state after all the pci devs are saved ? -> ->>> Does this problem only happen with e1000? I think so. -> ->>> If it's only e1000 I think we should fix it - I think once the VM is -> ->>> stopped for doing the device migration it shouldn't be raising -> ->>> interrupts. -> ->> -> ->> I wonder maybe we can simply fix this by no setting ICS on pre_save() but -> ->> scheduling mit timer unconditionally in post_load(). -> ->> -> -> I also think this is a bug of e1000 because we find more cores with the same -> -> frame thease days. -> -> -> -> I'm not familiar with e1000 so hope someone could fix it, thanks. :) -> -> -> -> -Draft a path in attachment, please test. -> -Hi Jason, - -We've tested the patch for about two weeks, everything went well, thanks! - -Feel free to add my: -Reported-and-tested-by: Longpeng <address@hidden> - -> -Thanks -> -> -> ->> Thanks -> ->> -> ->> -> ->>> Dave -> ->>> -> ->>>> Thanks, -> ->>>> Longpeng(Mike) -> ->>> -- -> ->>> Dr. David Alan Gilbert / address@hidden / Manchester, UK -> ->> . -> ->> --- -Regards, -Longpeng(Mike) - -On 2019/7/27 ä¸å2:10, Longpeng (Mike) wrote: -å¨ 2019/7/10 11:57, Jason Wang åé: -On 2019/7/10 ä¸å11:36, Longpeng (Mike) wrote: -å¨ 2019/7/10 11:25, Jason Wang åé: -On 2019/7/8 ä¸å5:47, Dr. David Alan Gilbert wrote: -* longpeng (address@hidden) wrote: -Hi guys, - -We found a qemu core in our testing environment, the assertion -'assert(bus->irq_count[i] == 0)' in pcibus_reset() was triggered and -the bus->irq_count[i] is '-1'. - -Through analysis, it was happened after VM migration and we think -it was caused by the following sequence: - -*Migration Source* -1. 
save bus pci.0 state, including irq_count[x] ( =0 , old ) -2. save E1000: -     e1000_pre_save -      e1000_mit_timer -       set_interrupt_cause -        pci_set_irq --> update pci_dev->irq_state to 1 and -                    update bus->irq_count[x] to 1 ( new ) -      the irq_state sent to dest. - -*Migration Dest* -1. Receive the irq_count[x] of pci.0 is 0 , but the irq_state of e1000 is 1. -2. If the e1000 need change irqline , it would call to pci_irq_handler(), -    the irq_state maybe change to 0 and bus->irq_count[x] will become -    -1 in this situation. -3. do VM reboot then the assertion will be triggered. - -We also found some guys faced the similar problem: -[1] -https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02525.html -[2] -https://bugs.launchpad.net/qemu/+bug/1702621 -Is there some patches to fix this problem ? -I don't remember any. -Can we save pcibus state after all the pci devs are saved ? -Does this problem only happen with e1000? I think so. -If it's only e1000 I think we should fix it - I think once the VM is -stopped for doing the device migration it shouldn't be raising -interrupts. -I wonder maybe we can simply fix this by no setting ICS on pre_save() but -scheduling mit timer unconditionally in post_load(). -I also think this is a bug of e1000 because we find more cores with the same -frame thease days. - -I'm not familiar with e1000 so hope someone could fix it, thanks. :) -Draft a path in attachment, please test. -Hi Jason, - -We've tested the patch for about two weeks, everything went well, thanks! - -Feel free to add my: -Reported-and-tested-by: Longpeng <address@hidden> -Applied. - -Thanks -Thanks -Thanks -Dave -Thanks, -Longpeng(Mike) --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK -. 
- diff --git a/results/classifier/02/other/70021271 b/results/classifier/02/other/70021271 deleted file mode 100644 index cacc2c918..000000000 --- a/results/classifier/02/other/70021271 +++ /dev/null @@ -1,7449 +0,0 @@ -other: 0.963 -semantic: 0.946 -mistranslation: 0.929 -instruction: 0.880 -boot: 0.872 - -[Qemu-devel] [BUG] Unassigned mem write during pci device hot-plug - -Hi all, - -In our test, we configured a VM with several pci-bridges and a virtio-net nic -attached to bus 4. -After the VM starts up, we ping this nic from the host to judge whether it is working -normally. Then, we hot add pci devices to this VM on bus 0. -We found the virtio-net NIC on bus 4 occasionally stops working (cannot connect), -as its kick to the virtio backend fails with the error below: - Unassigned mem write 00000000fc803004 = 0x1 - -memory-region: pci_bridge_pci - 0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci - 00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci - 00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common - 00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr - 00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device - 00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify <- io -mem unassigned - … - -We caught an exceptional address change while this problem happened, shown as -follows: -Before pci_bridge_update_mappings: - 00000000fc000000-00000000fc1fffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fc000000-00000000fc1fffff - 00000000fc200000-00000000fc3fffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fc200000-00000000fc3fffff - 00000000fc400000-00000000fc5fffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fc400000-00000000fc5fffff - 00000000fc600000-00000000fc7fffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fc600000-00000000fc7fffff - 00000000fc800000-00000000fc9fffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci
00000000fc800000-00000000fc9fffff <- correct Address Space - 00000000fca00000-00000000fcbfffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fca00000-00000000fcbfffff - 00000000fcc00000-00000000fcdfffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fcc00000-00000000fcdfffff - 00000000fce00000-00000000fcffffff (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci 00000000fce00000-00000000fcffffff - -After pci_bridge_update_mappings: - 00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fda00000-00000000fdbfffff - 00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fdc00000-00000000fddfffff - 00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fde00000-00000000fdffffff - 00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fe000000-00000000fe1fffff - 00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fe200000-00000000fe3fffff - 00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fe400000-00000000fe5fffff - 00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fe600000-00000000fe7fffff - 00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem -@pci_bridge_pci 00000000fe800000-00000000fe9fffff - fffffffffc800000-fffffffffc800000 (prio 1, RW): alias pci_bridge_pref_mem -@pci_bridge_pci fffffffffc800000-fffffffffc800000 <- Exceptional Address Space - -We have figured out why this address becomes this value: according to the pci -spec, a pci driver can get a BAR's size by first writing 0xffffffff to -the pci register, and then reading back the value from this register.
-We don't handle this value specially while processing the pci config write in qemu; the -function call stack is: -Pci_bridge_dev_write_config --> pci_bridge_write_config --> pci_default_write_config (we update the config[address] value here to -fffffffffc800000, which should be 0xfc800000 ) --> pci_bridge_update_mappings - ->pci_bridge_region_del(br, br->windows); --> pci_bridge_region_init - -->pci_bridge_init_alias (here, in pci_bridge_get_base, we use the wrong value -fffffffffc800000) - -> -memory_region_transaction_commit - -So, as we can see, qemu uses the wrong base address to update the memory -regions. Although the base address is restored to -the correct value once the pci driver in the VM writes the original value back, the -virtio NIC on bus 4 may still send net packets concurrently with -the wrong memory region address. - -We have tried skipping the memory region update in qemu when a pci -write of the value 0xffffffff is detected, and it does work, but -this does not seem to be a clean fix. - -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -index b2e50c3..84b405d 100644 ---- a/hw/pci/pci_bridge.c -+++ b/hw/pci/pci_bridge.c -@@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d, - pci_default_write_config(d, address, val, len); -- if (ranges_overlap(address, len, PCI_COMMAND, 2) || -+ if ( (val != 0xffffffff) && -+ (ranges_overlap(address, len, PCI_COMMAND, 2) || - /* io base/limit */ - ranges_overlap(address, len, PCI_IO_BASE, 2) || -@@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d, - ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || - /* vga enable */ -- ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -+ ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) { - pci_bridge_update_mappings(s); - } - -Thanks, -Xu - -On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote: -> -Hi all, -> -> -> -> -In our test, we configured VM with several pci-bridges and a virtio-net nic -> -been attached with bus 4, -> -> -After VM is startup, We ping this nic
from host to judge if it is working -normally. Then, we hot add pci devices to this VM with bus 0. -> -> -We found the virtio-net NIC in bus 4 is not working (can not connect) -> -occasionally, as it kick virtio backend failure with error below: -> -> -Unassigned mem write 00000000fc803004 = 0x1 -Thanks for the report. Which guest was used to produce this problem? - --- -MST - -On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote: -> -> Hi all, -> -> -> -> -> -> -> -> In our test, we configured VM with several pci-bridges and a -> -> virtio-net nic been attached with bus 4, -> -> -> -> After VM is startup, We ping this nic from host to judge if it is -> -> working normally. Then, we hot add pci devices to this VM with bus 0. -> -> -> -> We found the virtio-net NIC in bus 4 is not working (can not connect) -> -> occasionally, as it kick virtio backend failure with error below: -> -> -> -> Unassigned mem write 00000000fc803004 = 0x1 -> -> -Thanks for the report. Which guest was used to produce this problem? -> -> --- -> -MST -I was seeing this problem when I hot-plugged a VFIO device into a CentOS 7.4 guest; -after that I compiled the latest Linux kernel and it also showed this problem. - -Thanks, -Xu - -On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote: -> -Hi all, -> -> -In our test, we configured VM with several pci-bridges and a virtio-net nic -> -been attached with bus 4, -> -> -After VM is startup, We ping this nic from host to judge if it is working -> -normally. Then, we hot add pci devices to this VM with bus 0.
-> -> -We found the virtio-net NIC in bus 4 is not working (can not connect) -> -occasionally, as it kick virtio backend failure with error below: -> -> -Unassigned mem write 00000000fc803004 = 0x1 -> -> -> -> -memory-region: pci_bridge_pci -> -> -0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci -> -> -00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci -> -> -00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common -> -> -00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr -> -> -00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device -> -> -00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify <- io -> -mem unassigned -> -> -⦠-> -> -> -> -We caught an exceptional address changing while this problem happened, show as -> -follow: -> -> -Before pci_bridge_update_mappingsï¼ -> -> -00000000fc000000-00000000fc1fffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fc000000-00000000fc1fffff -> -> -00000000fc200000-00000000fc3fffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fc200000-00000000fc3fffff -> -> -00000000fc400000-00000000fc5fffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fc400000-00000000fc5fffff -> -> -00000000fc600000-00000000fc7fffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fc600000-00000000fc7fffff -> -> -00000000fc800000-00000000fc9fffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fc800000-00000000fc9fffff <- correct Adress Spce -> -> -00000000fca00000-00000000fcbfffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fca00000-00000000fcbfffff -> -> -00000000fcc00000-00000000fcdfffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fcc00000-00000000fcdfffff -> -> -00000000fce00000-00000000fcffffff (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci 00000000fce00000-00000000fcffffff -> -> -> -> -After 
pci_bridge_update_mappingsï¼ -> -> -00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fda00000-00000000fdbfffff -> -> -00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fdc00000-00000000fddfffff -> -> -00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fde00000-00000000fdffffff -> -> -00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fe000000-00000000fe1fffff -> -> -00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fe200000-00000000fe3fffff -> -> -00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fe400000-00000000fe5fffff -> -> -00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fe600000-00000000fe7fffff -> -> -00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem -> -@pci_bridge_pci 00000000fe800000-00000000fe9fffff -> -> -fffffffffc800000-fffffffffc800000 (prio 1, RW): alias -> -pci_bridge_pref_mem -> -@pci_bridge_pci fffffffffc800000-fffffffffc800000 <- Exceptional Adress -> -Space -This one is empty though right? - -> -> -> -We have figured out why this address becomes this value, according to pci -> -spec, pci driver can get BAR address size by writing 0xffffffff to -> -> -the pci register firstly, and then read back the value from this register. -OK however as you show below the BAR being sized is the BAR -if a bridge. Are you then adding a bridge device by hotplug? 
- - - -> -We didn't handle this value specially while process pci write in qemu, the -> -function call stack is: -> -> -Pci_bridge_dev_write_config -> -> --> pci_bridge_write_config -> -> --> pci_default_write_config (we update the config[address] value here to -> -fffffffffc800000, which should be 0xfc800000 ) -> -> --> pci_bridge_update_mappings -> -> -->pci_bridge_region_del(br, br->windows); -> -> --> pci_bridge_region_init -> -> --> -> -pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong value -> -fffffffffc800000) -> -> --> -> -memory_region_transaction_commit -> -> -> -> -So, as we can see, we use the wrong base address in qemu to update the memory -> -regions, though, we update the base address to -> -> -The correct value after pci driver in VM write the original value back, the -> -virtio NIC in bus 4 may still sends net packets concurrently with -> -> -The wrong memory region address. -> -> -> -> -We have tried to skip the memory region update action in qemu while detect pci -> -write with 0xffffffff value, and it does work, but -> -> -This seems to be not gently. -For sure. But I'm still puzzled as to why does Linux try to -size the BAR of the bridge while a device behind it is -used. - -Can you pls post your QEMU command line? 
- - - -> -> -> -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -> -> -index b2e50c3..84b405d 100644 -> -> ---- a/hw/pci/pci_bridge.c -> -> -+++ b/hw/pci/pci_bridge.c -> -> -@@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d, -> -> -pci_default_write_config(d, address, val, len); -> -> -- if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> -+ if ( (val != 0xffffffff) && -> -> -+ (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> -/* io base/limit */ -> -> -ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> -@@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d, -> -> -ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -> -/* vga enable */ -> -> -- ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> -> -+ ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) { -> -> -pci_bridge_update_mappings(s); -> -> -} -> -> -> -> -Thinks, -> -> -Xu -> - -On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote: -> -> Hi all, -> -> -> -> -> -> -> -> In our test, we configured VM with several pci-bridges and a -> -> virtio-net nic been attached with bus 4, -> -> -> -> After VM is startup, We ping this nic from host to judge if it is -> -> working normally. Then, we hot add pci devices to this VM with bus 0. 
-> > We found the virtio-net NIC on bus 4 is not working (cannot connect)
-> > occasionally, as it kicks the virtio backend with the failure below:
-> >
-> > Unassigned mem write 00000000fc803004 = 0x1
-> >
-> > memory-region: pci_bridge_pci
-> >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
-> >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
-> >       00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common
-> >       00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
-> >       00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device
-> >       00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify  <- io mem unassigned
-> >   ...
-> >
-> > We caught an exceptional address change while this problem happened,
-> > shown as follows:
-> >
-> > Before pci_bridge_update_mappings:
-> >
-> > 00000000fc000000-00000000fc1fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc000000-00000000fc1fffff
-> > 00000000fc200000-00000000fc3fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc200000-00000000fc3fffff
-> > 00000000fc400000-00000000fc5fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc400000-00000000fc5fffff
-> > 00000000fc600000-00000000fc7fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc600000-00000000fc7fffff
-> > 00000000fc800000-00000000fc9fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc800000-00000000fc9fffff  <- correct address space
-> > 00000000fca00000-00000000fcbfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fca00000-00000000fcbfffff
-> > 00000000fcc00000-00000000fcdfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fcc00000-00000000fcdfffff
-> > 00000000fce00000-00000000fcffffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fce00000-00000000fcffffff
-> >
-> > After pci_bridge_update_mappings:
-> >
-> > 00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
-> > 00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
-> > 00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
-> > 00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
-> > 00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
-> > 00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
-> > 00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
-> > 00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
-> > fffffffffc800000-fffffffffc800000 (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci fffffffffc800000-fffffffffc800000  <- exceptional address space
->
-> This one is empty though, right?
->
-> > We have figured out why this address becomes this value: according to the
-> > PCI spec, a PCI driver can get a BAR's size by writing 0xffffffff to the
-> > PCI register first, and then reading the value back from this register.
->
-> OK, however as you show below the BAR being sized is the BAR of a bridge.
-> Are you then adding a bridge device by hotplug?
-
-No, I just simply hot plugged a VFIO device into bus 0. Another interesting
-phenomenon is that if I hot plug the device into another bus, this doesn't
-happen.
-
-> > We didn't handle this value specially while processing the PCI write in
-> > QEMU; the function call stack is:
-> >
-> > pci_bridge_dev_write_config
-> >   -> pci_bridge_write_config
-> >     -> pci_default_write_config (we update the config[address] value here
-> >        to fffffffffc800000, which should be 0xfc800000)
-> >     -> pci_bridge_update_mappings
-> >        -> pci_bridge_region_del(br, br->windows);
-> >        -> pci_bridge_region_init
-> >           -> pci_bridge_init_alias (here pci_bridge_get_base uses the
-> >              wrong value fffffffffc800000)
-> >        -> memory_region_transaction_commit
-> >
-> > So, as we can see, we use the wrong base address in QEMU to update the
-> > memory regions. Though we update the base address to the correct value
-> > after the PCI driver in the VM writes the original value back, the virtio
-> > NIC on bus 4 may still send net packets concurrently with the wrong
-> > memory region address.
-> >
-> > We have tried to skip the memory region update action in QEMU when we
-> > detect a PCI write with the value 0xffffffff, and it does work, but this
-> > does not seem very elegant.
->
-> For sure. But I'm still puzzled as to why Linux tries to size the BAR of
-> the bridge while a device behind it is in use.
->
-> Can you pls post your QEMU command line?
-
-My QEMU command line:
-/root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S -object
-secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-Linux/master-key.aes
--machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
-host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
-size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
-20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024 -numa
-node,nodeid=1,cpus=5-9,mem=1024 -numa node,nodeid=2,cpus=10-14,mem=1024 -numa
-node,nodeid=3,cpus=15-19,mem=1024 -uuid 34a588c7-b0f2-4952-b39c-47fae3411439
--no-user-config -nodefaults -chardev
-socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/monitor.sock,server,nowait
--mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
--global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on -device
-pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
-pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
-pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
-pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
-pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
-piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
-usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
-nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
-virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
-virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
-virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
-virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
-virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
-file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-virtio-disk0,cache=none
--device
-virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
--drive if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
-ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
-tap,fd=35,id=hostnet0 -device
-virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4,addr=0x1
--chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
--device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
-cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
-virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
-
-I am also very curious about this issue; in the Linux kernel code, maybe the
-double check in function pci_bridge_check_ranges triggered this problem.
-
-> > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
-> > index b2e50c3..84b405d 100644
-> > --- a/hw/pci/pci_bridge.c
-> > +++ b/hw/pci/pci_bridge.c
-> > @@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
-> >      pci_default_write_config(d, address, val, len);
-> >
-> > -    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
-> > +    if ((val != 0xffffffff) &&
-> > +        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
-> >          /* io base/limit */
-> >          ranges_overlap(address, len, PCI_IO_BASE, 2) ||
-> > @@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
-> >          ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
-> >          /* vga enable */
-> > -        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
-> > +        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
-> >          pci_bridge_update_mappings(s);
-> >      }
-> >
-> > Thanks,
-> > Xu
-
-On Mon, Dec 10, 2018 at 03:12:53AM +0000, xuyandong wrote:
-> On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
-> > > Hi all,
-> > >
-> > > In our test, we configured a VM with several pci-bridges and a
-> > > virtio-net nic attached to bus 4.
-> > >
-> > > After the VM is up, we ping this nic from the host to judge whether it
-> > > is working normally. Then, we hot add pci devices to this VM on bus 0.
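As an aside, the BAR sizing handshake described in the report (write all-ones, read back, derive the size from the bits that stayed zero) can be sketched as follows. This is an illustrative sketch of the spec-defined arithmetic, not QEMU or kernel code, and `bar_size` is a hypothetical helper name:

```python
BAR_FLAG_MASK = 0xFFFFFFF0  # low 4 bits of a memory BAR encode type flags, not address bits

def bar_size(readback):
    """Derive a 32-bit memory BAR's size from the value read back
    after the guest writes 0xffffffff into the BAR register."""
    return ((~(readback & BAR_FLAG_MASK)) + 1) & 0xFFFFFFFF

# A 16 KiB region such as the virtio-pci BAR at fc800000-fc803fff reads back
# 0xffffc000 (plus flag bits) after the sizing write:
print(hex(bar_size(0xFFFFC004)))  # -> 0x4000 (16 KiB)
```

The key point for this thread is that the all-ones value is *transiently present* in the config register between the sizing write and the restore write, which is exactly the window in which QEMU recomputes the bridge mappings.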
-> > > [...snip: remainder of the quoted report, shown in full above...]
-
-If you can get the stacktrace in Linux when it tries to write this fffff
-value, that would be quite helpful.
-
-
-> [...snip: full quote of the previous messages, shown above...]
->
-> If you can get the stacktrace in Linux when it tries to write this fffff
-> value, that would be quite helpful.
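The exceptional `fffffffffc800000` base seen in the region dump can be reproduced arithmetically: a bridge's 64-bit prefetchable window base is assembled from `PCI_PREF_MEMORY_BASE` (whose upper 12 bits hold address bits 31:20) and `PCI_PREF_BASE_UPPER32` (address bits 63:32). While the guest's sizing write of 0xffffffff sits in `PCI_PREF_BASE_UPPER32`, the assembled base is exactly the bogus value. A minimal sketch, assuming a `PCI_PREF_MEMORY_BASE` value of 0xFC81 for a window at 0xfc800000 (this is an illustration, not QEMU's `pci_bridge_get_base`):

```python
def pref_window_base(pref_base16, base_upper32):
    """Assemble a bridge's 64-bit prefetchable window base from its
    two config registers.

    pref_base16:  16-bit PCI_PREF_MEMORY_BASE; bits 15:4 are address
                  bits 31:20 (low 4 bits are capability flags)
    base_upper32: 32-bit PCI_PREF_BASE_UPPER32; address bits 63:32
    """
    return (base_upper32 << 32) | ((pref_base16 & 0xFFF0) << 16)

print(hex(pref_window_base(0xFC81, 0x00000000)))  # -> 0xfc800000 (correct base)
# During BAR sizing, the guest transiently writes 0xffffffff into
# PCI_PREF_BASE_UPPER32, so any concurrent mapping update computes:
print(hex(pref_window_base(0xFC81, 0xFFFFFFFF)))  # -> 0xfffffffffc800000
```

This matches the thread's observation: the window is briefly remapped to the bogus base, so a virtio kick landing in that window during the race hits unassigned memory.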
-After I add mdelay(100) in function pci_bridge_check_ranges, this phenomenon
-is easier to reproduce; below is my modification to the kernel:
-
-diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
-index cb389277..86e232d 100644
---- a/drivers/pci/setup-bus.c
-+++ b/drivers/pci/setup-bus.c
-@@ -27,7 +27,7 @@
- #include <linux/slab.h>
- #include <linux/acpi.h>
- #include "pci.h"
--
-+#include <linux/delay.h>
- unsigned int pci_flags;
-
- struct pci_dev_resource {
-@@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
- 		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
- 				       0xffffffff);
- 		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
-+		mdelay(100);
-+		printk(KERN_ERR "sleep\n");
-+		dump_stack();
- 		if (!tmp)
- 			b_res[2].flags &= ~IORESOURCE_MEM_64;
- 		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-
-After hot plugging, we get the following log:
-
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:14.0: BAR 0: assigned [mem
-0xc2360000-0xc237ffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:14.0: BAR 3: assigned [mem
-0xc2328000-0xc232bfff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [io
-0xf000-0xffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem
-0xc2800000-0xc29fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [io
-0xe000-0xefff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem
-0xc2600000-0xc27fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:
bridge window [io -0xd000-0xdfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2600000-0xc27fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:16 uefi-linux kernel: pci 
0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2600000-0xc27fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2600000-0xc27fffff] -Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 
11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io 0xd000-0xdfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem 0xc2400000-0xc25fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem 0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io 0xc000-0xcfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem 0xc2000000-0xc21fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: bridge window [io 0xf000-0xffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem 0xc2800000-0xc29fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem 0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: bridge window [io 0xe000-0xefff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem 0xc2600000-0xc27fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem 0xc2d00000-0xc2efffff 64bit pref]
-...snip: the bridge window messages for buses 01/02/03/05 above repeat continuously...
-Dec 11 09:28:18 uefi-linux kernel: sleep
-Dec 11 09:28:18 uefi-linux kernel: CPU: 16 PID: 502 Comm: kworker/u40:1 Not tainted 4.11.0-rc3+ #11
-Dec 11 09:28:18 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:18 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:18 uefi-linux kernel: Call Trace:
-Dec 11 09:28:18 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:18 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:18 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:18 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:18 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:18 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:18 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:18 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:18 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:18 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:18 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:18 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:18 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:18 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:18 uefi-linux kernel: ret_from_fork+0x2c/0x40
-...snip: the "sleep" dump_stack trace (later also via pci_conf1_read) and the bridge window messages repeat through 09:28:20...
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at isa0060/serio1/input0 - driver resynced.
-...snip: the psmouse "lost sync at byte 1" / "driver resynced" pair and the bridge window messages keep repeating, interleaved, through 09:28:21...
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2600000-0xc27fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem 
-0xc2000000-0xc21fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2600000-0xc27fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0: bridge window [io -0xf000-0xffff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2800000-0xc29fffff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0: bridge window [mem -0xc2b00000-0xc2cfffff 64bit pref] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0: bridge window [io -0xe000-0xefff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0: bridge window 
[mem -0xc2600000-0xc27fffff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0: bridge window [mem -0xc2d00000-0xc2efffff 64bit pref] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0: bridge window [io -0xd000-0xdfff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2400000-0xc25fffff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0: bridge window [mem -0xc2f00000-0xc30fffff 64bit pref] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0: bridge window [io -0xc000-0xcfff] -Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0: bridge window [mem -0xc2000000-0xc21fffff] - -> -> -> -> > -> -> > -> -> > -> -> > > -> -> > > -> -> > > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -> -> > > -> -> > > index b2e50c3..84b405d 100644 -> -> > > -> -> > > --- a/hw/pci/pci_bridge.c -> -> > > -> -> > > +++ b/hw/pci/pci_bridge.c -> -> > > -> -> > > @@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d, -> -> > > -> -> > > pci_default_write_config(d, address, val, len); -> -> > > -> -> > > - if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> > > -> -> > > + if ( (val != 0xffffffff) && -> -> > > -> -> > > + (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> > > -> -> > > /* io base/limit */ -> -> > > -> -> > > ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> > > -> -> > > @@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d, -> -> > > -> -> > > ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -> > > -> -> > > /* vga enable */ -> -> > > -> -> > > - ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> -> > > -> -> > > + ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) { -> -> > > -> -> > > pci_bridge_update_mappings(s); -> -> > > -> -> > > } -> -> > > -> -> > > -> -> > > -> -> > > Thinks, -> -> > > -> -> > > Xu -> -> > > - -On Tue, Dec 
11, 2018 at 01:47:37AM +0000, xuyandong wrote:
> On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
> > > > Hi all,
> > > >
> > > > In our test, we configured a VM with several pci-bridges and a
> > > > virtio-net nic attached to bus 4.
> > > >
> > > > After the VM is started up, we ping this nic from the host to judge if it is
> > > > working normally. Then, we hot add pci devices to this VM on bus 0.
> > > >
> > > > We found the virtio-net NIC on bus 4 is not working (can not
> > > > connect) occasionally, as its kick of the virtio backend fails with the error
> > > > below:
> > > >
> > > > Unassigned mem write 00000000fc803004 = 0x1
> > > >
> > > > memory-region: pci_bridge_pci
> > > >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
> > > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
> > > >       00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common
> > > >       00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
> > > >       00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device
> > > >       00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify  <- io mem unassigned
> > > >   ...
> > > >
> > > > We caught an exceptional address change while this problem
> > > > happened, shown as follows:
> > > >
> > > > Before pci_bridge_update_mappings:
> > > >
> > > >   00000000fc000000-00000000fc1fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc000000-00000000fc1fffff
> > > >   00000000fc200000-00000000fc3fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc200000-00000000fc3fffff
> > > >   00000000fc400000-00000000fc5fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc400000-00000000fc5fffff
> > > >   00000000fc600000-00000000fc7fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc600000-00000000fc7fffff
> > > >   00000000fc800000-00000000fc9fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc800000-00000000fc9fffff  <- correct Address Space
> > > >   00000000fca00000-00000000fcbfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fca00000-00000000fcbfffff
> > > >   00000000fcc00000-00000000fcdfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fcc00000-00000000fcdfffff
> > > >   00000000fce00000-00000000fcffffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fce00000-00000000fcffffff
> > > >
> > > > After pci_bridge_update_mappings:
> > > >
> > > >   00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
> > > >   00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
> > > >   00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
> > > >   00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
> > > >   00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
> > > >   00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
> > > >   00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
> > > >   00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
> > > >   fffffffffc800000-fffffffffc800000 (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci fffffffffc800000-fffffffffc800000  <- Exceptional Address Space
> > >
> > > This one is empty though right?
> > >
> > > > We have figured out why this address becomes this value:
> > > > according to the pci spec, a pci driver can get a BAR's address size by
> > > > first writing 0xffffffff to the pci register, and then reading the
> > > > value back from this register.
> > >
> > > OK, however as you show below the BAR being sized is the BAR of a
> > > bridge. Are you then adding a bridge device by hotplug?
> >
> > No, I just simply hot plugged a VFIO device to Bus 0; another
> > interesting phenomenon is that if I hot plug the device to another bus, this
> > doesn't happen.
> > >
> > > > We didn't handle this value specially while processing the pci write in
> > > > qemu; the function call stack is:
> > > >
> > > > pci_bridge_dev_write_config
> > > >  -> pci_bridge_write_config
> > > >  -> pci_default_write_config (we update the config[address] value here to
> > > >     fffffffffc800000, which should be 0xfc800000)
> > > >  -> pci_bridge_update_mappings
> > > >     -> pci_bridge_region_del(br, br->windows);
> > > >     -> pci_bridge_region_init
> > > >        -> pci_bridge_init_alias (here, in pci_bridge_get_base, we use the
> > > >           wrong value fffffffffc800000)
> > > >  -> memory_region_transaction_commit
> > > >
> > > > So, as we can see, we use the wrong base address in qemu to update
> > > > the memory regions. Though we update the base address to the correct
> > > > value after the pci driver in the VM writes the original value back,
> > > > the virtio NIC on bus 4 may still send net packets concurrently with
> > > > the wrong memory region address.
> > > >
> > > > We have tried to skip the memory region update action in qemu
> > > > when it detects a pci write with the 0xffffffff value, and it does
> > > > work, but this does not seem to be a gentle fix.
> > >
> > > For sure. But I'm still puzzled as to why Linux tries to size the
> > > BAR of the bridge while a device behind it is in use.
> > >
> > > Can you pls post your QEMU command line?
> >
> > My QEMU command line:
> > /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S
> > -object secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-Linux/master-key.aes
> > -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off
> > -cpu host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd
> > -m size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off
> > -smp 20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024
> > -numa node,nodeid=1,cpus=5-9,mem=1024 -numa node,nodeid=2,cpus=10-14,mem=1024
> > -numa node,nodeid=3,cpus=15-19,mem=1024 -uuid 34a588c7-b0f2-4952-b39c-47fae3411439
> > -no-user-config -nodefaults
> > -chardev socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/monitor.sock,server,nowait
> > -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
> > -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on
> > -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8
> > -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9
> > -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa
> > -device pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb
> > -device pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc
> > -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
> > -device usb-ehci,id=usb1,bus=pci.0,addr=0x10
> > -device nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11
> > -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3
> > -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4
> > -device virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5
> > -device virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6
> > -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7
> > -drive file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-virtio-disk0,cache=none
> > -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> > -drive if=none,id=drive-ide0-1-1,readonly=on,cache=none
> > -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1
> > -netdev tap,fd=35,id=hostnet0
> > -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4,addr=0x1
> > -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
> > -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0
> > -device cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12
> > -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
> >
> > I am also very curious about this issue; in the linux kernel code, maybe the
> > double check in function pci_bridge_check_ranges triggered this problem.
> >
> > If you can get the stacktrace in Linux when it tries to write this fffff
> > value, that would be quite helpful.
>
> After I add mdelay(100) in function pci_bridge_check_ranges, this phenomenon is
> easier to reproduce; below is my modification in the kernel:
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index cb389277..86e232d 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -27,7 +27,7 @@
>  #include <linux/slab.h>
>  #include <linux/acpi.h>
>  #include "pci.h"
> -
> +#include <linux/delay.h>
>  unsigned int pci_flags;
>
>  struct pci_dev_resource {
> @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
>  		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
>  				       0xffffffff);
>  		pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
> +		mdelay(100);
> +		printk(KERN_ERR "sleep\n");
> +		dump_stack();
>  		if (!tmp)
>  			b_res[2].flags &= ~IORESOURCE_MEM_64;
>  		pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,

OK!
I just sent a Linux patch that should help.
I would appreciate it if you will give it a try
and if that helps reply to it with a
Tested-by: tag.

--
MST

On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
> On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
> > > > > Hi all,
...snip...
> OK!
> I just sent a Linux patch that should help.
> I would appreciate it if you will give it a try and if that helps reply to it
> with a Tested-by: tag.

I tested this patch and it works fine on my machine.

But I have another question: if we only fix this problem in the kernel, the Linux
versions that have already been released will not work well on the virtualization
platform. Is there a way to fix this problem in the backend?

> --
> MST

On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
> On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
> > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > In our test, we configured a VM with several pci-bridges and a
> > > > > > virtio-net nic attached to bus 4.
> > > > > >
> > > > > > After the VM is started up, we ping this nic from the host to judge if it
> > > > > > is working normally.
Then, we hot add pci devices to this VM with -> -> > > > > > bus -> -> 0. -> -> > > > > > -> -> > > > > > We found the virtio-net NIC in bus 4 is not working (can not -> -> > > > > > connect) occasionally, as it kick virtio backend failure with -> -> > > > > > error below: -> -> > > > > > -> -> > > > > > Unassigned mem write 00000000fc803004 = 0x1 -> -> > > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > memory-region: pci_bridge_pci -> -> > > > > > -> -> > > > > > 0000000000000000-ffffffffffffffff (prio 0, RW): -> -> > > > > > pci_bridge_pci -> -> > > > > > -> -> > > > > > 00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci -> -> > > > > > -> -> > > > > > 00000000fc800000-00000000fc800fff (prio 0, RW): -> -> > > > > > virtio-pci-common -> -> > > > > > -> -> > > > > > 00000000fc801000-00000000fc801fff (prio 0, RW): -> -> > > > > > virtio-pci-isr -> -> > > > > > -> -> > > > > > 00000000fc802000-00000000fc802fff (prio 0, RW): -> -> > > > > > virtio-pci-device -> -> > > > > > -> -> > > > > > 00000000fc803000-00000000fc803fff (prio 0, RW): -> -> > > > > > virtio-pci-notify <- io mem unassigned -> -> > > > > > -> -> > > > > > ⦠-> -> > > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > We caught an exceptional address changing while this problem -> -> > > > > > happened, show as -> -> > > > > > follow: -> -> > > > > > -> -> > > > > > Before pci_bridge_update_mappingsï¼ -> -> > > > > > -> -> > > > > > 00000000fc000000-00000000fc1fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fc000000-00000000fc1fffff -> -> > > > > > -> -> > > > > > 00000000fc200000-00000000fc3fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fc200000-00000000fc3fffff -> -> > > > > > -> -> > > > > > 00000000fc400000-00000000fc5fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fc400000-00000000fc5fffff -> -> > > > > > -> -> > > > > > 
00000000fc600000-00000000fc7fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fc600000-00000000fc7fffff -> -> > > > > > -> -> > > > > > 00000000fc800000-00000000fc9fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fc800000-00000000fc9fffff -> -> > > > > > <- correct Adress Spce -> -> > > > > > -> -> > > > > > 00000000fca00000-00000000fcbfffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fca00000-00000000fcbfffff -> -> > > > > > -> -> > > > > > 00000000fcc00000-00000000fcdfffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fcc00000-00000000fcdfffff -> -> > > > > > -> -> > > > > > 00000000fce00000-00000000fcffffff (prio 1, RW): alias -> -> > > > > > pci_bridge_pref_mem @pci_bridge_pci -> -> > > > > > 00000000fce00000-00000000fcffffff -> -> > > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > After pci_bridge_update_mappingsï¼ -> -> > > > > > -> -> > > > > > 00000000fda00000-00000000fdbfffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fda00000-00000000fdbfffff -> -> > > > > > -> -> > > > > > 00000000fdc00000-00000000fddfffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fdc00000-00000000fddfffff -> -> > > > > > -> -> > > > > > 00000000fde00000-00000000fdffffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fde00000-00000000fdffffff -> -> > > > > > -> -> > > > > > 00000000fe000000-00000000fe1fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fe000000-00000000fe1fffff -> -> > > > > > -> -> > > > > > 00000000fe200000-00000000fe3fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fe200000-00000000fe3fffff -> -> > > > > > -> -> > > > > > 
00000000fe400000-00000000fe5fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fe400000-00000000fe5fffff -> -> > > > > > -> -> > > > > > 00000000fe600000-00000000fe7fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fe600000-00000000fe7fffff -> -> > > > > > -> -> > > > > > 00000000fe800000-00000000fe9fffff (prio 1, RW): alias -> -> > > > > > pci_bridge_mem @pci_bridge_pci -> -> > > > > > 00000000fe800000-00000000fe9fffff -> -> > > > > > -> -> > > > > > fffffffffc800000-fffffffffc800000 (prio 1, RW): alias -> -> > > pci_bridge_pref_mem -> -> > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000 <- Exceptional -> -> Adress -> -> > > > > Space -> -> > > > > -> -> > > > > This one is empty though right? -> -> > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > We have figured out why this address becomes this value, -> -> > > > > > according to pci spec, pci driver can get BAR address size by -> -> > > > > > writing 0xffffffff to -> -> > > > > > -> -> > > > > > the pci register firstly, and then read back the value from this -> -> > > > > > register. -> -> > > > > -> -> > > > > -> -> > > > > OK however as you show below the BAR being sized is the BAR if a -> -> > > > > bridge. Are you then adding a bridge device by hotplug? -> -> > > > -> -> > > > No, I just simply hot plugged a VFIO device to Bus 0, another -> -> > > > interesting phenomenon is If I hot plug the device to other bus, -> -> > > > this doesn't -> -> > > happened. 
-> -> > > > -> -> > > > > -> -> > > > > -> -> > > > > > We didn't handle this value specially while process pci write -> -> > > > > > in qemu, the function call stack is: -> -> > > > > > -> -> > > > > > Pci_bridge_dev_write_config -> -> > > > > > -> -> > > > > > -> pci_bridge_write_config -> -> > > > > > -> -> > > > > > -> pci_default_write_config (we update the config[address] -> -> > > > > > -> value here to -> -> > > > > > fffffffffc800000, which should be 0xfc800000 ) -> -> > > > > > -> -> > > > > > -> pci_bridge_update_mappings -> -> > > > > > -> -> > > > > > ->pci_bridge_region_del(br, br->windows); -> -> > > > > > -> -> > > > > > -> pci_bridge_region_init -> -> > > > > > -> -> > > > > > -> -> > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use the -> -> > > > > > wrong value -> -> > > > > > fffffffffc800000) -> -> > > > > > -> -> > > > > > -> -> -> > > > > > memory_region_transaction_commit -> -> > > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > So, as we can see, we use the wrong base address in qemu to -> -> > > > > > update the memory regions, though, we update the base address -> -> > > > > > to -> -> > > > > > -> -> > > > > > The correct value after pci driver in VM write the original -> -> > > > > > value back, the virtio NIC in bus 4 may still sends net -> -> > > > > > packets concurrently with -> -> > > > > > -> -> > > > > > The wrong memory region address. -> -> > > > > > -> -> > > > > > -> -> > > > > > -> -> > > > > > We have tried to skip the memory region update action in qemu -> -> > > > > > while detect pci write with 0xffffffff value, and it does -> -> > > > > > work, but -> -> > > > > > -> -> > > > > > This seems to be not gently. -> -> > > > > -> -> > > > > For sure. But I'm still puzzled as to why does Linux try to size -> -> > > > > the BAR of the bridge while a device behind it is used. -> -> > > > > -> -> > > > > Can you pls post your QEMU command line? 
-> > > > > My QEMU command line:
-> > > > >
-> > > > > /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S \
-> > > > >   -object secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-Linux/master-key.aes \
-> > > > >   -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off \
-> > > > >   -cpu host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd \
-> > > > >   -m size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off \
-> > > > >   -smp 20,sockets=20,cores=1,threads=1 \
-> > > > >   -numa node,nodeid=0,cpus=0-4,mem=1024 -numa node,nodeid=1,cpus=5-9,mem=1024 \
-> > > > >   -numa node,nodeid=2,cpus=10-14,mem=1024 -numa node,nodeid=3,cpus=15-19,mem=1024 \
-> > > > >   -uuid 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config -nodefaults \
-> > > > >   -chardev socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/monitor.sock,server,nowait \
-> > > > >   -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet \
-> > > > >   -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on \
-> > > > >   -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 \
-> > > > >   -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 \
-> > > > >   -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa \
-> > > > >   -device pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb \
-> > > > >   -device pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc \
-> > > > >   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-> > > > >   -device usb-ehci,id=usb1,bus=pci.0,addr=0x10 \
-> > > > >   -device nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 \
-> > > > >   -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 \
-> > > > >   -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 \
-> > > > >   -device virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 \
-> > > > >   -device virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 \
-> > > > >   -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 \
-> > > > >   -drive file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-> > > > >   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-> > > > >   -drive if=none,id=drive-ide0-1-1,readonly=on,cache=none \
-> > > > >   -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 \
-> > > > >   -netdev tap,fd=35,id=hostnet0 \
-> > > > >   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4,addr=0x1 \
-> > > > >   -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
-> > > > >   -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 \
-> > > > >   -device cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 \
-> > > > >   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
-> > > > >
-> > > > > I am also very curious about this issue; in the Linux kernel code,
-> > > > > maybe the double check in function pci_bridge_check_ranges
-> > > > > triggered this problem.
-> > > >
-> > > > If you can get the stacktrace in Linux when it tries to write this
-> > > > fffff value, that would be quite helpful.
-> > > After I add mdelay(100) in function pci_bridge_check_ranges, this
-> > > phenomenon is easier to reproduce; below is my modification to the
-> > > kernel:
-> > >
-> > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
-> > > index cb389277..86e232d 100644
-> > > --- a/drivers/pci/setup-bus.c
-> > > +++ b/drivers/pci/setup-bus.c
-> > > @@ -27,7 +27,7 @@
-> > >  #include <linux/slab.h>
-> > >  #include <linux/acpi.h>
-> > >  #include "pci.h"
-> > > -
-> > > +#include <linux/delay.h>
-> > >  unsigned int pci_flags;
-> > >
-> > >  struct pci_dev_resource {
-> > > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
-> > >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-> > >                                        0xffffffff);
-> > >                 pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
-> > > +               mdelay(100);
-> > > +               printk(KERN_ERR "sleep\n");
-> > > +               dump_stack();
-> > >                 if (!tmp)
-> > >                         b_res[2].flags &= ~IORESOURCE_MEM_64;
-> > >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-> > >
-> >
-> > OK!
-> > I just sent a Linux patch that should help.
-> > I would appreciate it if you will give it a try and, if it helps, reply
-> > to it with a Tested-by: tag.
->
-> I tested this patch and it works fine on my machine.
->
-> But I have another question: if we only fix this problem in the kernel,
-> Linux versions that have already been released still do not work well on
-> the virtualization platform. Is there a way to fix this problem in the
-> backend?
-
-There could be a way to work around this.
-Does below help?
- -diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c -index 236a20eaa8..7834cac4b0 100644 ---- a/hw/i386/acpi-build.c -+++ b/hw/i386/acpi-build.c -@@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, -PCIBus *bus, - - aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM"))); - aml_append(method, -- aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check */) -+ aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device Check -Light */) - ); - aml_append(method, - aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request */) - -> -On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote: -> -> On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote: -> -> > > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote: -> -> > > > > > > Hi all, -> -> > > > > > > -> -> > > > > > > -> -> > > > > > > -> -> > > > > > > In our test, we configured VM with several pci-bridges and -> -> > > > > > > a virtio-net nic been attached with bus 4, -> -> > > > > > > -> -> > > > > > > After VM is startup, We ping this nic from host to judge -> -> > > > > > > if it is working normally. Then, we hot add pci devices to -> -> > > > > > > this VM with bus -> -> > 0. 
-> > > > > > > > We found the virtio-net NIC in bus 4 is not working (cannot
-> > > > > > > > connect) occasionally, as it kicks the virtio backend with
-> > > > > > > > the failure below:
-> > > > > > > >
-> > > > > > > > Unassigned mem write 00000000fc803004 = 0x1
-> > > > > > > >
-> > > > > > > > memory-region: pci_bridge_pci
-> > > > > > > >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
-> > > > > > > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
-> > > > > > > >       00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common
-> > > > > > > >       00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
-> > > > > > > >       00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device
-> > > > > > > >       00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify  <- io mem unassigned
-> > > > > > > >   ...
-> > > > > > > >
-> > > > > > > > We caught an exceptional address change while this problem
-> > > > > > > > happened, shown as follows:
-> > > > > > > >
-> > > > > > > > Before pci_bridge_update_mappings:
-> > > > > > > >
-> > > > > > > >   00000000fc000000-00000000fc1fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc000000-00000000fc1fffff
-> > > > > > > >   00000000fc200000-00000000fc3fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc200000-00000000fc3fffff
-> > > > > > > >   00000000fc400000-00000000fc5fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc400000-00000000fc5fffff
-> > > > > > > >   00000000fc600000-00000000fc7fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc600000-00000000fc7fffff
-> > > > > > > >   00000000fc800000-00000000fc9fffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fc800000-00000000fc9fffff  <- correct address space
-> > > > > > > >   00000000fca00000-00000000fcbfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fca00000-00000000fcbfffff
-> > > > > > > >   00000000fcc00000-00000000fcdfffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fcc00000-00000000fcdfffff
-> > > > > > > >   00000000fce00000-00000000fcffffff (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci 00000000fce00000-00000000fcffffff
-> > > > > > > >
-> > > > > > > > After pci_bridge_update_mappings:
-> > > > > > > >
-> > > > > > > >   00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
-> > > > > > > >   00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
-> > > > > > > >   00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
-> > > > > > > >   00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
-> > > > > > > >   00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
-> > > > > > > >   00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
-> > > > > > > >   00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
-> > > > > > > >   00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
-> > > > > > > >   fffffffffc800000-fffffffffc800000 (prio 1, RW): alias pci_bridge_pref_mem @pci_bridge_pci fffffffffc800000-fffffffffc800000  <- exceptional address space
-> > > > > > >
-> > > > > > > This one is empty though right?
-> > > > > > >
-> > > > > > > > We have figured out why this address becomes this value:
-> > > > > > > > according to the PCI spec, a PCI driver can get the BAR
-> > > > > > > > address size by first writing 0xffffffff to the PCI register
-> > > > > > > > and then reading back the value from this register.
-> > > > > > >
-> > > > > > > OK, however as you show below, the BAR being sized is the BAR
-> > > > > > > of a bridge. Are you then adding a bridge device by hotplug?
-> > > > > >
-> > > > > > No, I just simply hot plugged a VFIO device to Bus 0. Another
-> > > > > > interesting phenomenon is that if I hot plug the device to
-> > > > > > another bus, this doesn't happen.
-
-[...snip: the call stack, QEMU command line, and kernel mdelay() reproducer,
-quoted in full above...]
-
-> > > OK!
-> > > I just sent a Linux patch that should help.
-> > > I would appreciate it if you will give it a try and if that helps
-> > > reply to it with a Tested-by: tag.
-> >
-> > I tested this patch and it works fine on my machine.
-> >
-> > But I have another question: if we only fix this problem in the kernel,
-> > Linux versions that have already been released still do not work well
-> > on the virtualization platform. Is there a way to fix this problem in
-> > the backend?
->
-> There could be a way to work around this.
-> Does below help?
-
-I am sorry to tell you, I tested this patch and it doesn't work fine.
-
-On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
-> > There could be a way to work around this.
-> > Does below help?
->
-> I am sorry to tell you, I tested this patch and it doesn't work fine.
-
-What happens?
-
-On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
-
-[...snip: the acpi-build.c workaround patch and the earlier thread, quoted
-in full above...]
-
-Oh I see, another bug:
-
-        case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
-                acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n");
-                /* TBD: Exactly what does 'light' mean? */
-                break;
-
-And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
-and friends all just ignore this event type.
-
---
-MST
-> -> -> -> --- -> -MST -Hi Michael, - -If we want to fix this problem on the backend, it is not enough to consider -only PCI -device hot plugging, because I found that if we use a command like -"echo 1 > /sys/bus/pci/rescan" in guest, this problem is very easy to reproduce. - -From the perspective of device emulation, when guest writes 0xffffffff to the -BAR, -guest just want to get the size of the region but not really updating the -address space. -So I made the following patch to avoid update pci mapping. - -Do you think this make sense? - -[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR - -When guest writes 0xffffffff to the BAR, guest just want to get the size of the -region -but not really updating the address space. -So when guest writes 0xffffffff to BAR, we need avoid pci_update_mappings -or pci_bridge_update_mappings. - -Signed-off-by: xuyandong <address@hidden> ---- - hw/pci/pci.c | 6 ++++-- - hw/pci/pci_bridge.c | 8 +++++--- - 2 files changed, 9 insertions(+), 5 deletions(-) - -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -index 56b13b3..ef368e1 100644 ---- a/hw/pci/pci.c -+++ b/hw/pci/pci.c -@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t -addr, uint32_t val_in, int - { - int i, was_irq_disabled = pci_irq_disabled(d); - uint32_t val = val_in; -+ uint64_t barmask = (1 << l*8) - 1; - - for (i = 0; i < l; val >>= 8, ++i) { - uint8_t wmask = d->wmask[addr + i]; -@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t -addr, uint32_t val_in, int - d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); - d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */ - } -- if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -+ if ((val_in != barmask && -+ (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || - ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -- ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -+ ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) || - 
range_covers_byte(addr, l, PCI_COMMAND)) - pci_update_mappings(d); - -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -index ee9dff2..f2bad79 100644 ---- a/hw/pci/pci_bridge.c -+++ b/hw/pci/pci_bridge.c -@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d, - PCIBridge *s = PCI_BRIDGE(d); - uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); - uint16_t newctl; -+ uint64_t barmask = (1 << len * 8) - 1; - - pci_default_write_config(d, address, val, len); - - if (ranges_overlap(address, len, PCI_COMMAND, 2) || - -- /* io base/limit */ -- ranges_overlap(address, len, PCI_IO_BASE, 2) || -+ (val != barmask && -+ /* io base/limit */ -+ (ranges_overlap(address, len, PCI_IO_BASE, 2) || - - /* memory base/limit, prefetchable base/limit and - io base/limit upper 16 */ -- ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -+ ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) || - - /* vga enable */ - ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { --- -1.8.3.1 - -On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote: -> -> > > > > > > > > > Hi all, -> -> > > > > > > > > > -> -> > > > > > > > > > -> -> > > > > > > > > > -> -> > > > > > > > > > In our test, we configured VM with several pci-bridges -> -> > > > > > > > > > and a virtio-net nic been attached with bus 4, -> -> > > > > > > > > > -> -> > > > > > > > > > After VM is startup, We ping this nic from host to -> -> > > > > > > > > > judge if it is working normally. Then, we hot add pci -> -> > > > > > > > > > devices to this VM with bus -> -> > > > > 0. 
-> -> > > > > > > > > > -> -> > > > > > > > > > We found the virtio-net NIC in bus 4 is not working -> -> > > > > > > > > > (can not -> -> > > > > > > > > > connect) occasionally, as it kick virtio backend -> -> > > > > > > > > > failure with error -> -> -> > > > But I have another question, if we only fix this problem in the -> -> > > > kernel, the Linux version that has been released does not work -> -> > > > well on the -> -> > > virtualization platform. -> -> > > > Is there a way to fix this problem in the backend? -> -> > > -> -> > > There could we a way to work around this. -> -> > > Does below help? -> -> > -> -> > I am sorry to tell you, I tested this patch and it doesn't work fine. -> -> > -> -> > > -> -> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index -> -> > > 236a20eaa8..7834cac4b0 100644 -> -> > > --- a/hw/i386/acpi-build.c -> -> > > +++ b/hw/i386/acpi-build.c -> -> > > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml -> -> > > *parent_scope, PCIBus *bus, -> -> > > -> -> > > aml_append(method, aml_store(aml_int(bsel_val), -> -> aml_name("BNUM"))); -> -> > > aml_append(method, -> -> > > - aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device -> -> > > Check -> -> */) -> -> > > + aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* -> -> > > + Device Check Light */) -> -> > > ); -> -> > > aml_append(method, -> -> > > aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject -> -> > > Request */) -> -> -> -> -> -> Oh I see, another bug: -> -> -> -> case ACPI_NOTIFY_DEVICE_CHECK_LIGHT: -> -> acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT -> -> event\n"); -> -> /* TBD: Exactly what does 'light' mean? */ -> -> break; -> -> -> -> And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type) -> -> and friends all just ignore this event type. 
-> -> -> -> -> -> -> -> -- -> -> MST -> -> -Hi Michael, -> -> -If we want to fix this problem on the backend, it is not enough to consider -> -only PCI -> -device hot plugging, because I found that if we use a command like -> -"echo 1 > /sys/bus/pci/rescan" in guest, this problem is very easy to -> -reproduce. -> -> -From the perspective of device emulation, when guest writes 0xffffffff to the -> -BAR, -> -guest just want to get the size of the region but not really updating the -> -address space. -> -So I made the following patch to avoid update pci mapping. -> -> -Do you think this make sense? -> -> -[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR -> -> -When guest writes 0xffffffff to the BAR, guest just want to get the size of -> -the region -> -but not really updating the address space. -> -So when guest writes 0xffffffff to BAR, we need avoid pci_update_mappings -> -or pci_bridge_update_mappings. -> -> -Signed-off-by: xuyandong <address@hidden> -I see how that will address the common case however there are a bunch of -issues here. First of all it's easy to trigger the update by some other -action like VM migration. More importantly it's just possible that -guest actually does want to set the low 32 bit of the address to all -ones. For example, that is clearly listed as a way to disable all -devices behind the bridge in the pci to pci bridge spec. - -Given upstream is dragging it's feet I'm open to adding a flag -that will help keep guests going as a temporary measure. -We will need to think about ways to restrict this as much as -we can. 
- - -> ---- -> -hw/pci/pci.c | 6 ++++-- -> -hw/pci/pci_bridge.c | 8 +++++--- -> -2 files changed, 9 insertions(+), 5 deletions(-) -> -> -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -> -index 56b13b3..ef368e1 100644 -> ---- a/hw/pci/pci.c -> -+++ b/hw/pci/pci.c -> -@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t -> -addr, uint32_t val_in, int -> -{ -> -int i, was_irq_disabled = pci_irq_disabled(d); -> -uint32_t val = val_in; -> -+ uint64_t barmask = (1 << l*8) - 1; -> -> -for (i = 0; i < l; val >>= 8, ++i) { -> -uint8_t wmask = d->wmask[addr + i]; -> -@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t -> -addr, uint32_t val_in, int -> -d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); -> -d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */ -> -} -> -- if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -+ if ((val_in != barmask && -> -+ (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -> -- ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -> -+ ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) || -> -range_covers_byte(addr, l, PCI_COMMAND)) -> -pci_update_mappings(d); -> -> -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -> -index ee9dff2..f2bad79 100644 -> ---- a/hw/pci/pci_bridge.c -> -+++ b/hw/pci/pci_bridge.c -> -@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d, -> -PCIBridge *s = PCI_BRIDGE(d); -> -uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); -> -uint16_t newctl; -> -+ uint64_t barmask = (1 << len * 8) - 1; -> -> -pci_default_write_config(d, address, val, len); -> -> -if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> -- /* io base/limit */ -> -- ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -+ (val != barmask && -> -+ /* io base/limit */ -> -+ (ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> -/* memory base/limit, prefetchable base/limit and -> -io base/limit 
upper 16 */ -> -- ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -+ ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) || -> -> -/* vga enable */ -> -ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> --- -> -1.8.3.1 -> -> -> - -> ------Original Message----- -> -From: Michael S. Tsirkin [ -mailto:address@hidden -> -Sent: Monday, January 07, 2019 11:06 PM -> -To: xuyandong <address@hidden> -> -Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu- -> -address@hidden; Zhanghailiang <address@hidden>; -> -wangxin (U) <address@hidden>; Huangweidong (C) -> -<address@hidden> -> -Subject: Re: [BUG]Unassigned mem write during pci device hot-plug -> -> -On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote: -> -> > > > > > > > > > > Hi all, -> -> > > > > > > > > > > -> -> > > > > > > > > > > -> -> > > > > > > > > > > -> -> > > > > > > > > > > In our test, we configured VM with several -> -> > > > > > > > > > > pci-bridges and a virtio-net nic been attached -> -> > > > > > > > > > > with bus 4, -> -> > > > > > > > > > > -> -> > > > > > > > > > > After VM is startup, We ping this nic from host to -> -> > > > > > > > > > > judge if it is working normally. Then, we hot add -> -> > > > > > > > > > > pci devices to this VM with bus -> -> > > > > > 0. -> -> > > > > > > > > > > -> -> > > > > > > > > > > We found the virtio-net NIC in bus 4 is not -> -> > > > > > > > > > > working (can not -> -> > > > > > > > > > > connect) occasionally, as it kick virtio backend -> -> > > > > > > > > > > failure with error -> -> -> -> > > > > But I have another question, if we only fix this problem in -> -> > > > > the kernel, the Linux version that has been released does not -> -> > > > > work well on the -> -> > > > virtualization platform. -> -> > > > > Is there a way to fix this problem in the backend? 
-> -> -> -> Hi Michael, -> -> -> -> If we want to fix this problem on the backend, it is not enough to -> -> consider only PCI device hot plugging, because I found that if we use -> -> a command like "echo 1 > /sys/bus/pci/rescan" in guest, this problem is very -> -easy to reproduce. -> -> -> -> From the perspective of device emulation, when guest writes 0xffffffff -> -> to the BAR, guest just want to get the size of the region but not really -> -updating the address space. -> -> So I made the following patch to avoid update pci mapping. -> -> -> -> Do you think this make sense? -> -> -> -> [PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR -> -> -> -> When guest writes 0xffffffff to the BAR, guest just want to get the -> -> size of the region but not really updating the address space. -> -> So when guest writes 0xffffffff to BAR, we need avoid -> -> pci_update_mappings or pci_bridge_update_mappings. -> -> -> -> Signed-off-by: xuyandong <address@hidden> -> -> -I see how that will address the common case however there are a bunch of -> -issues here. First of all it's easy to trigger the update by some other -> -action like -> -VM migration. More importantly it's just possible that guest actually does -> -want -> -to set the low 32 bit of the address to all ones. For example, that is -> -clearly -> -listed as a way to disable all devices behind the bridge in the pci to pci -> -bridge -> -spec. -Ok, I see. If I only skip the update when the guest writes 0xFFFFFFFF to the Prefetchable -Base Upper 32 Bits register -to meet the kernel double-check problem. -Do you think there is still risk? - -> -> -Given upstream is dragging it's feet I'm open to adding a flag that will help -> -keep guests going as a temporary measure. -> -We will need to think about ways to restrict this as much as we can.
-> -> -> -> --- -> -> hw/pci/pci.c | 6 ++++-- -> -> hw/pci/pci_bridge.c | 8 +++++--- -> -> 2 files changed, 9 insertions(+), 5 deletions(-) -> -> -> -> diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644 -> -> --- a/hw/pci/pci.c -> -> +++ b/hw/pci/pci.c -> -> @@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, -> -> uint32_t addr, uint32_t val_in, int { -> -> int i, was_irq_disabled = pci_irq_disabled(d); -> -> uint32_t val = val_in; -> -> + uint64_t barmask = (1 << l*8) - 1; -> -> -> -> for (i = 0; i < l; val >>= 8, ++i) { -> -> uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@ -> -> void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, -> -int -> -> d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & -> -> wmask); -> -> d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear -> -> */ -> -> } -> -> - if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -> + if ((val_in != barmask && -> -> + (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -> ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -> -> - ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -> -> + ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) || -> -> range_covers_byte(addr, l, PCI_COMMAND)) -> -> pci_update_mappings(d); -> -> -> -> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index -> -> ee9dff2..f2bad79 100644 -> -> --- a/hw/pci/pci_bridge.c -> -> +++ b/hw/pci/pci_bridge.c -> -> @@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d, -> -> PCIBridge *s = PCI_BRIDGE(d); -> -> uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); -> -> uint16_t newctl; -> -> + uint64_t barmask = (1 << len * 8) - 1; -> -> -> -> pci_default_write_config(d, address, val, len); -> -> -> -> if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> -> -> - /* io base/limit */ -> -> - ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> + (val != barmask && -> -> + /* io base/limit */ -> -> + 
(ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> -> -> /* memory base/limit, prefetchable base/limit and -> -> io base/limit upper 16 */ -> -> - ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -> + ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) || -> -> -> -> /* vga enable */ -> -> ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> -> -- -> -> 1.8.3.1 -> -> -> -> -> -> - -On Mon, Jan 07, 2019 at 03:28:36PM +0000, xuyandong wrote: -> -> -> -> -----Original Message----- -> -> From: Michael S. Tsirkin [ -mailto:address@hidden -> -> Sent: Monday, January 07, 2019 11:06 PM -> -> To: xuyandong <address@hidden> -> -> Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu- -> -> address@hidden; Zhanghailiang <address@hidden>; -> -> wangxin (U) <address@hidden>; Huangweidong (C) -> -> <address@hidden> -> -> Subject: Re: [BUG]Unassigned mem write during pci device hot-plug -> -> -> -> On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote: -> -> > > > > > > > > > > > Hi all, -> -> > > > > > > > > > > > -> -> > > > > > > > > > > > -> -> > > > > > > > > > > > -> -> > > > > > > > > > > > In our test, we configured VM with several -> -> > > > > > > > > > > > pci-bridges and a virtio-net nic been attached -> -> > > > > > > > > > > > with bus 4, -> -> > > > > > > > > > > > -> -> > > > > > > > > > > > After VM is startup, We ping this nic from host to -> -> > > > > > > > > > > > judge if it is working normally. Then, we hot add -> -> > > > > > > > > > > > pci devices to this VM with bus -> -> > > > > > > 0. 
-> -> > > > > > > > > > > > -> -> > > > > > > > > > > > We found the virtio-net NIC in bus 4 is not -> -> > > > > > > > > > > > working (can not -> -> > > > > > > > > > > > connect) occasionally, as it kick virtio backend -> -> > > > > > > > > > > > failure with error -> -> > -> -> > > > > > But I have another question, if we only fix this problem in -> -> > > > > > the kernel, the Linux version that has been released does not -> -> > > > > > work well on the -> -> > > > > virtualization platform. -> -> > > > > > Is there a way to fix this problem in the backend? -> -> > -> -> > Hi Michael, -> -> > -> -> > If we want to fix this problem on the backend, it is not enough to -> -> > consider only PCI device hot plugging, because I found that if we use -> -> > a command like "echo 1 > /sys/bus/pci/rescan" in guest, this problem is -> -> > very -> -> easy to reproduce. -> -> > -> -> > From the perspective of device emulation, when guest writes 0xffffffff -> -> > to the BAR, guest just want to get the size of the region but not really -> -> updating the address space. -> -> > So I made the following patch to avoid update pci mapping. -> -> > -> -> > Do you think this make sense? -> -> > -> -> > [PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR -> -> > -> -> > When guest writes 0xffffffff to the BAR, guest just want to get the -> -> > size of the region but not really updating the address space. -> -> > So when guest writes 0xffffffff to BAR, we need avoid -> -> > pci_update_mappings or pci_bridge_update_mappings. -> -> > -> -> > Signed-off-by: xuyandong <address@hidden> -> -> -> -> I see how that will address the common case however there are a bunch of -> -> issues here. First of all it's easy to trigger the update by some other -> -> action like -> -> VM migration. More importantly it's just possible that guest actually does -> -> want -> -> to set the low 32 bit of the address to all ones. 
For example, that is -> -> clearly -> -> listed as a way to disable all devices behind the bridge in the pci to pci -> -> bridge -> -> spec. -> -> -Ok, I see. If I only skip upate when guest writing 0xFFFFFFFF to Prefetcable -> -Base Upper 32 Bits -> -to meet the kernel double check problem. -> -Do you think there is still risk? -Well it's non zero since spec says such a write should disable all -accesses. Just an idea: why not add an option to disable upper 32 bit? -That is ugly and limits space but spec compliant. - -> -> -> -> Given upstream is dragging it's feet I'm open to adding a flag that will -> -> help -> -> keep guests going as a temporary measure. -> -> We will need to think about ways to restrict this as much as we can. -> -> -> -> -> -> > --- -> -> > hw/pci/pci.c | 6 ++++-- -> -> > hw/pci/pci_bridge.c | 8 +++++--- -> -> > 2 files changed, 9 insertions(+), 5 deletions(-) -> -> > -> -> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644 -> -> > --- a/hw/pci/pci.c -> -> > +++ b/hw/pci/pci.c -> -> > @@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, -> -> > uint32_t addr, uint32_t val_in, int { -> -> > int i, was_irq_disabled = pci_irq_disabled(d); -> -> > uint32_t val = val_in; -> -> > + uint64_t barmask = (1 << l*8) - 1; -> -> > -> -> > for (i = 0; i < l; val >>= 8, ++i) { -> -> > uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@ -> -> > void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t -> -> > val_in, -> -> int -> -> > d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & -> -> > wmask); -> -> > d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to -> -> > Clear */ -> -> > } -> -> > - if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -> > + if ((val_in != barmask && -> -> > + (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -> > ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -> -> > - ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -> -> > + ranges_overlap(addr, 
l, PCI_ROM_ADDRESS1, 4))) || -> -> > range_covers_byte(addr, l, PCI_COMMAND)) -> -> > pci_update_mappings(d); -> -> > -> -> > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index -> -> > ee9dff2..f2bad79 100644 -> -> > --- a/hw/pci/pci_bridge.c -> -> > +++ b/hw/pci/pci_bridge.c -> -> > @@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d, -> -> > PCIBridge *s = PCI_BRIDGE(d); -> -> > uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); -> -> > uint16_t newctl; -> -> > + uint64_t barmask = (1 << len * 8) - 1; -> -> > -> -> > pci_default_write_config(d, address, val, len); -> -> > -> -> > if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> > -> -> > - /* io base/limit */ -> -> > - ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> > + (val != barmask && -> -> > + /* io base/limit */ -> -> > + (ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> > -> -> > /* memory base/limit, prefetchable base/limit and -> -> > io base/limit upper 16 */ -> -> > - ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -> > + ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) || -> -> > -> -> > /* vga enable */ -> -> > ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> -> > -- -> -> > 1.8.3.1 -> -> > -> -> > -> -> > - -> ------Original Message----- -> -From: xuyandong -> -Sent: Monday, January 07, 2019 10:37 PM -> -To: 'Michael S. 
Tsirkin' <address@hidden> -> -Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu- -> -address@hidden; Zhanghailiang <address@hidden>; -> -wangxin (U) <address@hidden>; Huangweidong (C) -> -<address@hidden> -> -Subject: RE: [BUG]Unassigned mem write during pci device hot-plug -> -> -> > > > > > > > > > Hi all, -> -> > > > > > > > > > -> -> > > > > > > > > > -> -> > > > > > > > > > -> -> > > > > > > > > > In our test, we configured VM with several -> -> > > > > > > > > > pci-bridges and a virtio-net nic been attached with -> -> > > > > > > > > > bus 4, -> -> > > > > > > > > > -> -> > > > > > > > > > After VM is startup, We ping this nic from host to -> -> > > > > > > > > > judge if it is working normally. Then, we hot add -> -> > > > > > > > > > pci devices to this VM with bus -> -> > > > > 0. -> -> > > > > > > > > > -> -> > > > > > > > > > We found the virtio-net NIC in bus 4 is not working -> -> > > > > > > > > > (can not -> -> > > > > > > > > > connect) occasionally, as it kick virtio backend -> -> > > > > > > > > > failure with error -> -> -> > > > But I have another question, if we only fix this problem in the -> -> > > > kernel, the Linux version that has been released does not work -> -> > > > well on the -> -> > > virtualization platform. -> -> > > > Is there a way to fix this problem in the backend? -> -> > > -> -> > > There could we a way to work around this. -> -> > > Does below help? -> -> > -> -> > I am sorry to tell you, I tested this patch and it doesn't work fine. 
-> -> > -> -> > > -> -> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index -> -> > > 236a20eaa8..7834cac4b0 100644 -> -> > > --- a/hw/i386/acpi-build.c -> -> > > +++ b/hw/i386/acpi-build.c -> -> > > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml -> -> > > *parent_scope, PCIBus *bus, -> -> > > -> -> > > aml_append(method, aml_store(aml_int(bsel_val), -> -> aml_name("BNUM"))); -> -> > > aml_append(method, -> -> > > - aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device -> -Check -> -> */) -> -> > > + aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* -> -> > > + Device Check Light */) -> -> > > ); -> -> > > aml_append(method, -> -> > > aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* -> -> > > Eject Request */) -> -> -> -> -> -> Oh I see, another bug: -> -> -> -> case ACPI_NOTIFY_DEVICE_CHECK_LIGHT: -> -> acpi_handle_debug(handle, -> -> "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n"); -> -> /* TBD: Exactly what does 'light' mean? */ -> -> break; -> -> -> -> And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 -> -> type) and friends all just ignore this event type. -> -> -> -> -> -> -> -> -- -> -> MST -> -> -Hi Michael, -> -> -If we want to fix this problem on the backend, it is not enough to consider -> -only -> -PCI device hot plugging, because I found that if we use a command like "echo -> -1 > -> -/sys/bus/pci/rescan" in guest, this problem is very easy to reproduce. -> -> -From the perspective of device emulation, when guest writes 0xffffffff to the -> -BAR, guest just want to get the size of the region but not really updating the -> -address space. -> -So I made the following patch to avoid update pci mapping. -> -> -Do you think this make sense? -> -> -[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR -> -> -When guest writes 0xffffffff to the BAR, guest just want to get the size of -> -the -> -region but not really updating the address space. 
-> -So when guest writes 0xffffffff to BAR, we need avoid pci_update_mappings or -> -pci_bridge_update_mappings. -> -> -Signed-off-by: xuyandong <address@hidden> -> ---- -> -hw/pci/pci.c | 6 ++++-- -> -hw/pci/pci_bridge.c | 8 +++++--- -> -2 files changed, 9 insertions(+), 5 deletions(-) -> -> -diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644 -> ---- a/hw/pci/pci.c -> -+++ b/hw/pci/pci.c -> -@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t -> -addr, uint32_t val_in, int { -> -int i, was_irq_disabled = pci_irq_disabled(d); -> -uint32_t val = val_in; -> -+ uint64_t barmask = (1 << l*8) - 1; -> -> -for (i = 0; i < l; val >>= 8, ++i) { -> -uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@ void -> -pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int -> -d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); -> -d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */ -> -} -> -- if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -+ if ((val_in != barmask && -> -+ (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -> -ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -> -- ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -> -+ ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) || -> -range_covers_byte(addr, l, PCI_COMMAND)) -> -pci_update_mappings(d); -> -> -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index ee9dff2..f2bad79 -> -100644 -> ---- a/hw/pci/pci_bridge.c -> -+++ b/hw/pci/pci_bridge.c -> -@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d, -> -PCIBridge *s = PCI_BRIDGE(d); -> -uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); -> -uint16_t newctl; -> -+ uint64_t barmask = (1 << len * 8) - 1; -> -> -pci_default_write_config(d, address, val, len); -> -> -if (ranges_overlap(address, len, PCI_COMMAND, 2) || -> -> -- /* io base/limit */ -> -- ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -+ (val != barmask && -> -+ /* io 
base/limit */ -> -+ (ranges_overlap(address, len, PCI_IO_BASE, 2) || -> -> -/* memory base/limit, prefetchable base/limit and -> -io base/limit upper 16 */ -> -- ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -> -+ ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) || -> -> -/* vga enable */ -> -ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -> --- -> -1.8.3.1 -> -> -Sorry, please ignore the patch above. - -Here is the patch I want to post: - -diff --git a/hw/pci/pci.c b/hw/pci/pci.c -index 56b13b3..38a300f 100644 ---- a/hw/pci/pci.c -+++ b/hw/pci/pci.c -@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t -addr, uint32_t val_in, int - { - int i, was_irq_disabled = pci_irq_disabled(d); - uint32_t val = val_in; -+ uint64_t barmask = ((uint64_t)1 << l*8) - 1; - - for (i = 0; i < l; val >>= 8, ++i) { - uint8_t wmask = d->wmask[addr + i]; -@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t -addr, uint32_t val_in, int - d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); - d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */ - } -- if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || -+ if ((val_in != barmask && -+ (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) || - ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) || -- ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) || -+ ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) || - range_covers_byte(addr, l, PCI_COMMAND)) - pci_update_mappings(d); - -diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c -index ee9dff2..b8f7d48 100644 ---- a/hw/pci/pci_bridge.c -+++ b/hw/pci/pci_bridge.c -@@ -253,20 +253,22 @@ void pci_bridge_write_config(PCIDevice *d, - PCIBridge *s = PCI_BRIDGE(d); - uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL); - uint16_t newctl; -+ uint64_t barmask = ((uint64_t)1 << len * 8) - 1; - - pci_default_write_config(d, address, val, len); - - if (ranges_overlap(address, len, PCI_COMMAND, 2) || - -- /* io 
base/limit */ -- ranges_overlap(address, len, PCI_IO_BASE, 2) || -+ /* vga enable */ -+ ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2) || - -- /* memory base/limit, prefetchable base/limit and -- io base/limit upper 16 */ -- ranges_overlap(address, len, PCI_MEMORY_BASE, 20) || -+ (val != barmask && -+ /* io base/limit */ -+ (ranges_overlap(address, len, PCI_IO_BASE, 2) || - -- /* vga enable */ -- ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) { -+ /* memory base/limit, prefetchable base/limit and -+ io base/limit upper 16 */ -+ ranges_overlap(address, len, PCI_MEMORY_BASE, 20)))) { - pci_bridge_update_mappings(s); - } - --- -1.8.3.1 - diff --git a/results/classifier/02/other/70416488 b/results/classifier/02/other/70416488 deleted file mode 100644 index b13fcb8c8..000000000 --- a/results/classifier/02/other/70416488 +++ /dev/null @@ -1,1180 +0,0 @@ -other: 0.980 -semantic: 0.975 -instruction: 0.947 -mistranslation: 0.942 -boot: 0.897 - -[Bug Report] smmuv3 event 0x10 report when running virtio-blk-pci - -Hi All, - -When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -during kernel booting up. - -qemu command which I use is as below: - -qemu-system-aarch64 -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ --kernel Image -initrd minifs.cpio.gz \ --enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ --append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ --device -pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -\ --device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ --device -virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ --drive file=/home/boot.img,if=none,id=drive0,format=raw - -smmuv3 event 0x10 log: -[...] 
-[ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -[ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -[ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -[ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks (1.07 -GB/1.00 GiB) -[ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -[ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -[ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -[ 1.968381] clk: Disabling unused clocks -[ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -[ 1.968990] PM: genpd: Disabling unused power domains -[ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.969814] ALSA device list: -[ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.970471] No soundcards found. -[ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -[ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -[ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -[ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -[ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.975005] Freeing unused kernel memory: 10112K -[ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -[ 1.975442] Run init as init process - -Another information is that if "maxcpus=3" is removed from the kernel command -line, -it will be OK. - -I am not sure if there is a bug about vsmmu. It will be very appreciated if -anyone -know this issue or can take a look at it. - -Thanks, -Zhou - -On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote: -> -> -Hi All, -> -> -When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> -during kernel booting up. 
-Does it still do this if you either: - (1) use the v9.1.0 release (commit fd1952d814da) - (2) use "-machine virt-9.1" instead of "-machine virt" - -? - -My suspicion is that this will have started happening now that -we expose an SMMU with two-stage translation support to the guest -in the "virt" machine type (which we do not if you either -use virt-9.1 or in the v9.1.0 release). - -I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of -the two-stage support). - -> -qemu command which I use is as below: -> -> -qemu-system-aarch64 -machine -> -virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> --kernel Image -initrd minifs.cpio.gz \ -> --enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> --append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> --device -> -pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> -\ -> --device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> --device -> -virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> --drive file=/home/boot.img,if=none,id=drive0,format=raw -> -> -smmuv3 event 0x10 log: -> -[...] 
-> -[ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> -[ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> -[ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> -[ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> -(1.07 GB/1.00 GiB) -> -[ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> -[ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.968381] clk: Disabling unused clocks -> -[ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.968990] PM: genpd: Disabling unused power domains -> -[ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.969814] ALSA device list: -> -[ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.970471] No soundcards found. -> -[ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.975005] Freeing unused kernel memory: 10112K -> -[ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.975442] Run init as init process -> -> -Another information is that if "maxcpus=3" is removed from the kernel command -> -line, -> -it will be OK. -> -> -I am not sure if there is a bug about vsmmu. It will be very appreciated if -> -anyone -> -know this issue or can take a look at it. 
-thanks --- PMM - -On 2024/9/9 22:31, Peter Maydell wrote: -> -On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote: -> -> -> -> Hi All, -> -> -> -> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> -> during kernel booting up. -> -> -Does it still do this if you either: -> -(1) use the v9.1.0 release (commit fd1952d814da) -> -(2) use "-machine virt-9.1" instead of "-machine virt" -I tested above two cases, the problem is still there. - -> -> -? -> -> -My suspicion is that this will have started happening now that -> -we expose an SMMU with two-stage translation support to the guest -> -in the "virt" machine type (which we do not if you either -> -use virt-9.1 or in the v9.1.0 release). -> -> -I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of -> -the two-stage support). -> -> -> qemu command which I use is as below: -> -> -> -> qemu-system-aarch64 -machine -> -> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> -> -kernel Image -initrd minifs.cpio.gz \ -> -> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> -> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> -> -device -> -> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> -> \ -> -> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> -> -device -> -> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> -> -drive file=/home/boot.img,if=none,id=drive0,format=raw -> -> -> -> smmuv3 event 0x10 log: -> -> [...] 
-> -> [ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> -> [ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> -> [ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> -> [ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> -> (1.07 GB/1.00 GiB) -> -> [ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> -> [ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.968381] clk: Disabling unused clocks -> -> [ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.968990] PM: genpd: Disabling unused power domains -> -> [ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.969814] ALSA device list: -> -> [ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.970471] No soundcards found. -> -> [ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.975005] Freeing unused kernel memory: 10112K -> -> [ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.975442] Run init as init process -> -> -> -> Another information is that if "maxcpus=3" is removed from the kernel -> -> command line, -> -> it will be OK. -> -> -> -> I am not sure if there is a bug about vsmmu. It will be very appreciated if -> -> anyone -> -> know this issue or can take a look at it. -> -> -thanks -> --- PMM -> -. 
- -Hi Zhou, -On 9/10/24 03:24, Zhou Wang via wrote: -> -On 2024/9/9 22:31, Peter Maydell wrote: -> -> On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote: -> ->> Hi All, -> ->> -> ->> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> ->> during kernel booting up. -> -> Does it still do this if you either: -> -> (1) use the v9.1.0 release (commit fd1952d814da) -> -> (2) use "-machine virt-9.1" instead of "-machine virt" -> -I tested above two cases, the problem is still there. -Thank you for reporting. I am able to reproduce and effectively the -maxcpus kernel option is triggering the issue. It works without. I will -come back to you asap. - -Eric -> -> -> ? -> -> -> -> My suspicion is that this will have started happening now that -> -> we expose an SMMU with two-stage translation support to the guest -> -> in the "virt" machine type (which we do not if you either -> -> use virt-9.1 or in the v9.1.0 release). -> -> -> -> I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of -> -> the two-stage support). -> -> -> ->> qemu command which I use is as below: -> ->> -> ->> qemu-system-aarch64 -machine -> ->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> ->> -kernel Image -initrd minifs.cpio.gz \ -> ->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> ->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> ->> -device -> ->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> ->> \ -> ->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> ->> -device -> ->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> ->> -drive file=/home/boot.img,if=none,id=drive0,format=raw -> ->> -> ->> smmuv3 event 0x10 log: -> ->> [...] 
-> ->> [ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> ->> [ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> ->> [ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> ->> [ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> ->> (1.07 GB/1.00 GiB) -> ->> [ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> ->> [ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.968381] clk: Disabling unused clocks -> ->> [ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.968990] PM: genpd: Disabling unused power domains -> ->> [ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.969814] ALSA device list: -> ->> [ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.970471] No soundcards found. -> ->> [ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975005] Freeing unused kernel memory: 10112K -> ->> [ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975442] Run init as init process -> ->> -> ->> Another information is that if "maxcpus=3" is removed from the kernel -> ->> command line, -> ->> it will be OK. -> ->> -> ->> I am not sure if there is a bug about vsmmu. It will be very appreciated if -> ->> anyone -> ->> know this issue or can take a look at it. -> -> thanks -> -> -- PMM -> -> . 
- -Hi, - -On 9/10/24 03:24, Zhou Wang via wrote: -> -On 2024/9/9 22:31, Peter Maydell wrote: -> -> On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote: -> ->> Hi All, -> ->> -> ->> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> ->> during kernel booting up. -> -> Does it still do this if you either: -> -> (1) use the v9.1.0 release (commit fd1952d814da) -> -> (2) use "-machine virt-9.1" instead of "-machine virt" -> -I tested above two cases, the problem is still there. -I have not much progressed yet but I see it comes with -qemu traces. - -smmuv3-iommu-memory-region-0-0 translation failed for iova=0x0 -(SMMU_EVT_F_TRANSLATION) -../.. -qemu-system-aarch64: virtio-blk failed to set guest notifier (-22), -ensure -accel kvm is set. -qemu-system-aarch64: virtio_bus_start_ioeventfd: failed. Fallback to -userspace (slower). - -the PCIe Host bridge seems to cause that translation failure at iova=0 - -Also virtio-iommu has the same issue: -qemu-system-aarch64: virtio_iommu_translate no mapping for 0x0 for sid=1024 -qemu-system-aarch64: virtio-blk failed to set guest notifier (-22), -ensure -accel kvm is set. -qemu-system-aarch64: virtio_bus_start_ioeventfd: failed. Fallback to -userspace (slower). - -Only happens with maxcpus=3. Note the virtio-blk-pci is not protected by -the vIOMMU in your case. - -Thanks - -Eric - -> -> -> ? -> -> -> -> My suspicion is that this will have started happening now that -> -> we expose an SMMU with two-stage translation support to the guest -> -> in the "virt" machine type (which we do not if you either -> -> use virt-9.1 or in the v9.1.0 release). -> -> -> -> I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of -> -> the two-stage support). 
-> -> -> ->> qemu command which I use is as below: -> ->> -> ->> qemu-system-aarch64 -machine -> ->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> ->> -kernel Image -initrd minifs.cpio.gz \ -> ->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> ->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> ->> -device -> ->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> ->> \ -> ->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> ->> -device -> ->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> ->> -drive file=/home/boot.img,if=none,id=drive0,format=raw -> ->> -> ->> smmuv3 event 0x10 log: -> ->> [...] -> ->> [ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> ->> [ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> ->> [ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> ->> [ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> ->> (1.07 GB/1.00 GiB) -> ->> [ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> ->> [ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.968381] clk: Disabling unused clocks -> ->> [ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.968990] PM: genpd: Disabling unused power domains -> ->> [ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.969814] ALSA device list: -> ->> [ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.970471] No soundcards found. 
-> ->> [ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975005] Freeing unused kernel memory: 10112K -> ->> [ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975442] Run init as init process -> ->> -> ->> Another information is that if "maxcpus=3" is removed from the kernel -> ->> command line, -> ->> it will be OK. -> ->> -> ->> I am not sure if there is a bug about vsmmu. It will be very appreciated if -> ->> anyone -> ->> know this issue or can take a look at it. -> -> thanks -> -> -- PMM -> -> . - -Hi Zhou, - -On Mon, Sep 9, 2024 at 3:22â¯PM Zhou Wang via <qemu-devel@nongnu.org> wrote: -> -> -Hi All, -> -> -When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> -during kernel booting up. -> -> -qemu command which I use is as below: -> -> -qemu-system-aarch64 -machine -> -virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> --kernel Image -initrd minifs.cpio.gz \ -> --enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> --append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> --device -> -pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> -\ -> --device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> --device -> -virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> --drive file=/home/boot.img,if=none,id=drive0,format=raw -> -> -smmuv3 event 0x10 log: -> -[...] 
-> -[ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> -[ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> -[ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> -[ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> -(1.07 GB/1.00 GiB) -> -[ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> -[ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.968381] clk: Disabling unused clocks -> -[ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.968990] PM: genpd: Disabling unused power domains -> -[ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.969814] ALSA device list: -> -[ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.970471] No soundcards found. -> -[ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -[ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -[ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -[ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.975005] Freeing unused kernel memory: 10112K -> -[ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -[ 1.975442] Run init as init process -> -> -Another information is that if "maxcpus=3" is removed from the kernel command -> -line, -> -it will be OK. -> -That's interesting, not sure how that would be related. - -> -I am not sure if there is a bug about vsmmu. It will be very appreciated if -> -anyone -> -know this issue or can take a look at it. -> -Can you please provide logs with adding "-d trace:smmu*" to qemu invocation. 
- -Also if possible, can you please provide which Linux kernel version -you are using, I will see if I can repro. - -Thanks, -Mostafa - -> -Thanks, -> -Zhou -> -> -> - -On 2024/9/9 22:47, Mostafa Saleh wrote: -> -Hi Zhou, -> -> -On Mon, Sep 9, 2024 at 3:22â¯PM Zhou Wang via <qemu-devel@nongnu.org> wrote: -> -> -> -> Hi All, -> -> -> -> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10 -> -> during kernel booting up. -> -> -> -> qemu command which I use is as below: -> -> -> -> qemu-system-aarch64 -machine -> -> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> -> -kernel Image -initrd minifs.cpio.gz \ -> -> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> -> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> -> -device -> -> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> -> \ -> -> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \ -> -> -device -> -> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> -> -drive file=/home/boot.img,if=none,id=drive0,format=raw -> -> -> -> smmuv3 event 0x10 log: -> -> [...] 
-> -> [ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> -> [ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> -> [ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> -> [ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> -> (1.07 GB/1.00 GiB) -> -> [ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> -> [ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.968381] clk: Disabling unused clocks -> -> [ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.968990] PM: genpd: Disabling unused power domains -> -> [ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.969814] ALSA device list: -> -> [ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.970471] No soundcards found. -> -> [ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> -> [ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> -> [ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> -> [ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.975005] Freeing unused kernel memory: 10112K -> -> [ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> -> [ 1.975442] Run init as init process -> -> -> -> Another information is that if "maxcpus=3" is removed from the kernel -> -> command line, -> -> it will be OK. -> -> -> -> -That's interesting, not sure how that would be related. -> -> -> I am not sure if there is a bug about vsmmu. It will be very appreciated if -> -> anyone -> -> know this issue or can take a look at it. 
-> -> -> -> -Can you please provide logs with adding "-d trace:smmu*" to qemu invocation. -Sure. Please see the attached log(using above qemu commit and command). - -> -> -Also if possible, can you please provide which Linux kernel version -> -you are using, I will see if I can repro. -I just use the latest mainline kernel(commit b831f83e40a2) with defconfig. - -Thanks, -Zhou - -> -> -Thanks, -> -Mostafa -> -> -> Thanks, -> -> Zhou -> -> -> -> -> -> -> -> -. -qemu_boot_log.txt -Description: -Text document - -On Tue, Sep 10, 2024 at 2:51â¯AM Zhou Wang <wangzhou1@hisilicon.com> wrote: -> -> -On 2024/9/9 22:47, Mostafa Saleh wrote: -> -> Hi Zhou, -> -> -> -> On Mon, Sep 9, 2024 at 3:22â¯PM Zhou Wang via <qemu-devel@nongnu.org> wrote: -> ->> -> ->> Hi All, -> ->> -> ->> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event -> ->> 0x10 -> ->> during kernel booting up. -> ->> -> ->> qemu command which I use is as below: -> ->> -> ->> qemu-system-aarch64 -machine -> ->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \ -> ->> -kernel Image -initrd minifs.cpio.gz \ -> ->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \ -> ->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \ -> ->> -device -> ->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 -> ->> \ -> ->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 -> ->> \ -> ->> -device -> ->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \ -> ->> -drive file=/home/boot.img,if=none,id=drive0,format=raw -> ->> -> ->> smmuv3 event 0x10 log: -> ->> [...] 
-> ->> [ 1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0 -> ->> [ 1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002) -> ->> [ 1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues -> ->> [ 1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks -> ->> (1.07 GB/1.00 GiB) -> ->> [ 1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0 -> ->> [ 1.967478] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.968381] clk: Disabling unused clocks -> ->> [ 1.968677] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.968990] PM: genpd: Disabling unused power domains -> ->> [ 1.969424] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.969814] ALSA device list: -> ->> [ 1.970240] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.970471] No soundcards found. -> ->> [ 1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971600] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.971601] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971602] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received: -> ->> [ 1.971607] arm-smmu-v3 9050000.smmuv3: 0x0000020000000010 -> ->> [ 1.974202] arm-smmu-v3 9050000.smmuv3: 0x0000020000000000 -> ->> [ 1.974634] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975005] Freeing unused kernel memory: 10112K -> ->> [ 1.975062] arm-smmu-v3 9050000.smmuv3: 0x0000000000000000 -> ->> [ 1.975442] Run init as init process -> ->> -> ->> Another information is that if "maxcpus=3" is removed from the kernel -> ->> command line, -> ->> it will be OK. -> ->> -> -> -> -> That's interesting, not sure how that would be related. -> -> -> ->> I am not sure if there is a bug about vsmmu. 
It will be very appreciated
-> ->> if anyone
-> ->> know this issue or can take a look at it.
-> ->>
-> ->
-> -> Can you please provide logs with adding "-d trace:smmu*" to qemu invocation.
->
-> -Sure. Please see the attached log (using above qemu commit and command).
->
-Thanks a lot, it seems the SMMUv3 indeed receives a translation
request with addr 0x0 which causes this event.
I don't see any kind of modification (alignment) of the address in this path.
So my hunch is that it's not related to the SMMUv3 and the initiator is
issuing bogus addresses.

->
->
-> -> Also if possible, can you please provide which Linux kernel version
-> -> you are using, I will see if I can repro.
->
-> -I just use the latest mainline kernel (commit b831f83e40a2) with defconfig.
->
-I see, I can't repro in my setup which has no "--enable-kvm" and with
"-cpu max" instead of host.
I will try other options and see if I can repro.

Thanks,
Mostafa
-> -Thanks,
-> -Zhou
->
->
->
-> -> -> Thanks,
-> -> Mostafa
-> ->
-> ->> Thanks,
-> ->> Zhou
-> ->>
-> ->>
-> ->>
-> ->
-> -> .

diff --git a/results/classifier/02/other/74715356 b/results/classifier/02/other/74715356
deleted file mode 100644
index 0ca35919d..000000000
--- a/results/classifier/02/other/74715356
+++ /dev/null
@@ -1,127 +0,0 @@
-other: 0.927
-semantic: 0.916
-instruction: 0.910
-boot: 0.881
-mistranslation: 0.870
-
-[Bug] x86 EFLAGS refresh is not happening correctly
-
-Hello,
-I'm posting this here instead of opening an issue as it is not clear to me if this is a bug or not.
-The issue is located in function "cpu_compute_eflags" in target/i386/cpu.h
-(
-https://gitlab.com/qemu-project/qemu/-/blob/master/target/i386/cpu.h#L2071
-)
-This function is executed in an out-of-cpu-loop context.
-It is used to synchronize TCG internal eflags registers (CC_OP, CC_SRC, etc...) with the CPU eflags field upon loop exit.
-It does:
-  eflags |= cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
-Shouldn't it be:
-  eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
-as eflags is entirely reevaluated by "cpu_cc_compute_all"?
-Thanks,
-Kind regards,
-Stevie

On 05/08/21 11:51, Stevie Lavern wrote:
Shouldn't it be:
eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
as eflags is entirely reevaluated by "cpu_cc_compute_all"?
No, both are wrong. env->eflags contains flags other than the
arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved.
The right code is in helper_read_eflags. You can move it into
cpu_compute_eflags, and make helper_read_eflags use it.
Paolo

On 05/08/21 13:24, Paolo Bonzini wrote:
On 05/08/21 11:51, Stevie Lavern wrote:
Shouldn't it be:
eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
as eflags is entirely reevaluated by "cpu_cc_compute_all"?
No, both are wrong. env->eflags contains flags other than the
arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved.
The right code is in helper_read_eflags. You can move it into
cpu_compute_eflags, and make helper_read_eflags use it.
Ah, actually the two are really the same, the TF/VM bits do not apply to
cpu_compute_eflags so it's correct.
What seems wrong is migration of the EFLAGS register. There should be
code in cpu_pre_save and cpu_post_load to special-case it and setup
CC_DST/CC_OP as done in cpu_load_eflags.
Also, cpu_load_eflags should assert that update_mask does not include
any of the arithmetic flags.
Paolo

Thanks for your reply!
It's still a bit cryptic to me.
I think I need to clarify that I'm using a custom x86_64 user mode, based on Linux user mode, that I'm developing (unfortunately I cannot share the code), with modifications in the translation loop (I've added cpu loop exits on specific instructions which are not control flow instructions).
-If my understanding is correct, in the user-mode case 'cpu_compute_eflags' is called directly by 'x86_cpu_exec_exit' with the intention of synchronizing the CPU env->eflags field with its real value (represented by the CC_* fields). -I'm not sure how 'cpu_pre_save' and 'cpu_post_load' are involved in this case. - -As you said in your first email, 'helper_read_eflags' seems to be the correct way to go. -Here is some detail about my current experimentation/understanding of this "issue": -With the current implementation -     -eflags |= cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK); -if I exit the loop with a CC_OP different from CC_OP_EFLAGS, I found that the resulting env->eflags may be invalid. -In my test case, the loop was exiting with eflags = 0x44 and CC_OP = CC_OP_SUBL with CC_DST=1, CC_SRC=258, CC_SRC2=0. -While 'cpu_cc_compute_all' computes the correct flags (ZF:0, PF:0), the result will still be 0x44 (ZF:1, PF:1) due to the 'or' operation, thus leading to an incorrect eflags value loaded into the CPU env. -In my case, after loop reentry, it led to an invalid branch to be taken. -Thanks for your time! -Regards -Stevie - -On Thu, Aug 5, 2021 at 1:33 PM Paolo Bonzini < -pbonzini@redhat.com -> wrote: -On 05/08/21 13:24, Paolo Bonzini wrote: -> On 05/08/21 11:51, Stevie Lavern wrote: ->> ->> Shouldn't it be: ->> eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK); ->> as eflags is entirely reevaluated by "cpu_cc_compute_all" ? -> -> No, both are wrong. env->eflags contains flags other than the -> arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved. -> -> The right code is in helper_read_eflags. You can move it into -> cpu_compute_eflags, and make helper_read_eflags use it. -Ah, actually the two are really the same, the TF/VM bits do not apply to -cpu_compute_eflags so it's correct. -What seems wrong is migration of the EFLAGS register. 
There should be
-code in cpu_pre_save and cpu_post_load to special-case it and setup
-CC_DST/CC_OP as done in cpu_load_eflags.
-Also, cpu_load_eflags should assert that update_mask does not include
-any of the arithmetic flags.
-
-Paolo
-
diff --git a/results/classifier/02/other/79834768 b/results/classifier/02/other/79834768
deleted file mode 100644
index 2fe8ea560..000000000
--- a/results/classifier/02/other/79834768
+++ /dev/null
@@ -1,410 +0,0 @@
-other: 0.943
-semantic: 0.920
-instruction: 0.911
-boot: 0.880
-mistranslation: 0.877
-
-[Qemu-devel] [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-Hi,
-
-We hit a bug in our test while running PCMark 10 in a Windows 7 VM:
-the VM got stuck and the wallclock was hung after several minutes of
-running PCMark 10 in it.
-It is quite easy to reproduce the bug with the upstream KVM and Qemu.
-
-We found that KVM can not inject any RTC irq to the VM after it hangs; it
-fails to deliver the irq in ioapic_set_irq() because the RTC irq is still
-pending in ioapic->irr.
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-                          int irq_level, bool line_status)
-{
-    ... ...
-    if (!irq_level) {
-        ioapic->irr &= ~mask;
-        ret = 1;
-        goto out;
-    }
-    ... ...
-    if ((edge && old_irr == ioapic->irr) ||
-        (!edge && entry.fields.remote_irr)) {
-        ret = 0;
-        goto out;
-    }
-
-According to the RTC spec, after the RTC injects a high-level irq, the OS
-will read CMOS's register C to clear the irq flag, and pull down the irq
-electric pin.
-
-For Qemu, we emulate the reading operation in cmos_ioport_read(),
-but the guest OS will fire a write operation first to tell which register
-will be read after this write, where we use s->cmos_index to record the
-following register to read.
-
-But in our test, we found that there is a possible situation where a vcpu
-fails to read RTC_REG_C to clear the irq. This can happen while two vcpus
-are writing/reading registers at the same time. For example, vcpu0 is
-trying to read RTC_REG_C, so it writes RTC_REG_C first, where s->cmos_index
-will be RTC_REG_C, but before it tries to read register C, vcpu1 is going
-to read RTC_YEAR, and it changes s->cmos_index to RTC_YEAR by a writing
-action.
-The next operation of vcpu0 will then read RTC_YEAR. In this case, we miss
-calling qemu_irq_lower(s->irq) to clear the irq. After this, kvm will never
-inject an RTC irq, and the Windows VM will hang.
-
-static void cmos_ioport_write(void *opaque, hwaddr addr,
-                              uint64_t data, unsigned size)
-{
-    RTCState *s = opaque;
-
-    if ((addr & 1) == 0) {
-        s->cmos_index = data & 0x7f;
-    }
-...
-
-static uint64_t cmos_ioport_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    RTCState *s = opaque;
-    int ret;
-    if ((addr & 1) == 0) {
-        return 0xff;
-    } else {
-        switch(s->cmos_index) {
-
-According to the CMOS spec, "any write to PORT 0070h should be followed by
-an action to PORT 0071h or the RTC will be left in an unknown state", but
-it seems that we can not ensure this sequence in qemu/kvm.
-
-Any ideas?
-
-Thanks,
-Hailiang
-
-Pls see the trace of kvm_pio:
-
-    CPU 1/KVM-15567 [003] .... 209311.762579: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] .... 209311.762582: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x89
-    CPU 1/KVM-15567 [003] .... 209311.762590: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x17
-    CPU 0/KVM-15566 [005] .... 209311.762611: kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-    CPU 1/KVM-15567 [003] .... 209311.762615: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] .... 209311.762619: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x88
-    CPU 1/KVM-15567 [003] .... 209311.762627: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x12
-    CPU 0/KVM-15566 [005] .... 209311.762632: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x12
-    CPU 1/KVM-15567 [003] .... 209311.762633: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 0/KVM-15566 [005] .... 209311.762634: kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc  <--- first write to 0x70, cmos_index = 0xc & 0x7f = 0xc
-    CPU 1/KVM-15567 [003] .... 209311.762636: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86 <--- second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6, overwriting the cmos_index result of the first write
-    CPU 0/KVM-15566 [005] .... 209311.762641: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6   <--- vcpu0 reads 0x6 because cmos_index is 0x6 now
-    CPU 1/KVM-15567 [003] .... 209311.762644: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6   <--- vcpu1 reads 0x6
-    CPU 1/KVM-15567 [003] .... 209311.762649: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] .... 209311.762669: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x87
-    CPU 1/KVM-15567 [003] .... 209311.762678: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x1
-    CPU 1/KVM-15567 [003] .... 209311.762683: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] .... 209311.762686: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x84
-    CPU 1/KVM-15567 [003] .... 209311.762693: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x10
-    CPU 1/KVM-15567 [003] .... 209311.762699: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] .... 209311.762702: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x82
-    CPU 1/KVM-15567 [003] .... 209311.762709: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x25
-    CPU 1/KVM-15567 [003] .... 209311.762714: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-    CPU 1/KVM-15567 [003] ....
209311.762717: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x80
-
-Regards,
--Gonglei
-
-From: Zhanghailiang
-Sent: Friday, December 01, 2017 3:03 AM
-To: address@hidden; address@hidden; Paolo Bonzini
-Cc: Huangweidong (C); Gonglei (Arei); wangxin (U); Xiexiangyou
-Subject: [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-Hi,
-
-We hit a bug in our test while running PCMark 10 in a Windows 7 VM:
-the VM got stuck and the wallclock was hung after several minutes of
-running PCMark 10 in it.
-It is quite easy to reproduce the bug with the upstream KVM and Qemu.
-
-We found that KVM can not inject any RTC irq to the VM after it hangs; it
-fails to deliver the irq in ioapic_set_irq() because the RTC irq is still
-pending in ioapic->irr.
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-                          int irq_level, bool line_status)
-{
-    ... ...
-    if (!irq_level) {
-        ioapic->irr &= ~mask;
-        ret = 1;
-        goto out;
-    }
-    ... ...
-    if ((edge && old_irr == ioapic->irr) ||
-        (!edge && entry.fields.remote_irr)) {
-        ret = 0;
-        goto out;
-    }
-
-According to the RTC spec, after the RTC injects a high-level irq, the OS
-will read CMOS's register C to clear the irq flag, and pull down the irq
-electric pin.
-
-For Qemu, we emulate the reading operation in cmos_ioport_read(),
-but the guest OS will fire a write operation first to tell which register
-will be read after this write, where we use s->cmos_index to record the
-following register to read.
-
-But in our test, we found that there is a possible situation where a vcpu
-fails to read RTC_REG_C to clear the irq. This can happen while two vcpus
-are writing/reading registers at the same time. For example, vcpu0 is
-trying to read RTC_REG_C, so it writes RTC_REG_C first, where s->cmos_index
-will be RTC_REG_C, but before it tries to read register C, vcpu1 is going
-to read RTC_YEAR, and it changes s->cmos_index to RTC_YEAR by a writing
-action.
-The next operation of vcpu0 will then read RTC_YEAR. In this case, we miss
-calling qemu_irq_lower(s->irq) to clear the irq. After this, kvm will never
-inject an RTC irq, and the Windows VM will hang.
-
-static void cmos_ioport_write(void *opaque, hwaddr addr,
-                              uint64_t data, unsigned size)
-{
-    RTCState *s = opaque;
-
-    if ((addr & 1) == 0) {
-        s->cmos_index = data & 0x7f;
-    }
-...
-
-static uint64_t cmos_ioport_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    RTCState *s = opaque;
-    int ret;
-    if ((addr & 1) == 0) {
-        return 0xff;
-    } else {
-        switch(s->cmos_index) {
-
-According to the CMOS spec, "any write to PORT 0070h should be followed by
-an action to PORT 0071h or the RTC will be left in an unknown state", but
-it seems that we can not ensure this sequence in qemu/kvm.
-
-Any ideas?
-
-Thanks,
-Hailiang
-
-On 01/12/2017 08:08, Gonglei (Arei) wrote:
-> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->       CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->       CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
-> vcpu0 read 0x6 because cmos_index is 0x6 now:
->       CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-> vcpu1 read 0x6:
->       CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-
-This seems to be a Windows bug. The easiest workaround that I
-can think of is to clear the interrupts already when 0xc is written,
-without waiting for the read (because REG_C can only be read).
-
-What do you think?
-
-Thanks,
-
-Paolo
-
-I also think it's a Windows bug; the problem is that it doesn't occur on
-the Xen platform. And there is some other work that needs to be done while
-reading REG_C, so I wrote that patch.
-
-Thanks,
-Gonglei
-From: Paolo Bonzini
-To: Gonglei, Zhanghailiang, qemu-devel, Michael S.
Tsirkin
-Cc: Huangweidong, wangxin, Xiexiangyou
-Date: 2017-12-02 01:10:08
-Subject: Re: [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-On 01/12/2017 08:08, Gonglei (Arei) wrote:
-> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->       CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->       CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
-> vcpu0 read 0x6 because cmos_index is 0x6 now:
->       CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-> vcpu1 read 0x6:
->       CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-
-This seems to be a Windows bug. The easiest workaround that I
-can think of is to clear the interrupts already when 0xc is written,
-without waiting for the read (because REG_C can only be read).
-
-What do you think?
-
-Thanks,
-
-Paolo
-
-On 01/12/2017 18:45, Gonglei (Arei) wrote:
-> I also think it's a Windows bug; the problem is that it doesn't occur on
-> the Xen platform.
-
-It's a race, it may just be that RTC PIO is faster in Xen because it's
-implemented in the hypervisor.
-
-I will try reporting it to Microsoft.
-
-Thanks,
-
-Paolo
-
-> Thanks,
-> Gonglei
-> *From:* Paolo Bonzini
-> *To:* Gonglei, Zhanghailiang, qemu-devel, Michael S.
Tsirkin
-> *Cc:* Huangweidong, wangxin, Xiexiangyou
-> *Date:* 2017-12-02 01:10:08
-> *Subject:* Re: [BUG] Windows 7 got stuck easily while run PCMark10 application
->
-> On 01/12/2017 08:08, Gonglei (Arei) wrote:
->> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->>       CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
->> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->>       CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
->> vcpu0 read 0x6 because cmos_index is 0x6 now:
->>       CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
->> vcpu1 read 0x6:
->>       CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
->
-> This seems to be a Windows bug. The easiest workaround that I
-> can think of is to clear the interrupts already when 0xc is written,
-> without waiting for the read (because REG_C can only be read).
->
-> What do you think?
->
-> Thanks,
->
-> Paolo
-
-On 2017/12/2 2:37, Paolo Bonzini wrote:
-> On 01/12/2017 18:45, Gonglei (Arei) wrote:
->> I also think it's a Windows bug; the problem is that it doesn't occur on
->> the Xen platform.
-> It's a race, it may just be that RTC PIO is faster in Xen because it's
-> implemented in the hypervisor.
-
-No, in Xen it does not have such a problem, because Xen injects the RTC irq
-without checking whether its previous irq has been cleared or not, while we
-do have such a check in KVM:
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-                          int irq_level, bool line_status)
-{
-    ... ...
-    if (!irq_level) {
-        ioapic->irr &= ~mask;  /* clear the RTC irq in irr, or we can not inject the RTC irq */
-        ret = 1;
-        goto out;
-    }
-
-I agree that we should move the operation of clearing the RTC irq from
-cmos_ioport_read() to cmos_ioport_write() to ensure the action is done.
-
-Thanks,
-Hailiang
-
-> I will try reporting it to Microsoft.
->
-> Thanks,
->
-> Paolo
-
-> Thanks,
-> Gonglei
-*From:* Paolo Bonzini
-*To:* Gonglei, Zhanghailiang, qemu-devel, Michael S.
Tsirkin
-*Cc:* Huangweidong, wangxin, Xiexiangyou
-*Date:* 2017-12-02 01:10:08
-*Subject:* Re: [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-On 01/12/2017 08:08, Gonglei (Arei) wrote:
-> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->       CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->       CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
-> vcpu0 read 0x6 because cmos_index is 0x6 now:
->       CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-> vcpu1 read 0x6:
->       CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-
-This seems to be a Windows bug. The easiest workaround that I
-can think of is to clear the interrupts already when 0xc is written,
-without waiting for the read (because REG_C can only be read).
-
-What do you think?
-
-Thanks,
-
-Paolo
-.
-
diff --git a/results/classifier/02/other/81775929 b/results/classifier/02/other/81775929
deleted file mode 100644
index 265dbe7f3..000000000
--- a/results/classifier/02/other/81775929
+++ /dev/null
@@ -1,236 +0,0 @@
-other: 0.877
-semantic: 0.825
-instruction: 0.816
-mistranslation: 0.811
-boot: 0.742
-
-[Qemu-devel] [BUG] Monitor QMP is broken ?
-
-Hello!
-
-I have updated my qemu to the recent version and it seems to have lost
-compatibility with libvirt. The error message is:
---- cut ---
-internal error: unable to execute QEMU command 'qmp_capabilities': QMP input
-object member 'id' is unexpected
---- cut ---
-What does it mean? Is it intentional or not?
-
-Kind regards,
-Pavel Fedin
-Expert Engineer
-Samsung Electronics Research center Russia
-
-Hello!
-
-> I have updated my qemu to the recent version and it seems to have lost
-> compatibility with libvirt. The error message is:
-> --- cut ---
-> internal error: unable to execute QEMU command 'qmp_capabilities': QMP input
-> object member 'id' is unexpected
-> --- cut ---
-> What does it mean? Is it intentional or not?
-I have found the problem. It is caused by commit
-65207c59d99f2260c5f1d3b9c491146616a522aa. libvirt does not seem to use the
-removed asynchronous interface, but it still feeds in JSONs with the 'id'
-field set to something. So I think the related fragment in the
-qmp_check_input_obj() function should be brought back.
-
-Kind regards,
-Pavel Fedin
-Expert Engineer
-Samsung Electronics Research center Russia
-
-On Fri, Jun 05, 2015 at 04:58:46PM +0300, Pavel Fedin wrote:
-> Hello!
->
->> I have updated my qemu to the recent version and it seems to have lost
->> compatibility with libvirt. The error message is:
->> --- cut ---
->> internal error: unable to execute QEMU command 'qmp_capabilities': QMP
->> input object member 'id' is unexpected
->> --- cut ---
->> What does it mean? Is it intentional or not?
->
-> I have found the problem. It is caused by commit
-> 65207c59d99f2260c5f1d3b9c491146616a522aa. libvirt does not seem to use the
-> removed asynchronous interface, but it still feeds in JSONs with the 'id'
-> field set to something. So I think the related fragment in the
-> qmp_check_input_obj() function should be brought back.
-
-If QMP is rejecting the 'id' parameter that is a regression bug.
-
-[quote]
-The QMP spec says
-
-2.3 Issuing Commands
---------------------
-
-The format for command execution is:
-
-{ "execute": json-string, "arguments": json-object, "id": json-value }
-
- Where,
-
-- The "execute" member identifies the command to be executed by the Server
-- The "arguments" member is used to pass any arguments required for the
-  execution of the command, it is optional when no arguments are
-  required. Each command documents what contents will be considered
-  valid when handling the json-argument
-- The "id" member is a transaction identification associated with the
-  command execution, it is optional and will be part of the response if
-  provided.
The "id" member can be any json-value, although most - clients merely use a json-number incremented for each successive - command - - -2.4 Commands Responses ----------------------- - -There are two possible responses which the Server will issue as the result -of a command execution: success or error. - -2.4.1 success -------------- - -The format of a success response is: - -{ "return": json-value, "id": json-value } - - Where, - -- The "return" member contains the data returned by the command, which - is defined on a per-command basis (usually a json-object or - json-array of json-objects, but sometimes a json-number, json-string, - or json-array of json-strings); it is an empty json-object if the - command does not return data -- The "id" member contains the transaction identification associated - with the command execution if issued by the Client - -[/quote] - -And as such, libvirt chose to /always/ send an 'id' parameter in all -commands it issues. - -We don't however validate the id in the reply, though arguably we -should have done so. - -Regards, -Daniel --- -|: -http://berrange.com --o- -http://www.flickr.com/photos/dberrange/ -:| -|: -http://libvirt.org --o- -http://virt-manager.org -:| -|: -http://autobuild.org --o- -http://search.cpan.org/~danberr/ -:| -|: -http://entangle-photo.org --o- -http://live.gnome.org/gtk-vnc -:| - -"Daniel P. Berrange" <address@hidden> writes: - -> -On Fri, Jun 05, 2015 at 04:58:46PM +0300, Pavel Fedin wrote: -> -> Hello! -> -> -> -> > I have updated my qemu to the recent version and it seems to have -> -> > lost compatibility -> -> with -> -> > libvirt. The error message is: -> -> > --- cut --- -> -> > internal error: unable to execute QEMU command 'qmp_capabilities': -> -> > QMP input object -> -> > member -> -> > 'id' is unexpected -> -> > --- cut --- -> -> > What does it mean? Is it intentional or not? -> -> -> -> I have found the problem. It is caused by commit -> -> 65207c59d99f2260c5f1d3b9c491146616a522aa. 
libvirt does not seem to -> -> use the removed -> -> asynchronous interface but it still feeds in JSONs with 'id' field -> -> set to something. So i -> -> think the related fragment in qmp_check_input_obj() function should -> -> be brought back -> -> -If QMP is rejecting the 'id' parameter that is a regression bug. -It is definitely a regression, my fault, and I'll get it fixed a.s.a.p. - -[...] - diff --git a/results/classifier/02/other/85542195 b/results/classifier/02/other/85542195 deleted file mode 100644 index 5ac980df2..000000000 --- a/results/classifier/02/other/85542195 +++ /dev/null @@ -1,121 +0,0 @@ -other: 0.944 -semantic: 0.941 -instruction: 0.935 -boot: 0.932 -mistranslation: 0.907 - -[Qemu-devel] [Bug in qemu-system-ppc running Mac OS 9 on Windows 10] - -Hi all, - -I've been experiencing issues when installing Mac OS 9.x using -qemu-system-ppc.exe in Windows 10. After booting from CD image, -partitioning a fresh disk image often hangs Qemu. When using a -pre-partitioned disk image, the OS installation process halts -somewhere during the process. The issues can be resolved by setting -qemu-system-ppc.exe to run in Windows 7 compatibility mode. -AFAIK all Qemu builds for Windows since Mac OS 9 became available as -guest are affected. -The issue is reproducible by installing Qemu for Windows from Stephan -Weil on Windows 10 and boot/install Mac OS 9.x - -Best regards and thanks for looking into this, -Howard - -On Nov 25, 2016, at 9:26 AM, address@hidden wrote: -Hi all, - -I've been experiencing issues when installing Mac OS 9.x using -qemu-system-ppc.exe in Windows 10. After booting from CD image, -partitioning a fresh disk image often hangs Qemu. When using a -pre-partitioned disk image, the OS installation process halts -somewhere during the process. The issues can be resolved by setting -qemu-system-ppc.exe to run in Windows 7 compatibility mode. -AFAIK all Qemu builds for Windows since Mac OS 9 became available as -guest are affected. 
-The issue is reproducible by installing Qemu for Windows from Stephan -Weil on Windows 10 and boot/install Mac OS 9.x - -Best regards and thanks for looking into this, -Howard -I assume there was some kind of behavior change for some of the -Windows API between Windows 7 and Windows 10, that is my guess as to -why the compatibility mode works. Could you run 'make check' on your -system, once in Windows 7 and once in Windows 10. Maybe the tests -will tell us something. I'm hoping that one of the tests succeeds in -Windows 7 and fails in Windows 10. That would help us pinpoint what -the problem is. -What I mean by run in Windows 7 is set the mingw environment to run -in Windows 7 compatibility mode (if possible). If you have Windows 7 -on another partition you could boot from, that would be better. -Good luck. -p.s. use 'make check -k' to allow all the tests to run (even if one -or more of the tests fails). - -> -> Hi all, -> -> -> -> I've been experiencing issues when installing Mac OS 9.x using -> -> qemu-system-ppc.exe in Windows 10. After booting from CD image, -> -> partitioning a fresh disk image often hangs Qemu. When using a -> -> pre-partitioned disk image, the OS installation process halts -> -> somewhere during the process. The issues can be resolved by setting -> -> qemu-system-ppc.exe to run in Windows 7 compatibility mode. -> -> AFAIK all Qemu builds for Windows since Mac OS 9 became available as -> -> guest are affected. -> -> The issue is reproducible by installing Qemu for Windows from Stephan -> -> Weil on Windows 10 and boot/install Mac OS 9.x -> -> -> -> Best regards and thanks for looking into this, -> -> Howard -> -> -> -I assume there was some kind of behavior change for some of the Windows API -> -between Windows 7 and Windows 10, that is my guess as to why the -> -compatibility mode works. Could you run 'make check' on your system, once in -> -Windows 7 and once in Windows 10. Maybe the tests will tell us something. 
-> I'm hoping that one of the tests succeeds in Windows 7 and fails in Windows
-> 10. That would help us pinpoint what the problem is.
->
-> What I mean by run in Windows 7 is set the mingw environment to run in
-> Windows 7 compatibility mode (if possible). If you have Windows 7 on another
-> partition you could boot from, that would be better.
->
-> Good luck.
->
-> p.s. use 'make check -k' to allow all the tests to run (even if one or more
-> of the tests fails).
-
-Hi,
-
-Thank you for your suggestion, but I have no means to run the check you
-suggest. I cross-compile from Linux.
-
-Best regards,
-Howard
-
diff --git a/results/classifier/02/other/88225572 b/results/classifier/02/other/88225572
deleted file mode 100644
index 7e67a16f9..000000000
--- a/results/classifier/02/other/88225572
+++ /dev/null
@@ -1,2901 +0,0 @@
-other: 0.987
-semantic: 0.976
-instruction: 0.974
-boot: 0.969
-mistranslation: 0.942
-
-[BUG qemu 4.0] segfault when unplugging virtio-blk-pci device
-
-Hi,
-
-I'm using qemu 4.0 and hit a segfault when tearing down a kata sandbox; I
-think it's because an io completion hits a use-after-free when the device
-is already gone. Is this a known bug that has been fixed? (I went through
-the git log but didn't find anything obvious.)
-
-gdb backtrace is:
-
-Core was generated by `/usr/local/libexec/qemu-kvm -name
-sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
-Program terminated with signal 11, Segmentation fault.
-#0 object_get_class (obj=obj@entry=0x0) at -/usr/src/debug/qemu-4.0/qom/object.c:903 -903 return obj->class; -(gdb) bt -#0 object_get_class (obj=obj@entry=0x0) at -/usr/src/debug/qemu-4.0/qom/object.c:903 -#1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -  vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -#2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -  opaque=0x558a2f2fd420, ret=0) -  at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -#3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -  at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -#4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -  i1=<optimized out>) at /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -#5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -#6  0x00007fff9ed75780 in ?? () -#7  0x0000000000000000 in ?? () - -It seems like qemu was completing a discard/write_zero request, but -parent BusState was already freed & set to NULL. - -Do we need to drain all pending request before unrealizing virtio-blk -device? Like the following patch proposed? -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -If more info is needed, please let me know. - -Thanks, -Eryu - -On Tue, 31 Dec 2019 18:34:34 +0800 -Eryu Guan <address@hidden> wrote: - -> -Hi, -> -> -I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -think it's because io completion hits use-after-free when device is -> -already gone. Is this a known bug that has been fixed? (I went through -> -the git log but didn't find anything obvious). -> -> -gdb backtrace is: -> -> -Core was generated by `/usr/local/libexec/qemu-kvm -name -> -sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -Program terminated with signal 11, Segmentation fault. 
-> -#0 object_get_class (obj=obj@entry=0x0) at -> -/usr/src/debug/qemu-4.0/qom/object.c:903 -> -903 return obj->class; -> -(gdb) bt -> -#0 object_get_class (obj=obj@entry=0x0) at -> -/usr/src/debug/qemu-4.0/qom/object.c:903 -> -#1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -  vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -#2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -  opaque=0x558a2f2fd420, ret=0) -> -  at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -#3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -  at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -#4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -  i1=<optimized out>) at -> -/usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -#5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -#6  0x00007fff9ed75780 in ?? () -> -#7  0x0000000000000000 in ?? () -> -> -It seems like qemu was completing a discard/write_zero request, but -> -parent BusState was already freed & set to NULL. -> -> -Do we need to drain all pending request before unrealizing virtio-blk -> -device? Like the following patch proposed? -> -> -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> -If more info is needed, please let me know. -may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> -Thanks, -> -Eryu -> - -On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -On Tue, 31 Dec 2019 18:34:34 +0800 -> -Eryu Guan <address@hidden> wrote: -> -> -> Hi, -> -> -> -> I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> think it's because io completion hits use-after-free when device is -> -> already gone. Is this a known bug that has been fixed? (I went through -> -> the git log but didn't find anything obvious). 
-> -> -> -> gdb backtrace is: -> -> -> -> Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> Program terminated with signal 11, Segmentation fault. -> -> #0 object_get_class (obj=obj@entry=0x0) at -> -> /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> 903 return obj->class; -> -> (gdb) bt -> -> #0 object_get_class (obj=obj@entry=0x0) at -> -> /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> #1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> ->   vector=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> #2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> ->   opaque=0x558a2f2fd420, ret=0) -> ->   at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> #3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> ->   at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> #4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> ->   i1=<optimized out>) at -> -> /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> #5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> #6  0x00007fff9ed75780 in ?? () -> -> #7  0x0000000000000000 in ?? () -> -> -> -> It seems like qemu was completing a discard/write_zero request, but -> -> parent BusState was already freed & set to NULL. -> -> -> -> Do we need to drain all pending request before unrealizing virtio-blk -> -> device? Like the following patch proposed? -> -> -> -> -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> -> -> If more info is needed, please let me know. -> -> -may be this will help: -https://patchwork.kernel.org/patch/11213047/ -Yeah, this looks promising! I'll try it out (though it's a one-time -crash for me). Thanks! 
- -Eryu - -On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> On Tue, 31 Dec 2019 18:34:34 +0800 -> -> Eryu Guan <address@hidden> wrote: -> -> -> -> > Hi, -> -> > -> -> > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > think it's because io completion hits use-after-free when device is -> -> > already gone. Is this a known bug that has been fixed? (I went through -> -> > the git log but didn't find anything obvious). -> -> > -> -> > gdb backtrace is: -> -> > -> -> > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > Program terminated with signal 11, Segmentation fault. -> -> > #0 object_get_class (obj=obj@entry=0x0) at -> -> > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > 903 return obj->class; -> -> > (gdb) bt -> -> > #0 object_get_class (obj=obj@entry=0x0) at -> -> > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > #1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0, -> -> >   vector=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > #2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete ( -> -> >   opaque=0x558a2f2fd420, ret=0) -> -> >   at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > #3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> >   at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > #4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>, -> -> >   i1=<optimized out>) at -> -> > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > #5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > #6  0x00007fff9ed75780 in ?? () -> -> > #7  0x0000000000000000 in ?? () -> -> > -> -> > It seems like qemu was completing a discard/write_zero request, but -> -> > parent BusState was already freed & set to NULL. 
> > > Do we need to drain all pending requests before unrealizing the virtio-blk
> > > device, like the following patch proposed?
> > >
> > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
> > >
> > > If more info is needed, please let me know.
> >
> > may be this will help: https://patchwork.kernel.org/patch/11213047/
>
> Yeah, this looks promising! I'll try it out (though it's a one-time
> crash for me). Thanks!

After applying this patch, I don't see the original segfault and
backtrace, but I see this crash:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/libexec/qemu-kvm -name
sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, addr=0,
    val=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
1324        VirtIOPCIProxy *proxy = VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
Missing separate debuginfos, use: debuginfo-install glib2-2.42.2-5.1.alios7.x86_64
glibc-2.17-260.alios7.x86_64 libgcc-4.8.5-28.alios7.1.x86_64
libseccomp-2.3.1-3.alios7.x86_64 libstdc++-4.8.5-28.alios7.1.x86_64
numactl-libs-2.0.9-5.1.alios7.x86_64 pixman-0.32.6-3.1.alios7.x86_64
zlib-1.2.7-16.2.alios7.x86_64
(gdb) bt
#0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, addr=0,
    val=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
#1  0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>,
    addr=<optimized out>, value=<optimized out>, size=<optimized out>,
    shift=<optimized out>, mask=<optimized out>, attrs=...)
    at /usr/src/debug/qemu-4.0/memory.c:502
#2  0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0,
    value=value@entry=0x7fcdeab1b8a8, size=size@entry=2,
    access_size_min=<optimized out>, access_size_max=<optimized out>,
    access_fn=0x561216835ac0 <memory_region_write_accessor>, mr=0x56121846d340,
    attrs=...) at /usr/src/debug/qemu-4.0/memory.c:568
#3  0x0000561216837c66 in memory_region_dispatch_write (mr=mr@entry=0x56121846d340,
    addr=0, data=<optimized out>, size=2, attrs=attrs@entry=...)
    at /usr/src/debug/qemu-4.0/memory.c:1503
#4  0x00005612167e036f in flatview_write_continue (fv=fv@entry=0x56121852edd0,
    addr=addr@entry=841813602304, attrs=...,
    buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>,
    len=len@entry=2, addr1=<optimized out>, l=<optimized out>, mr=0x56121846d340)
    at /usr/src/debug/qemu-4.0/exec.c:3279
#5  0x00005612167e0506 in flatview_write (fv=0x56121852edd0, addr=841813602304,
    attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, len=2)
    at /usr/src/debug/qemu-4.0/exec.c:3318
#6  0x00005612167e4a1b in address_space_write (as=<optimized out>,
    addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>)
    at /usr/src/debug/qemu-4.0/exec.c:3408
#7  0x00005612167e4aa5 in address_space_rw (as=<optimized out>,
    addr=<optimized out>, attrs=..., attrs@entry=...,
    buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>,
    len=<optimized out>, is_write=<optimized out>)
    at /usr/src/debug/qemu-4.0/exec.c:3419
#8  0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00)
    at /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034
#9  0x000056121682255e in qemu_kvm_cpu_thread_fn (arg=arg@entry=0x56121849aa00)
    at /usr/src/debug/qemu-4.0/cpus.c:1281
#10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>)
    at /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502
#11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6

And I searched and found https://bugzilla.redhat.com/show_bug.cgi?id=1706759,
which has the same backtrace as above, and it seems commit 7bfde688fb1b
("virtio-blk: Add blk_drain() to virtio_blk_device_unrealize()") is to fix
this particular bug.

But I can still hit the bug even after applying the commit. Do I miss
anything?

Thanks,
Eryu

On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
> On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote:
> > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote:
> > > On Tue, 31 Dec 2019 18:34:34 +0800
> > > Eryu Guan <address@hidden> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
> > > > think it's because io completion hits use-after-free when device is
> > > > already gone. Is this a known bug that has been fixed? (I went through
> > > > the git log but didn't find anything obvious).
> > > >
> > > > gdb backtrace is:
> > > >
> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name
> > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
> > > > Program terminated with signal 11, Segmentation fault.
> > [...snip: quoted backtraces from earlier in the thread...]
> >
> > And I searched and found https://bugzilla.redhat.com/show_bug.cgi?id=1706759,
> > which has the same backtrace as above, and it seems commit 7bfde688fb1b
> > ("virtio-blk: Add blk_drain() to virtio_blk_device_unrealize()") is to fix
> > this particular bug.
> >
> > But I can still hit the bug even after applying the commit. Do I miss
> > anything?
Hi Eryu,
This backtrace seems to be caused by this bug (there were two bugs in
1706759): https://bugzilla.redhat.com/show_bug.cgi?id=1708480
Although the solution hasn't been tested on virtio-blk yet, you may
want to apply this patch:
https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
Let me know if this works.

Best regards, Julia Suvorova.

On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
> On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
> > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote:
> > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote:
> > > > On Tue, 31 Dec 2019 18:34:34 +0800
> > > > Eryu Guan <address@hidden> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
> > > > > think it's because io completion hits use-after-free when device is
> > > > > already gone. Is this a known bug that has been fixed? (I went through
> > > > > the git log but didn't find anything obvious).
> > > > >
> > > > > gdb backtrace is:
> > > > >
> > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name
> > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
> > > > > Program terminated with signal 11, Segmentation fault.
> > [...snip: quoted backtraces from earlier in the thread...]
> >
> > And I searched and found https://bugzilla.redhat.com/show_bug.cgi?id=1706759,
> > which has the same backtrace as above, and it seems commit
7bfde688fb1b ("virtio-blk: Add -> -> blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> bug. -> -> -> -> But I can still hit the bug even after applying the commit. Do I miss -> -> anything? -> -> -Hi Eryu, -> -This backtrace seems to be caused by this bug (there were two bugs in -> -1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -Although the solution hasn't been tested on virtio-blk yet, you may -> -want to apply this patch: -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -Let me know if this works. -Will try it out, thanks a lot! - -Eryu - -On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> -> -> On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > Eryu Guan <address@hidden> wrote: -> -> > > -> -> > > > Hi, -> -> > > > -> -> > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I -> -> > > > think it's because io completion hits use-after-free when device is -> -> > > > already gone. Is this a known bug that has been fixed? (I went through -> -> > > > the git log but didn't find anything obvious). -> -> > > > -> -> > > > gdb backtrace is: -> -> > > > -> -> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > Program terminated with signal 11, Segmentation fault. 
> > [...snip: quoted backtraces from earlier in the thread...]
> >
> > And I searched and found https://bugzilla.redhat.com/show_bug.cgi?id=1706759,
> > which has the same backtrace as above, and it seems commit
7bfde688fb1b ("virtio-blk: Add -> -> blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> bug. -> -> -> -> But I can still hit the bug even after applying the commit. Do I miss -> -> anything? -> -> -Hi Eryu, -> -This backtrace seems to be caused by this bug (there were two bugs in -> -1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -Although the solution hasn't been tested on virtio-blk yet, you may -> -want to apply this patch: -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -Let me know if this works. -Unfortunately, I still see the same segfault & backtrace after applying -commit 421afd2fe8dd ("virtio: reset region cache when on queue -deletion") - -Anything I can help to debug? - -Thanks, -Eryu - -On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > -> -> > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > Eryu Guan <address@hidden> wrote: -> -> > > > -> -> > > > > Hi, -> -> > > > > -> -> > > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, -> -> > > > > I -> -> > > > > think it's because io completion hits use-after-free when device is -> -> > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > through -> -> > > > > the git log but didn't find anything obvious). -> -> > > > > -> -> > > > > gdb backtrace is: -> -> > > > > -> -> > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > Program terminated with signal 11, Segmentation fault. 
> > > [...snip: quoted backtraces from earlier in the thread...]
> > >
> > > And I searched and found https://bugzilla.redhat.com/show_bug.cgi?id=1706759,
which has the same -> -> > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> > bug. -> -> > -> -> > But I can still hit the bug even after applying the commit. Do I miss -> -> > anything? -> -> -> -> Hi Eryu, -> -> This backtrace seems to be caused by this bug (there were two bugs in -> -> 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> Although the solution hasn't been tested on virtio-blk yet, you may -> -> want to apply this patch: -> -> -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> Let me know if this works. -> -> -Unfortunately, I still see the same segfault & backtrace after applying -> -commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -deletion") -> -> -Anything I can help to debug? -Please post the QEMU command-line and the QMP commands used to remove the -device. - -The backtrace shows a vcpu thread submitting a request. The device -seems to be partially destroyed. That's surprising because the monitor -and the vcpu thread should use the QEMU global mutex to avoid race -conditions. Maybe seeing the QMP commands will make it clearer...
- -Stefan - -On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > -> -> > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > -> -> > > > > > Hi, -> -> > > > > > -> -> > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > sandbox, I -> -> > > > > > think it's because io completion hits use-after-free when device -> -> > > > > > is -> -> > > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > > through -> -> > > > > > the git log but didn't find anything obvious). -> -> > > > > > -> -> > > > > > gdb backtrace is: -> -> > > > > > -> -> > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > Program terminated with signal 11, Segmentation fault.
-> -> > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > 903 return obj->class; -> -> > > > > > (gdb) bt -> -> > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > vector=<optimized out>) at -> -> > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > out>, -> -> > > > > > i1=<optimized out>) at -> -> > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > -> -> > > > > > It seems like qemu was completing a discard/write_zero request, -> -> > > > > > but -> -> > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > -> -> > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > virtio-blk -> -> > > > > > device? Like the following patch proposed? -> -> > > > > > -> -> > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > -> -> > > > > > If more info is needed, please let me know. -> -> > > > > -> -> > > > > may be this will help: -https://patchwork.kernel.org/patch/11213047/ -> -> > > > -> -> > > > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > > > crash for me). Thanks! 
-> -> > > -> -> > > After applying this patch, I don't see the original segfault and -> -> > > backtrace, but I see this crash -> -> > > -> -> > > [Thread debugging using libthread_db enabled] -> -> > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > Program terminated with signal 11, Segmentation fault. -> -> > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > 1324 VirtIOPCIProxy *proxy = -> -> > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > Missing separate debuginfos, use: debuginfo-install -> -> > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> > > (gdb) bt -> -> > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized -> -> > > out>, addr=<optimized out>, value=<optimized out>, size=<optimized -> -> > > out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at -> -> > > /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, -> -> > > value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, -> -> > > access_size_min=<optimized out>, access_size_max=<optimized out>, -> -> > > access_fn=0x561216835ac0 <memory_region_write_accessor>, -> -> > > mr=0x56121846d340, attrs=...)
-> -> > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > #4 0x00005612167e036f in flatview_write_continue -> -> > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> > > mr=0x56121846d340) -> -> > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized -> -> > > out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > len=<optimized out>, is_write=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) -> -> > > at /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> > > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > -> -> > > And I searched 
and found -> -> > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the same -> -> > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular -> -> > > bug. -> -> > > -> -> > > But I can still hit the bug even after applying the commit. Do I miss -> -> > > anything? -> -> > -> -> > Hi Eryu, -> -> > This backtrace seems to be caused by this bug (there were two bugs in -> -> > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > want to apply this patch: -> -> > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > Let me know if this works. -> -> -> -> Unfortunately, I still see the same segfault & backtrace after applying -> -> commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> deletion") -> -> -> -> Anything I can help to debug? -> -> -Please post the QEMU command-line and the QMP commands use to remove the -> -device. -It's a normal kata instance using virtio-fs as rootfs. 
- -/usr/local/libexec/qemu-kvm -name -sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ - -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ - -cpu host -qmp -unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait - \ - -qmp -unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait - \ - -m 2048M,slots=10,maxmem=773893M -device -pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ - -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -virtconsole,chardev=charconsole0,id=console0 \ - -chardev -socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait - \ - -device -virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 \ - -chardev -socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait - \ - -device nvdimm,id=nv0,memdev=mem0 -object -memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 - \ - -object rng-random,id=rng0,filename=/dev/urandom -device -virtio-rng,rng=rng0,romfile= \ - -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ - -chardev -socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait - \ - -chardev -socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock - \ - -device -vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M --netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ - -device -driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= - \ - -global 
kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults --nographic -daemonize \ - -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on --numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ - -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 -i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 -console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 -root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro -rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=96 -agent.use_vsock=false init=/usr/lib/systemd/systemd -systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service -systemd.mask=systemd-networkd.socket \ - -pidfile -/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -\ - -smp 1,cores=1,threads=1,sockets=96,maxcpus=96 - -QMP command to delete device (the device id is just an example, not the -one that caused the crash): - -"{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" - -which has been hot plugged by: -"{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -"{\"return\": {}}" -"{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -"{\"return\": {}}" - -> -> -The backtrace shows a vcpu thread submitting a request. The device -> -seems to be partially destroyed. That's surprising because the monitor -> -and the vcpu thread should use the QEMU global mutex to avoid race -> -conditions. Maybe seeing the QMP commands will make it clearer... -> -> -Stefan -Thanks!
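[Editor's note: the escaped QMP strings quoted above are hard to read. The following standalone Python sketch rebuilds the same three payloads as plain dicts and prints them as JSON; it is only a readability aid using the example IDs from this message, and it does not talk to a running QEMU instance.]

```python
import json

# The hotplug/unplug sequence from the message above, as plain dicts.
# IDs ("drive-5967abfb917c8da6" etc.) are the example ones quoted in
# this thread, not the device that actually crashed.
blockdev_add = {
    "execute": "blockdev-add",
    "arguments": {
        "driver": "raw",
        "cache": {"direct": True, "no-flush": False},
        "file": {"driver": "file", "filename": "/dev/dm-18"},
        "node-name": "drive-5967abfb917c8da6",
    },
}

device_add = {
    "execute": "device_add",
    "arguments": {
        "driver": "virtio-blk-pci",
        "id": "virtio-drive-5967abfb917c8da6",
        "drive": "drive-5967abfb917c8da6",
        "bus": "pci-bridge-0",
        "addr": "01",
        "romfile": "",
        "share-rw": "on",
    },
}

device_del = {
    "execute": "device_del",
    "arguments": {"id": "virtio-drive-5967abfb917c8da6"},
}

# Print in the order they were issued: add backend, plug device, unplug.
for cmd in (blockdev_add, device_add, device_del):
    print(json.dumps(cmd))
```

The device_del at the end is the command that kicks off the hot-unplug path where the use-after-free discussed in this thread shows up.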
- -Eryu - -On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote: -> -On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -> On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > > -> -> > > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > > -> -> > > > > > > Hi, -> -> > > > > > > -> -> > > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > > sandbox, I -> -> > > > > > > think it's because io completion hits use-after-free when -> -> > > > > > > device is -> -> > > > > > > already gone. Is this a known bug that has been fixed? (I went -> -> > > > > > > through -> -> > > > > > > the git log but didn't find anything obvious). -> -> > > > > > > -> -> > > > > > > gdb backtrace is: -> -> > > > > > > -> -> > > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > > Program terminated with signal 11, Segmentation fault. 
-> -> > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > 903 return obj->class; -> -> > > > > > > (gdb) bt -> -> > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > > vector=<optimized out>) at -> -> > > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420) -> -> > > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > > out>, -> -> > > > > > > i1=<optimized out>) at -> -> > > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > > -> -> > > > > > > It seems like qemu was completing a discard/write_zero request, -> -> > > > > > > but -> -> > > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > > -> -> > > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > > virtio-blk -> -> > > > > > > device? Like the following patch proposed? -> -> > > > > > > -> -> > > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > > -> -> > > > > > > If more info is needed, please let me know. 
-> -> > > > > > -> -> > > > > > may be this will help: -> -> > > > > > -https://patchwork.kernel.org/patch/11213047/ -> -> > > > > -> -> > > > > Yeah, this looks promising! I'll try it out (though it's a one-time -> -> > > > > crash for me). Thanks! -> -> > > > -> -> > > > After applying this patch, I don't see the original segfaut and -> -> > > > backtrace, but I see this crash -> -> > > > -> -> > > > [Thread debugging using libthread_db enabled] -> -> > > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > > Program terminated with signal 11, Segmentation fault. -> -> > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > 1324 VirtIOPCIProxy *proxy = -> -> > > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > > Missing separate debuginfos, use: debuginfo-install -> -> > > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > > libstdc++-4.8.5-28.alios7.1.x86_64 -> -> > > > numactl-libs-2.0.9-5.1.alios7.x86_64 pixman-0.32.6-3.1.alios7.x86_64 -> -> > > > zlib-1.2.7-16.2.alios7.x86_64 -> -> > > > (gdb) bt -> -> > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, size=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized -> -> > > > out>, addr=<optimized out>, value=<optimized out>, size=<optimized -> -> > > > out>, shift=<optimized out>, mask=<optimized out>, attrs=...) 
at -> -> > > > /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > > #2 0x0000561216833c5d in access_with_adjusted_size -> -> > > > (addr=addr@entry=0, value=value@entry=0x7fcdeab1b8a8, -> -> > > > size=size@entry=2, access_size_min=<optimized out>, -> -> > > > access_size_max=<optimized out>, access_fn=0x561216835ac0 -> -> > > > <memory_region_write_accessor>, mr=0x56121846d340, attrs=...) -> -> > > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > > #4 0x00005612167e036f in flatview_write_continue -> -> > > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=..., -> -> > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > > len=len@entry=2, addr1=<optimized out>, l=<optimized out>, -> -> > > > mr=0x56121846d340) -> -> > > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized -> -> > > > out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, -> -> > > > len=<optimized out>, is_write=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) -> -> > > > at /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > > #9 
0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at -> -> > > > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0 -> -> > > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > > -> -> > > > And I searched and found -> -> > > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the -> -> > > > same -> -> > > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add -> -> > > > blk_drain() to virtio_blk_device_unrealize()") is to fix this -> -> > > > particular -> -> > > > bug. -> -> > > > -> -> > > > But I can still hit the bug even after applying the commit. Do I miss -> -> > > > anything? -> -> > > -> -> > > Hi Eryu, -> -> > > This backtrace seems to be caused by this bug (there were two bugs in -> -> > > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > > want to apply this patch: -> -> > > -> -> > > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > > Let me know if this works. -> -> > -> -> > Unfortunately, I still see the same segfault & backtrace after applying -> -> > commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> > deletion") -> -> > -> -> > Anything I can help to debug? -> -> -> -> Please post the QEMU command-line and the QMP commands use to remove the -> -> device. -> -> -It's a normal kata instance using virtio-fs as rootfs. 
-> -> -/usr/local/libexec/qemu-kvm -name -> -sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ -> --uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -> -q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ -> --cpu host -qmp -> -unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -\ -> --qmp -> -unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -\ -> --m 2048M,slots=10,maxmem=773893M -device -> -pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ -> --device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -> -virtconsole,chardev=charconsole0,id=console0 \ -> --chardev -> -socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait -> -\ -> --device -> -virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 \ -> --chardev -> -socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait -> -\ -> --device nvdimm,id=nv0,memdev=mem0 -object -> -memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 -> -\ -> --object rng-random,id=rng0,filename=/dev/urandom -device -> -virtio-rng,rng=rng0,romfile= \ -> --device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ -> --chardev -> -socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait -> -\ -> --chardev -> -socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock -> -\ -> --device -> -vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M -> --netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ -> --device -> 
-driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= -> -\ -> --global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -> --nodefaults -nographic -daemonize \ -> --object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -> --numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ -> --append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 -> -i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k -> -console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 -> -pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro -> -ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=96 -> -agent.use_vsock=false init=/usr/lib/systemd/systemd -> -systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service -> -systemd.mask=systemd-networkd.socket \ -> --pidfile -> -/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -> -\ -> --smp 1,cores=1,threads=1,sockets=96,maxcpus=96 -> -> -QMP command to delete device (the device id is just an example, not the -> -one caused the crash): -> -> -"{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" -> -> -which has been hot plugged by: -> -"{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -> -"{\"return\": {}}" -> -"{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -> -"{\"return\": {}}" -Thanks. I wasn't able to reproduce this crash with qemu.git/master. 
- -One thing that is strange about the latest backtrace you posted: QEMU is -dispatching the memory access instead of using the ioeventfd code that -virtio-blk-pci normally takes when a virtqueue is notified. I -guess this means ioeventfd has already been disabled due to the hot -unplug. - -Could you try with machine type "i440fx" instead of "q35"? I wonder if -pci-bridge/shpc is part of the problem. - -Stefan - -On Tue, Jan 14, 2020 at 04:16:24PM +0000, Stefan Hajnoczi wrote: -> -On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote: -> -> On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote: -> -> > On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote: -> -> > > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote: -> -> > > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote: -> -> > > > > -> -> > > > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote: -> -> > > > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote: -> -> > > > > > > On Tue, 31 Dec 2019 18:34:34 +0800 -> -> > > > > > > Eryu Guan <address@hidden> wrote: -> -> > > > > > > -> -> > > > > > > > Hi, -> -> > > > > > > > -> -> > > > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata -> -> > > > > > > > sandbox, I -> -> > > > > > > > think it's because io completion hits use-after-free when -> -> > > > > > > > device is -> -> > > > > > > > already gone. Is this a known bug that has been fixed? (I -> -> > > > > > > > went through -> -> > > > > > > > the git log but didn't find anything obvious). -> -> > > > > > > > -> -> > > > > > > > gdb backtrace is: -> -> > > > > > > > -> -> > > > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'. -> -> > > > > > > > Program terminated with signal 11, Segmentation fault.
-> -> > > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > > 903 return obj->class; -> -> > > > > > > > (gdb) bt -> -> > > > > > > > #0 object_get_class (obj=obj@entry=0x0) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903 -> -> > > > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector -> -> > > > > > > > (vdev=0x558a2e7751d0, -> -> > > > > > > > vector=<optimized out>) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118 -> -> > > > > > > > #2 0x0000558a2bfdcb1e in -> -> > > > > > > > virtio_blk_discard_write_zeroes_complete ( -> -> > > > > > > > opaque=0x558a2f2fd420, ret=0) -> -> > > > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186 -> -> > > > > > > > #3 0x0000558a2c261c7e in blk_aio_complete -> -> > > > > > > > (acb=0x558a2eed7420) -> -> > > > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305 -> -> > > > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized -> -> > > > > > > > out>, -> -> > > > > > > > i1=<optimized out>) at -> -> > > > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116 -> -> > > > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6 -> -> > > > > > > > #6 0x00007fff9ed75780 in ?? () -> -> > > > > > > > #7 0x0000000000000000 in ?? () -> -> > > > > > > > -> -> > > > > > > > It seems like qemu was completing a discard/write_zero -> -> > > > > > > > request, but -> -> > > > > > > > parent BusState was already freed & set to NULL. -> -> > > > > > > > -> -> > > > > > > > Do we need to drain all pending request before unrealizing -> -> > > > > > > > virtio-blk -> -> > > > > > > > device? Like the following patch proposed? -> -> > > > > > > > -> -> > > > > > > > -https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html -> -> > > > > > > > -> -> > > > > > > > If more info is needed, please let me know. 
-> -> > > > > > > -> -> > > > > > > may be this will help: -> -> > > > > > > -https://patchwork.kernel.org/patch/11213047/ -> -> > > > > > -> -> > > > > > Yeah, this looks promising! I'll try it out (though it's a -> -> > > > > > one-time -> -> > > > > > crash for me). Thanks! -> -> > > > > -> -> > > > > After applying this patch, I don't see the original segfaut and -> -> > > > > backtrace, but I see this crash -> -> > > > > -> -> > > > > [Thread debugging using libthread_db enabled] -> -> > > > > Using host libthread_db library "/lib64/libthread_db.so.1". -> -> > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name -> -> > > > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'. -> -> > > > > Program terminated with signal 11, Segmentation fault. -> -> > > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, -> -> > > > > size=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > > 1324 VirtIOPCIProxy *proxy = -> -> > > > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent); -> -> > > > > Missing separate debuginfos, use: debuginfo-install -> -> > > > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 -> -> > > > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 -> -> > > > > libstdc++-4.8.5-28.alios7.1.x86_64 -> -> > > > > numactl-libs-2.0.9-5.1.alios7.x86_64 -> -> > > > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64 -> -> > > > > (gdb) bt -> -> > > > > #0 0x0000561216a57609 in virtio_pci_notify_write -> -> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>, -> -> > > > > size=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324 -> -> > > > > #1 0x0000561216835b22 in memory_region_write_accessor -> -> > > > > (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, -> -> > > > > size=<optimized out>, shift=<optimized out>, mask=<optimized out>, -> -> > > > > 
attrs=...) at /usr/src/debug/qemu-4.0/memory.c:502 -> -> > > > > #2 0x0000561216833c5d in access_with_adjusted_size -> -> > > > > (addr=addr@entry=0, value=value@entry=0x7fcdeab1b8a8, -> -> > > > > size=size@entry=2, access_size_min=<optimized out>, -> -> > > > > access_size_max=<optimized out>, access_fn=0x561216835ac0 -> -> > > > > <memory_region_write_accessor>, mr=0x56121846d340, attrs=...) -> -> > > > > at /usr/src/debug/qemu-4.0/memory.c:568 -> -> > > > > #3 0x0000561216837c66 in memory_region_dispatch_write -> -> > > > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, -> -> > > > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503 -> -> > > > > #4 0x00005612167e036f in flatview_write_continue -> -> > > > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, -> -> > > > > attrs=..., buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out -> -> > > > > of bounds>, len=len@entry=2, addr1=<optimized out>, l=<optimized -> -> > > > > out>, mr=0x56121846d340) -> -> > > > > at /usr/src/debug/qemu-4.0/exec.c:3279 -> -> > > > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0, -> -> > > > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address -> -> > > > > 0x7fce7dd97028 out of bounds>, len=2) at -> -> > > > > /usr/src/debug/qemu-4.0/exec.c:3318 -> -> > > > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>, -> -> > > > > addr=<optimized out>, attrs=..., buf=<optimized out>, -> -> > > > > len=<optimized out>) at /usr/src/debug/qemu-4.0/exec.c:3408 -> -> > > > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>, -> -> > > > > addr=<optimized out>, attrs=..., attrs@entry=..., -> -> > > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of -> -> > > > > bounds>, len=<optimized out>, is_write=<optimized out>) at -> -> > > > > /usr/src/debug/qemu-4.0/exec.c:3419 -> -> > > > > #8 0x0000561216849da1 in kvm_cpu_exec -> -> > > > > (cpu=cpu@entry=0x56121849aa00) at -> -> > > > > 
/usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034 -> -> > > > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn -> -> > > > > (arg=arg@entry=0x56121849aa00) at -> -> > > > > /usr/src/debug/qemu-4.0/cpus.c:1281 -> -> > > > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) -> -> > > > > at /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502 -> -> > > > > #11 0x00007fce7bef6e25 in start_thread () from -> -> > > > > /lib64/libpthread.so.0 -> -> > > > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6 -> -> > > > > -> -> > > > > And I searched and found -> -> > > > > -https://bugzilla.redhat.com/show_bug.cgi?id=1706759 -, which has the -> -> > > > > same -> -> > > > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: -> -> > > > > Add -> -> > > > > blk_drain() to virtio_blk_device_unrealize()") is to fix this -> -> > > > > particular -> -> > > > > bug. -> -> > > > > -> -> > > > > But I can still hit the bug even after applying the commit. Do I -> -> > > > > miss -> -> > > > > anything? -> -> > > > -> -> > > > Hi Eryu, -> -> > > > This backtrace seems to be caused by this bug (there were two bugs in -> -> > > > 1706759): -https://bugzilla.redhat.com/show_bug.cgi?id=1708480 -> -> > > > Although the solution hasn't been tested on virtio-blk yet, you may -> -> > > > want to apply this patch: -> -> > > > -> -> > > > -https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html -> -> > > > Let me know if this works. -> -> > > -> -> > > Unfortunately, I still see the same segfault & backtrace after applying -> -> > > commit 421afd2fe8dd ("virtio: reset region cache when on queue -> -> > > deletion") -> -> > > -> -> > > Anything I can help to debug? -> -> > -> -> > Please post the QEMU command-line and the QMP commands use to remove the -> -> > device. -> -> -> -> It's a normal kata instance using virtio-fs as rootfs. 
-> -> -> -> /usr/local/libexec/qemu-kvm -name -> -> sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \ -> -> -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine -> -> q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \ -> -> -cpu host -qmp -> -> unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -> \ -> -> -qmp -> -> unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait -> -> \ -> -> -m 2048M,slots=10,maxmem=773893M -device -> -> pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \ -> -> -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device -> -> virtconsole,chardev=charconsole0,id=console0 \ -> -> -chardev -> -> socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait -> -> \ -> -> -device -> -> virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 -> -> \ -> -> -chardev -> -> socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait -> -> \ -> -> -device nvdimm,id=nv0,memdev=mem0 -object -> -> memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 -> -> \ -> -> -object rng-random,id=rng0,filename=/dev/urandom -device -> -> virtio-rng,rng=rng0,romfile= \ -> -> -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \ -> -> -chardev -> -> socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait -> -> \ -> -> -chardev -> -> socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock -> -> \ -> -> -device -> -> vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M -> -> -netdev 
tap,id=network-0,vhost=on,vhostfds=3,fds=4 \ -> -> -device -> -> driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= -> -> \ -> -> -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -> -> -nodefaults -nographic -daemonize \ -> -> -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -> -> -numa node,memdev=dimm1 -kernel /usr/local/share/kernel \ -> -> -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 -> -> i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp -> -> reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests -> -> net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 -> -> rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 quiet -> -> systemd.show_status=false panic=1 nr_cpus=96 agent.use_vsock=false -> -> init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target -> -> systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket \ -> -> -pidfile -> -> /run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid -> -> \ -> -> -smp 1,cores=1,threads=1,sockets=96,maxcpus=96 -> -> -> -> QMP command to delete device (the device id is just an example, not the -> -> one caused the crash): -> -> -> -> "{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}" -> -> -> -> which has been hot plugged by: -> -> "{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}" -> -> "{\"return\": {}}" -> -> "{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}" -> -> "{\"return\": {}}" -> -> -Thanks. 
I wasn't able to reproduce this crash with qemu.git/master.
->
->
-One thing that is strange about the latest backtrace you posted: QEMU is
->
-dispatching the memory access instead of using the ioeventfd code that
->
-virtio-blk-pci normally takes when a virtqueue is notified. I
->
-guess this means ioeventfd has already been disabled due to the hot
->
-unplug.
->
->
-Could you try with machine type "i440fx" instead of "q35"? I wonder if
->
-pci-bridge/shpc is part of the problem.
-Sure, will try it. But it may take some time, as the test bed is busy
-with other testing tasks. I'll report back once I get the results.
-
-Thanks,
-Eryu
-

diff --git a/results/classifier/02/other/88281850 b/results/classifier/02/other/88281850
deleted file mode 100644
index 4a745e007..000000000
--- a/results/classifier/02/other/88281850
+++ /dev/null
@@ -1,282 +0,0 @@
-other: 0.983
-instruction: 0.978
-semantic: 0.968
-boot: 0.967
-mistranslation: 0.948
-
-[Bug] Takes more than 150s to boot qemu on ARM64
-
-Hi all,
-I encountered an issue with kernel 5.19-rc1 on an ARM64 board: it takes
-about 150s between beginning to run the qemu command and beginning to boot
-the Linux kernel ("EFI stub: Booting Linux Kernel...").
-But in kernel 5.18-rc4, it only takes about 5s. I git-bisected the kernel
-code and it found c2445d387850 ("srcu: Add contention check to
-call_srcu() srcu_data ->lock acquisition").
-The qemu (qemu version is 6.2.92) command i run is : - -./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \ ---trace "kvm*" \ --cpu host \ --machine virt,accel=kvm,gic-version=3 \ --machine smp.cpus=2,smp.sockets=2 \ --no-reboot \ --nographic \ --monitor unix:/home/cx/qmp-test,server,nowait \ --bios /home/cx/boot/QEMU_EFI.fd \ --kernel /home/cx/boot/Image \ --device -pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1 -\ --device vfio-pci,host=7d:01.3,id=net0 \ --device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 \ --drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \ --append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \ --net none \ --D /home/cx/qemu_log.txt -I am not familiar with rcu code, and don't know how it causes the issue. -Do you have any idea about this issue? -Best Regard, - -Xiang Chen - -On Mon, Jun 13, 2022 at 08:26:34PM +0800, chenxiang (M) wrote: -> -Hi all, -> -> -I encounter a issue with kernel 5.19-rc1 on a ARM64 board: it takes about -> -150s between beginning to run qemu command and beginng to boot Linux kernel -> -("EFI stub: Booting Linux Kernel..."). -> -> -But in kernel 5.18-rc4, it only takes about 5s. I git bisect the kernel code -> -and it finds c2445d387850 ("srcu: Add contention check to call_srcu() -> -srcu_data ->lock acquisition"). 
-> -> -The qemu (qemu version is 6.2.92) command i run is : -> -> -./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \ -> ---trace "kvm*" \ -> --cpu host \ -> --machine virt,accel=kvm,gic-version=3 \ -> --machine smp.cpus=2,smp.sockets=2 \ -> --no-reboot \ -> --nographic \ -> --monitor unix:/home/cx/qmp-test,server,nowait \ -> --bios /home/cx/boot/QEMU_EFI.fd \ -> --kernel /home/cx/boot/Image \ -> --device -> -pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1 -> -\ -> --device vfio-pci,host=7d:01.3,id=net0 \ -> --device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 \ -> --drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \ -> --append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \ -> --net none \ -> --D /home/cx/qemu_log.txt -> -> -I am not familiar with rcu code, and don't know how it causes the issue. Do -> -you have any idea about this issue? -Please see the discussion here: -https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/ -Though that report requires ACPI to be forced on to get the -delay, which results in more than 9,000 back-to-back calls to -synchronize_srcu_expedited(). I cannot reproduce this on my setup, even -with an artificial tight loop invoking synchronize_srcu_expedited(), -but then again I don't have ARM hardware. - -My current guess is that the following patch, but with larger values for -SRCU_MAX_NODELAY_PHASE. Here "larger" might well be up in the hundreds, -or perhaps even larger. - -If you get a chance to experiment with this, could you please reply -to the discussion at the above URL? (Or let me know, and I can CC -you on the next message in that thread.) 
-
- Thanx, Paul
-
-------------------------------------------------------------------------
-
-diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
-index 50ba70f019dea..0db7873f4e95b 100644
---- a/kernel/rcu/srcutree.c
-+++ b/kernel/rcu/srcutree.c
-@@ -513,7 +513,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
-
- #define SRCU_INTERVAL 1 // Base delay if no expedited GPs
-pending.
- #define SRCU_MAX_INTERVAL 10 // Maximum incremental delay from slow
-readers.
--#define SRCU_MAX_NODELAY_PHASE 1 // Maximum per-GP-phase consecutive
-no-delay instances.
-+#define SRCU_MAX_NODELAY_PHASE 3 // Maximum per-GP-phase consecutive
-no-delay instances.
- #define SRCU_MAX_NODELAY 100 // Maximum consecutive no-delay
-instances.
-
- /*
-@@ -522,16 +522,22 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
- */
- static unsigned long srcu_get_delay(struct srcu_struct *ssp)
- {
-+ unsigned long gpstart;
-+ unsigned long j;
- unsigned long jbase = SRCU_INTERVAL;
-
- if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq),
-READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
- jbase = 0;
-- if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)))
-- jbase += jiffies - READ_ONCE(ssp->srcu_gp_start);
-- if (!jbase) {
-- WRITE_ONCE(ssp->srcu_n_exp_nodelay,
-READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
-- if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
-- jbase = 1;
-+ if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
-+ j = jiffies - 1;
-+ gpstart = READ_ONCE(ssp->srcu_gp_start);
-+ if (time_after(j, gpstart))
-+ jbase += j - gpstart;
-+ if (!jbase) {
-+ WRITE_ONCE(ssp->srcu_n_exp_nodelay,
-READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
-+ if (READ_ONCE(ssp->srcu_n_exp_nodelay) >
-SRCU_MAX_NODELAY_PHASE)
-+ jbase = 1;
-+ }
- }
- return jbase > SRCU_MAX_INTERVAL ? SRCU_MAX_INTERVAL : jbase;
- }
-
-On 2022/6/13 21:22, Paul E. 
McKenney wrote:
-On Mon, Jun 13, 2022 at 08:26:34PM +0800, chenxiang (M) wrote:
-Hi all,
-
-I encounter a issue with kernel 5.19-rc1 on a ARM64 board: it takes about
-150s between beginning to run qemu command and beginng to boot Linux kernel
-("EFI stub: Booting Linux Kernel...").
-
-But in kernel 5.18-rc4, it only takes about 5s. I git bisect the kernel code
-and it finds c2445d387850 ("srcu: Add contention check to call_srcu()
-srcu_data ->lock acquisition").
-
-The qemu (qemu version is 6.2.92) command i run is :
-
-./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \
---trace "kvm*" \
--cpu host \
--machine virt,accel=kvm,gic-version=3 \
--machine smp.cpus=2,smp.sockets=2 \
--no-reboot \
--nographic \
--monitor unix:/home/cx/qmp-test,server,nowait \
--bios /home/cx/boot/QEMU_EFI.fd \
--kernel /home/cx/boot/Image \
--device
-pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1
-\
--device vfio-pci,host=7d:01.3,id=net0 \
--device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 \
--drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \
--append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \
--net none \
--D /home/cx/qemu_log.txt
-
-I am not familiar with rcu code, and don't know how it causes the issue. Do
-you have any idea about this issue?
-Please see the discussion here:
-https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
-Though that report requires ACPI to be forced on to get the
-delay, which results in more than 9,000 back-to-back calls to
-synchronize_srcu_expedited(). I cannot reproduce this on my setup, even
-with an artificial tight loop invoking synchronize_srcu_expedited(),
-but then again I don't have ARM hardware.
-
-My current guess is that the following patch, but with larger values for
-SRCU_MAX_NODELAY_PHASE. Here "larger" might well be up in the hundreds,
-or perhaps even larger. 
- -If you get a chance to experiment with this, could you please reply -to the discussion at the above URL? (Or let me know, and I can CC -you on the next message in that thread.) -Ok, thanks, i will reply it on above URL. -Thanx, Paul - ------------------------------------------------------------------------- - -diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c -index 50ba70f019dea..0db7873f4e95b 100644 ---- a/kernel/rcu/srcutree.c -+++ b/kernel/rcu/srcutree.c -@@ -513,7 +513,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp) -#define SRCU_INTERVAL 1 // Base delay if no expedited GPs pending. -#define SRCU_MAX_INTERVAL 10 // Maximum incremental delay from slow -readers. --#define SRCU_MAX_NODELAY_PHASE 1 // Maximum per-GP-phase consecutive -no-delay instances. -+#define SRCU_MAX_NODELAY_PHASE 3 // Maximum per-GP-phase consecutive -no-delay instances. - #define SRCU_MAX_NODELAY 100 // Maximum consecutive no-delay -instances. -/* -@@ -522,16 +522,22 @@ static bool srcu_readers_active(struct srcu_struct *ssp) - */ - static unsigned long srcu_get_delay(struct srcu_struct *ssp) - { -+ unsigned long gpstart; -+ unsigned long j; - unsigned long jbase = SRCU_INTERVAL; -if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp))) -jbase = 0; -- if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) -- jbase += jiffies - READ_ONCE(ssp->srcu_gp_start); -- if (!jbase) { -- WRITE_ONCE(ssp->srcu_n_exp_nodelay, -READ_ONCE(ssp->srcu_n_exp_nodelay) + 1); -- if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE) -- jbase = 1; -+ if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) { -+ j = jiffies - 1; -+ gpstart = READ_ONCE(ssp->srcu_gp_start); -+ if (time_after(j, gpstart)) -+ jbase += j - gpstart; -+ if (!jbase) { -+ WRITE_ONCE(ssp->srcu_n_exp_nodelay, -READ_ONCE(ssp->srcu_n_exp_nodelay) + 1); -+ if (READ_ONCE(ssp->srcu_n_exp_nodelay) > -SRCU_MAX_NODELAY_PHASE) -+ jbase = 1; -+ } - } - return jbase > SRCU_MAX_INTERVAL ? 
SRCU_MAX_INTERVAL : jbase; - } -. - diff --git a/results/classifier/02/other/92957605 b/results/classifier/02/other/92957605 deleted file mode 100644 index 5aaf36b15..000000000 --- a/results/classifier/02/other/92957605 +++ /dev/null @@ -1,419 +0,0 @@ -other: 0.997 -semantic: 0.995 -instruction: 0.993 -boot: 0.992 -mistranslation: 0.974 - -[Qemu-devel] Fwd: [BUG] Failed to compile using gcc7.1 - -Hi all, -I encountered the same problem on gcc 7.1.1 and found Qu's mail in -this list from google search. - -Temporarily fix it by specifying the string length in snprintf -directive. Hope this is helpful to other people encountered the same -problem. - -@@ -1,9 +1,7 @@ ---- ---- a/block/blkdebug.c -- "blkdebug:%s:%s", s->config_file ?: "", ---- a/block/blkverify.c -- "blkverify:%s:%s", ---- a/hw/usb/bus.c -- snprintf(downstream->path, sizeof(downstream->path), "%s.%d", -- snprintf(downstream->path, sizeof(downstream->path), "%d", portnr); --- -+++ b/block/blkdebug.c -+ "blkdebug:%.2037s:%.2037s", s->config_file ?: "", -+++ b/block/blkverify.c -+ "blkverify:%.2038s:%.2038s", -+++ b/hw/usb/bus.c -+ snprintf(downstream->path, sizeof(downstream->path), "%.12s.%d", -+ snprintf(downstream->path, sizeof(downstream->path), "%.12d", portnr); - -Tsung-en Hsiao - -> -Qu Wenruo Wrote: -> -> -Hi all, -> -> -After upgrading gcc from 6.3.1 to 7.1.1, qemu can't be compiled with gcc. 
-> -> -The error is: -> -> ------- -> -CC block/blkdebug.o -> -block/blkdebug.c: In function 'blkdebug_refresh_filename': -> -> -block/blkdebug.c:693:31: error: '%s' directive output may be truncated -> -writing up to 4095 bytes into a region of size 4086 -> -[-Werror=format-truncation=] -> -> -"blkdebug:%s:%s", s->config_file ?: "", -> -^~ -> -In file included from /usr/include/stdio.h:939:0, -> -from /home/adam/qemu/include/qemu/osdep.h:68, -> -from block/blkdebug.c:25: -> -> -/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output 11 -> -or more bytes (assuming 4106) into a destination of size 4096 -> -> -return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, -> -^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -> -__bos (__s), __fmt, __va_arg_pack ()); -> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -> -cc1: all warnings being treated as errors -> -make: *** [/home/adam/qemu/rules.mak:69: block/blkdebug.o] Error 1 -> ------- -> -> -It seems that gcc 7 is introducing more restrict check for printf. -> -> -If using clang, although there are some extra warning, it can at least pass -> -the compile. -> -> -Thanks, -> -Qu - -Hi Tsung-en, - -On 06/11/2017 04:08 PM, Tsung-en Hsiao wrote: -Hi all, -I encountered the same problem on gcc 7.1.1 and found Qu's mail in -this list from google search. - -Temporarily fix it by specifying the string length in snprintf -directive. Hope this is helpful to other people encountered the same -problem. -Thank your for sharing this. -@@ -1,9 +1,7 @@ ---- ---- a/block/blkdebug.c -- "blkdebug:%s:%s", s->config_file ?: "", ---- a/block/blkverify.c -- "blkverify:%s:%s", ---- a/hw/usb/bus.c -- snprintf(downstream->path, sizeof(downstream->path), "%s.%d", -- snprintf(downstream->path, sizeof(downstream->path), "%d", portnr); --- -+++ b/block/blkdebug.c -+ "blkdebug:%.2037s:%.2037s", s->config_file ?: "", -It is a rather funny way to silent this warning :) Truncating the -filename until it fits. 
-However I don't think it is the correct way since there is indeed an -overflow of bs->exact_filename. -Apparently exact_filename from "block/block_int.h" is defined to hold a -pathname: -char exact_filename[PATH_MAX]; -but is used for more than that (for example in blkdebug.c it might use -until 10+2*PATH_MAX chars). -I suppose it started as a buffer to hold a pathname then more block -drivers were added and this buffer ended used differently. -If it is a multi-purpose buffer one safer option might be to declare it -as a GString* and use g_string_printf(). -I CC'ed the block folks to have their feedback. - -Regards, - -Phil. -+++ b/block/blkverify.c -+ "blkverify:%.2038s:%.2038s", -+++ b/hw/usb/bus.c -+ snprintf(downstream->path, sizeof(downstream->path), "%.12s.%d", -+ snprintf(downstream->path, sizeof(downstream->path), "%.12d", portnr); - -Tsung-en Hsiao -Qu Wenruo Wrote: - -Hi all, - -After upgrading gcc from 6.3.1 to 7.1.1, qemu can't be compiled with gcc. - -The error is: - ------- - CC block/blkdebug.o -block/blkdebug.c: In function 'blkdebug_refresh_filename': - -block/blkdebug.c:693:31: error: '%s' directive output may be truncated writing -up to 4095 bytes into a region of size 4086 [-Werror=format-truncation=] - - "blkdebug:%s:%s", s->config_file ?: "", - ^~ -In file included from /usr/include/stdio.h:939:0, - from /home/adam/qemu/include/qemu/osdep.h:68, - from block/blkdebug.c:25: - -/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output 11 or -more bytes (assuming 4106) into a destination of size 4096 - - return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, - ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - __bos (__s), __fmt, __va_arg_pack ()); - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -cc1: all warnings being treated as errors -make: *** [/home/adam/qemu/rules.mak:69: block/blkdebug.o] Error 1 ------- - -It seems that gcc 7 is introducing more restrict check for printf. 
- -If using clang, although there are some extra warning, it can at least pass the -compile. - -Thanks, -Qu - -On 2017-06-12 05:19, Philippe Mathieu-Daudé wrote: -> -Hi Tsung-en, -> -> -On 06/11/2017 04:08 PM, Tsung-en Hsiao wrote: -> -> Hi all, -> -> I encountered the same problem on gcc 7.1.1 and found Qu's mail in -> -> this list from google search. -> -> -> -> Temporarily fix it by specifying the string length in snprintf -> -> directive. Hope this is helpful to other people encountered the same -> -> problem. -> -> -Thank your for sharing this. -> -> -> -> -> @@ -1,9 +1,7 @@ -> -> --- -> -> --- a/block/blkdebug.c -> -> - "blkdebug:%s:%s", s->config_file ?: "", -> -> --- a/block/blkverify.c -> -> - "blkverify:%s:%s", -> -> --- a/hw/usb/bus.c -> -> - snprintf(downstream->path, sizeof(downstream->path), "%s.%d", -> -> - snprintf(downstream->path, sizeof(downstream->path), "%d", -> -> portnr); -> -> -- -> -> +++ b/block/blkdebug.c -> -> + "blkdebug:%.2037s:%.2037s", s->config_file ?: "", -> -> -It is a rather funny way to silent this warning :) Truncating the -> -filename until it fits. -> -> -However I don't think it is the correct way since there is indeed an -> -overflow of bs->exact_filename. -> -> -Apparently exact_filename from "block/block_int.h" is defined to hold a -> -pathname: -> -char exact_filename[PATH_MAX]; -> -> -but is used for more than that (for example in blkdebug.c it might use -> -until 10+2*PATH_MAX chars). -In any case, truncating the filenames will do just as much as truncating -the result: You'll get an unusable filename. - -> -I suppose it started as a buffer to hold a pathname then more block -> -drivers were added and this buffer ended used differently. -> -> -If it is a multi-purpose buffer one safer option might be to declare it -> -as a GString* and use g_string_printf(). -What it is supposed to be now is just an information string we can print -to the user, because strings are nicer than JSON objects. 
There are some -commands that take a filename for identifying a block node, but I dream -we can get rid of them in 3.0... - -The right solution is to remove it altogether and have a -"char *bdrv_filename(BlockDriverState *bs)" function (which generates -the filename every time it's called). I've been working on this for some -years now, actually, but it was never pressing enough to get it finished -(so I never had enough time). - -What we can do in the meantime is to not generate a plain filename if it -won't fit into bs->exact_filename. - -(The easiest way to do this probably would be to truncate -bs->exact_filename back to an empty string if snprintf() returns a value -greater than or equal to the length of bs->exact_filename.) - -What to do about hw/usb/bus.c I don't know (I guess the best solution -would be to ignore the warning, but I don't suppose that is going to work). - -Max - -> -> -I CC'ed the block folks to have their feedback. -> -> -Regards, -> -> -Phil. -> -> -> +++ b/block/blkverify.c -> -> + "blkverify:%.2038s:%.2038s", -> -> +++ b/hw/usb/bus.c -> -> + snprintf(downstream->path, sizeof(downstream->path), "%.12s.%d", -> -> + snprintf(downstream->path, sizeof(downstream->path), "%.12d", -> -> portnr); -> -> -> -> Tsung-en Hsiao -> -> -> ->> Qu Wenruo Wrote: -> ->> -> ->> Hi all, -> ->> -> ->> After upgrading gcc from 6.3.1 to 7.1.1, qemu can't be compiled with -> ->> gcc. 
-> ->> -> ->> The error is: -> ->> -> ->> ------ -> ->> CC block/blkdebug.o -> ->> block/blkdebug.c: In function 'blkdebug_refresh_filename': -> ->> -> ->> block/blkdebug.c:693:31: error: '%s' directive output may be -> ->> truncated writing up to 4095 bytes into a region of size 4086 -> ->> [-Werror=format-truncation=] -> ->> -> ->> "blkdebug:%s:%s", s->config_file ?: "", -> ->> ^~ -> ->> In file included from /usr/include/stdio.h:939:0, -> ->> from /home/adam/qemu/include/qemu/osdep.h:68, -> ->> from block/blkdebug.c:25: -> ->> -> ->> /usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' -> ->> output 11 or more bytes (assuming 4106) into a destination of size 4096 -> ->> -> ->> return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, -> ->> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -> ->> __bos (__s), __fmt, __va_arg_pack ()); -> ->> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -> ->> cc1: all warnings being treated as errors -> ->> make: *** [/home/adam/qemu/rules.mak:69: block/blkdebug.o] Error 1 -> ->> ------ -> ->> -> ->> It seems that gcc 7 is introducing more restrict check for printf. -> ->> -> ->> If using clang, although there are some extra warning, it can at -> ->> least pass the compile. -> ->> -> ->> Thanks, -> ->> Qu -> -> -signature.asc -Description: -OpenPGP digital signature - diff --git a/results/classifier/02/other/95154278 b/results/classifier/02/other/95154278 deleted file mode 100644 index 14ac3cd34..000000000 --- a/results/classifier/02/other/95154278 +++ /dev/null @@ -1,156 +0,0 @@ -other: 0.953 -instruction: 0.938 -semantic: 0.937 -boot: 0.902 -mistranslation: 0.897 - -[Qemu-devel] [BUG] checkpatch.pl hangs on target/mips/msa_helper.c - -If checkpatch.pl is applied (using switch "-f") on file -target/mips/msa_helper.c, it will hang. 
- -There is a workaround for this particular file: - -These lines in msa_helper.c: - - uint## BITS ##_t S = _S, T = _T; \ - uint## BITS ##_t as, at, xs, xt, xd; \ - -should be replaced with: - - uint## BITS ## _t S = _S, T = _T; \ - uint## BITS ## _t as, at, xs, xt, xd; \ - -(a space is added after the second "##" in each line) - -The workaround is found by partial deleting and undeleting of the code in -msa_helper.c in binary search fashion. - -This workaround will soon be submitted by me as a patch within a series on misc -MIPS issues. - -I took a look at checkpatch.pl code, and it looks it is fairly complicated to -fix the issue, since it happens in the code segment involving intricate logic -conditions. - -Regards, -Aleksandar - -On Wed, Jul 04, 2018 at 03:35:18PM +0000, Aleksandar Markovic wrote: -> -If checkpatch.pl is applied (using switch "-f") on file -> -target/mips/msa_helper.c, it will hang. -> -> -There is a workaround for this particular file: -> -> -These lines in msa_helper.c: -> -> -uint## BITS ##_t S = _S, T = _T; \ -> -uint## BITS ##_t as, at, xs, xt, xd; \ -> -> -should be replaced with: -> -> -uint## BITS ## _t S = _S, T = _T; \ -> -uint## BITS ## _t as, at, xs, xt, xd; \ -> -> -(a space is added after the second "##" in each line) -> -> -The workaround is found by partial deleting and undeleting of the code in -> -msa_helper.c in binary search fashion. -> -> -This workaround will soon be submitted by me as a patch within a series on -> -misc MIPS issues. -> -> -I took a look at checkpatch.pl code, and it looks it is fairly complicated to -> -fix the issue, since it happens in the code segment involving intricate logic -> -conditions. -Thanks for figuring this out, Aleksandar. Not sure if anyone else has -the apetite to fix checkpatch.pl. 
- -Stefan -signature.asc -Description: -PGP signature - -On 07/11/2018 09:36 AM, Stefan Hajnoczi wrote: -> -On Wed, Jul 04, 2018 at 03:35:18PM +0000, Aleksandar Markovic wrote: -> -> If checkpatch.pl is applied (using switch "-f") on file -> -> target/mips/msa_helper.c, it will hang. -> -> -> -> There is a workaround for this particular file: -> -> -> -> These lines in msa_helper.c: -> -> -> -> uint## BITS ##_t S = _S, T = _T; \ -> -> uint## BITS ##_t as, at, xs, xt, xd; \ -> -> -> -> should be replaced with: -> -> -> -> uint## BITS ## _t S = _S, T = _T; \ -> -> uint## BITS ## _t as, at, xs, xt, xd; \ -> -> -> -> (a space is added after the second "##" in each line) -> -> -> -> The workaround is found by partial deleting and undeleting of the code in -> -> msa_helper.c in binary search fashion. -> -> -> -> This workaround will soon be submitted by me as a patch within a series on -> -> misc MIPS issues. -> -> -> -> I took a look at checkpatch.pl code, and it looks it is fairly complicated -> -> to fix the issue, since it happens in the code segment involving intricate -> -> logic conditions. -> -> -Thanks for figuring this out, Aleksandar. Not sure if anyone else has -> -the apetite to fix checkpatch.pl. -Anyone else but Paolo ;P -http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01250.html -signature.asc -Description: -OpenPGP digital signature - diff --git a/results/classifier/02/other/99674399 b/results/classifier/02/other/99674399 deleted file mode 100644 index 72912d57e..000000000 --- a/results/classifier/02/other/99674399 +++ /dev/null @@ -1,149 +0,0 @@ -other: 0.883 -instruction: 0.860 -mistranslation: 0.843 -semantic: 0.822 -boot: 0.822 - -[BUG] qemu crashes on assertion in cpu_asidx_from_attrs when cpu is in smm mode - -Hi all! - -First, I see this issue: -https://gitlab.com/qemu-project/qemu/-/issues/1198 -. 
-where some kvm/hardware failure leads to guest crash, and finally to this
-assertion:
-
- cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
-
-But in the ticket the talk is about the guest crash and fixing the kernel, not
-about the final QEMU assertion (which definitely shows that something should be
-fixed in QEMU code too).
-
-
-We've faced the same stack one time:
-
-(gdb) bt
-#0 raise () from /lib/x86_64-linux-gnu/libc.so.6
-#1 abort () from /lib/x86_64-linux-gnu/libc.so.6
-#2 ?? () from /lib/x86_64-linux-gnu/libc.so.6
-#3 __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
-#4 cpu_asidx_from_attrs at ../hw/core/cpu-sysemu.c:76
-#5 cpu_memory_rw_debug at ../softmmu/physmem.c:3529
-#6 x86_cpu_dump_state at ../target/i386/cpu-dump.c:560
-#7 kvm_cpu_exec at ../accel/kvm/kvm-all.c:3000
-#8 kvm_vcpu_thread_fn at ../accel/kvm/kvm-accel-ops.c:51
-#9 qemu_thread_start at ../util/qemu-thread-posix.c:505
-#10 start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
-#11 clone () from /lib/x86_64-linux-gnu/libc.so.6
-
-
-And what I see:
-
-static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs)
-{
- return !!attrs.secure;
-}
-
-int cpu_asidx_from_attrs(CPUState *cpu, MemTxAttrs attrs)
-{
- int ret = 0;
-
- if (cpu->cc->sysemu_ops->asidx_from_attrs) {
- ret = cpu->cc->sysemu_ops->asidx_from_attrs(cpu, attrs);
- assert(ret < cpu->num_ases && ret >= 0); <<<<<<<<<<<<<<<<<
- }
- return ret;
-}
-
-(gdb) p cpu->num_ases
-$3 = 1
-
-(gdb) fr 5
-#5 0x00005578c8814ba3 in cpu_memory_rw_debug (cpu=c...
-(gdb) p attrs
-$6 = {unspecified = 0, secure = 1, user = 0, memory = 0, requester_id = 0,
-byte_swap = 0, target_tlb_bit0 = 0, target_tlb_bit1 = 0, target_tlb_bit2 = 0}
-
-so .secure is 1, therefore ret is 1; at the same time num_ases is 1 too, and
-the assertion fails.
-
-
-
-Where is .secure from?
-
-static inline MemTxAttrs cpu_get_mem_attrs(CPUX86State *env)
-{
- return ((MemTxAttrs) { .secure = (env->hflags & HF_SMM_MASK) != 0 });
-}
-
-OK, it means we are in SMM mode.
-
-
-
-On the other hand, it seems that num_ases is always 1 for x86:
-
-vsementsov@vsementsov-lin:~/work/src/qemu/yc-7.2$ git grep 'num_ases = '
-cpu.c: cpu->num_ases = 0;
-softmmu/cpus.c: cpu->num_ases = 1;
-target/arm/cpu.c: cs->num_ases = 3 + has_secure;
-target/arm/cpu.c: cs->num_ases = 1 + has_secure;
-target/i386/tcg/sysemu/tcg-cpu.c: cs->num_ases = 2;
-
-
-So, something is wrong around cpu->num_ases and x86_asidx_from_attrs() which
-may return more in SMM mode.
-
-
-The stack starts in
-//7 0x00005578c882f539 in kvm_cpu_exec (cpu=cpu@entry=0x5578ca2eb340) at
-../accel/kvm/kvm-all.c:3000
- if (ret < 0) {
- cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
- vm_stop(RUN_STATE_INTERNAL_ERROR);
- }
-
-So that was some kvm error, and we decided to call cpu_dump_state(). And it
-crashes. cpu_dump_state() is also called from hmp_info_registers, so I can
-reproduce the crash with a tiny patch to master (as only the CPU_DUMP_CODE path
-calls cpu_memory_rw_debug(), as it is in kvm_cpu_exec()):
-
-diff --git a/monitor/hmp-cmds-target.c b/monitor/hmp-cmds-target.c
-index ff01cf9d8d..dcf0189048 100644
---- a/monitor/hmp-cmds-target.c
-+++ b/monitor/hmp-cmds-target.c
-@@ -116,7 +116,7 @@ void hmp_info_registers(Monitor *mon, const QDict *qdict)
- }
-
- monitor_printf(mon, "\nCPU#%d\n", cs->cpu_index);
-- cpu_dump_state(cs, NULL, CPU_DUMP_FPU);
-+ cpu_dump_state(cs, NULL, CPU_DUMP_CODE);
- }
- }
-
-
-Then run
-
-yes "info registers" | ./build/qemu-system-x86_64 -accel kvm -monitor stdio \
- -global driver=cfi.pflash01,property=secure,value=on \
- -blockdev "{'driver': 'file', 'filename':
-'/usr/share/OVMF/OVMF_CODE_4M.secboot.fd', 'node-name': 'ovmf-code', 'read-only':
- -static inline MemTxAttrs cpu_get_mem_attrs(CPUX86State *env) -{ - return ((MemTxAttrs) { .secure = (env->hflags & HF_SMM_MASK) != 0 }); -} - -Ok, it means we in SMM mode. - - - -On the other hand, it seems that num_ases seems to be always 1 for x86: - -vsementsov@vsementsov-lin:~/work/src/qemu/yc-7.2$ git grep 'num_ases = ' -cpu.c: cpu->num_ases = 0; -softmmu/cpus.c: cpu->num_ases = 1; -target/arm/cpu.c: cs->num_ases = 3 + has_secure; -target/arm/cpu.c: cs->num_ases = 1 + has_secure; -target/i386/tcg/sysemu/tcg-cpu.c: cs->num_ases = 2; - - -So, something is wrong around cpu->num_ases and x86_asidx_from_attrs() which -may return more in SMM mode. - - -The stack starts in -//7 0x00005578c882f539 in kvm_cpu_exec (cpu=cpu@entry=0x5578ca2eb340) at -../accel/kvm/kvm-all.c:3000 - if (ret < 0) { - cpu_dump_state(cpu, stderr, CPU_DUMP_CODE); - vm_stop(RUN_STATE_INTERNAL_ERROR); - } - -So that was some kvm error, and we decided to call cpu_dump_state(). And it -crashes. cpu_dump_state() is also called from hmp_info_registers, so I can -reproduce the crash with a tiny patch to master (as only CPU_DUMP_CODE path -calls cpu_memory_rw_debug(), as it is in kvm_cpu_exec()): - -diff --git a/monitor/hmp-cmds-target.c b/monitor/hmp-cmds-target.c -index ff01cf9d8d..dcf0189048 100644 ---- a/monitor/hmp-cmds-target.c -+++ b/monitor/hmp-cmds-target.c -@@ -116,7 +116,7 @@ void hmp_info_registers(Monitor *mon, const QDict *qdict) - } - - monitor_printf(mon, "\nCPU#%d\n", cs->cpu_index); -- cpu_dump_state(cs, NULL, CPU_DUMP_FPU); -+ cpu_dump_state(cs, NULL, CPU_DUMP_CODE); - } - } - - -Than run - -yes "info registers" | ./build/qemu-system-x86_64 -accel kvm -monitor stdio \ - -global driver=cfi.pflash01,property=secure,value=on \ - -blockdev "{'driver': 'file', 'filename': -'/usr/share/OVMF/OVMF_CODE_4M.secboot.fd', 'node-name': 'ovmf-code', 'read-only': -true}" \ - -blockdev "{'driver': 'file', 'filename': '/usr/share/OVMF/OVMF_VARS_4M.fd', -'node-name': 'ovmf-vars', 'read-only': true}" 
\
- -machine q35,smm=on,pflash0=ovmf-code,pflash1=ovmf-vars -m 2G -nodefaults
-
-And after some time (less than 20 seconds for me) it leads to
-
-qemu-system-x86_64: ../hw/core/cpu-sysemu.c:76: cpu_asidx_from_attrs: Assertion `ret <
-cpu->num_ases && ret >= 0' failed.
-Aborted (core dumped)
-
-
-I've no idea how to correctly fix this bug, but I hope that my reproducer and
-investigation will help a bit.
-
---
-Best regards,
-Vladimir
-
diff --git a/results/classifier/02/semantic/05479587 b/results/classifier/02/semantic/05479587
deleted file mode 100644
index 8df3bc6ef..000000000
--- a/results/classifier/02/semantic/05479587
+++ /dev/null
@@ -1,84 +0,0 @@
-semantic: 0.866
-mistranslation: 0.759
-instruction: 0.597
-boot: 0.474
-other: 0.200
-
-[Qemu-devel] [BUG] network qga : windows os lost ip address of the network card in some cases
-
-We think this problem could be solved in the qga module; can anybody give some
-advice?
-
-
-[BUG] network : windows os lost ip address of the network card in some cases
-
-We found this problem a long time ago. For example, we have three network
-cards in the virtual XML file, such as "network connection 1" / "network connection
-2" / "network connection 3".
-
-Each network card has its own IP address, such as 192.168.1.1 / 2.1 / 3.1. When
-we delete the first card and reboot the Windows virtual OS, this problem
-happens!
-
-
-
-
-We found that the second network card will replace the first one, so the
-IP address of "network connection 2" becomes 192.168.1.1.
-
-
-Our third-party users began to complain about this bug: all the business on the
-second IP is lost!
-
-Both Windows and Linux have this bug; we solved it on Linux
-by bonding the network card's PCI and MAC addresses.
-
-There is no good solution on Windows OS. Is there one? We implemented a plan to
-restore the IP via QGA. Is there a better way?
-
-
-
-
-Original message
-
-
-
-From: Yin Zuowei (10144574)
-To: address@hidden
-Date: 2017-04-14 16:46
-Subject: [BUG] network : windows os lost ip address of the network card in some
-cases
-
diff --git a/results/classifier/02/semantic/12360755 b/results/classifier/02/semantic/12360755
deleted file mode 100644
index 591417d46..000000000
--- a/results/classifier/02/semantic/12360755
+++ /dev/null
@@ -1,297 +0,0 @@
-semantic: 0.911
-instruction: 0.894
-other: 0.886
-mistranslation: 0.844
-boot: 0.818
-
-[Qemu-devel] [BUG] virtio-net linux driver fails to probe on MIPS Malta since 'hw/virtio-pci: fix virtio behaviour'
-
-Hi,
-
-I've bisected the following failure of the virtio_net linux v4.10 driver
-to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
-
-virtio_net virtio0: virtio: device uses modern interface but does not have
-VIRTIO_F_VERSION_1
-virtio_net: probe of virtio0 failed with error -22
-
-To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
-
-It appears that adding ",disable-modern=on,disable-legacy=off" to the
-virtio-net -device makes it work again.
- -I presume this should really just work out of the box. Any ideas why it -isn't? - -Cheers -James -signature.asc -Description: -Digital signature - -On 03/17/2017 11:57 PM, James Hogan wrote: -Hi, - -I've bisected the following failure of the virtio_net linux v4.10 driver -to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: - -virtio_net virtio0: virtio: device uses modern interface but does not have -VIRTIO_F_VERSION_1 -virtio_net: probe of virtio0 failed with error -22 - -To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). - -It appears that adding ",disable-modern=on,disable-legacy=off" to the -virtio-net -device makes it work again. - -I presume this should really just work out of the box. Any ideas why it -isn't? -Hi, - - -This is strange. This commit changes virtio devices from legacy to virtio -"transitional". -(your command line changes it to legacy) -Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU -side -there is nothing new. - -Michael, do you have any idea? - -Thanks, -Marcel -Cheers -James - -On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: -> -On 03/17/2017 11:57 PM, James Hogan wrote: -> -> Hi, -> -> -> -> I've bisected the following failure of the virtio_net linux v4.10 driver -> -> to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: -> -> -> -> virtio_net virtio0: virtio: device uses modern interface but does not have -> -> VIRTIO_F_VERSION_1 -> -> virtio_net: probe of virtio0 failed with error -22 -> -> -> -> To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). -> -> -> -> It appears that adding ",disable-modern=on,disable-legacy=off" to the -> -> virtio-net -device makes it work again. -> -> -> -> I presume this should really just work out of the box. Any ideas why it -> -> isn't? -> -> -> -> -Hi, -> -> -> -This is strange. This commit changes virtio devices from legacy to virtio -> -"transitional". 
-> -(your command line changes it to legacy) -> -Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU -> -side -> -there is nothing new. -> -> -Michael, do you have any idea? -> -> -Thanks, -> -Marcel -My guess would be firmware mishandling 64 bit BARs - we saw such -a case on sparc previously. As a result you are probably reading -all zeroes from features register or something like that. -Marcel, could you send a patch making the bar 32 bit? -If that helps we know what the issue is. - -> -> Cheers -> -> James -> -> - -On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote: -On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: -On 03/17/2017 11:57 PM, James Hogan wrote: -Hi, - -I've bisected the following failure of the virtio_net linux v4.10 driver -to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: - -virtio_net virtio0: virtio: device uses modern interface but does not have -VIRTIO_F_VERSION_1 -virtio_net: probe of virtio0 failed with error -22 - -To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). - -It appears that adding ",disable-modern=on,disable-legacy=off" to the -virtio-net -device makes it work again. - -I presume this should really just work out of the box. Any ideas why it -isn't? -Hi, - - -This is strange. This commit changes virtio devices from legacy to virtio -"transitional". -(your command line changes it to legacy) -Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU -side -there is nothing new. - -Michael, do you have any idea? - -Thanks, -Marcel -My guess would be firmware mishandling 64 bit BARs - we saw such -a case on sparc previously. As a result you are probably reading -all zeroes from features register or something like that. -Marcel, could you send a patch making the bar 32 bit? -If that helps we know what the issue is. -Sure, - -Thanks, -Marcel -Cheers -James - -On 03/20/2017 05:43 PM, Michael S. 
Tsirkin wrote: -On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: -On 03/17/2017 11:57 PM, James Hogan wrote: -Hi, - -I've bisected the following failure of the virtio_net linux v4.10 driver -to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: - -virtio_net virtio0: virtio: device uses modern interface but does not have -VIRTIO_F_VERSION_1 -virtio_net: probe of virtio0 failed with error -22 - -To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). - -It appears that adding ",disable-modern=on,disable-legacy=off" to the -virtio-net -device makes it work again. - -I presume this should really just work out of the box. Any ideas why it -isn't? -Hi, - - -This is strange. This commit changes virtio devices from legacy to virtio -"transitional". -(your command line changes it to legacy) -Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU -side -there is nothing new. - -Michael, do you have any idea? - -Thanks, -Marcel -My guess would be firmware mishandling 64 bit BARs - we saw such -a case on sparc previously. As a result you are probably reading -all zeroes from features register or something like that. -Marcel, could you send a patch making the bar 32 bit? -If that helps we know what the issue is. -Hi James, - -Can you please check if the below patch fixes the problem? -Please note it is not a solution. 
- -diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c -index f9b7244..5b4d429 100644 ---- a/hw/virtio/virtio-pci.c -+++ b/hw/virtio/virtio-pci.c -@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, -Error **errp) - } - - pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, -- PCI_BASE_ADDRESS_SPACE_MEMORY | -- PCI_BASE_ADDRESS_MEM_PREFETCH | -- PCI_BASE_ADDRESS_MEM_TYPE_64, -+ PCI_BASE_ADDRESS_SPACE_MEMORY, - &proxy->modern_bar); - - proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); - - -Thanks, -Marcel - -Hi Marcel, - -On Tue, Mar 21, 2017 at 04:16:58PM +0200, Marcel Apfelbaum wrote: -> -Can you please check if the below patch fixes the problem? -> -Please note it is not a solution. -> -> -diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c -> -index f9b7244..5b4d429 100644 -> ---- a/hw/virtio/virtio-pci.c -> -+++ b/hw/virtio/virtio-pci.c -> -@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, -> -Error **errp) -> -} -> -> -pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, -> -- PCI_BASE_ADDRESS_SPACE_MEMORY | -> -- PCI_BASE_ADDRESS_MEM_PREFETCH | -> -- PCI_BASE_ADDRESS_MEM_TYPE_64, -> -+ PCI_BASE_ADDRESS_SPACE_MEMORY, -> -&proxy->modern_bar); -> -> -proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); -Sorry for the delay trying this, I was away last week. - -No, it doesn't seem to make any difference. 
- -Thanks -James -signature.asc -Description: -Digital signature - diff --git a/results/classifier/02/semantic/28596630 b/results/classifier/02/semantic/28596630 deleted file mode 100644 index ebf20576e..000000000 --- a/results/classifier/02/semantic/28596630 +++ /dev/null @@ -1,114 +0,0 @@ -semantic: 0.814 -mistranslation: 0.813 -instruction: 0.748 -other: 0.707 -boot: 0.609 - -[Qemu-devel] [BUG] [low severity] a strange appearance of message involving slirp while doing "empty" make - -Folks, - -If qemu tree is already fully built, and "make" is attempted, for 3.1, the -outcome is: - -$ make - CHK version_gen.h -$ - -For 4.0-rc0, the outcome seems to be different: - -$ make -make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' -make[1]: Nothing to be done for 'all'. -make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' - CHK version_gen.h -$ - -Not sure how significant is that, but I report it just in case. - -Yours, -Aleksandar - -On 20/03/2019 22.08, Aleksandar Markovic wrote: -> -Folks, -> -> -If qemu tree is already fully built, and "make" is attempted, for 3.1, the -> -outcome is: -> -> -$ make -> -CHK version_gen.h -> -$ -> -> -For 4.0-rc0, the outcome seems to be different: -> -> -$ make -> -make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' -> -make[1]: Nothing to be done for 'all'. -> -make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' -> -CHK version_gen.h -> -$ -> -> -Not sure how significant is that, but I report it just in case. -It's likely because slirp is currently being reworked to become a -separate project, so the makefiles have been changed a little bit. I -guess the message will go away again once slirp has become a stand-alone -library. 
- - Thomas - -On Fri, 22 Mar 2019 at 04:59, Thomas Huth <address@hidden> wrote: -> -On 20/03/2019 22.08, Aleksandar Markovic wrote: -> -> $ make -> -> make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' -> -> make[1]: Nothing to be done for 'all'. -> -> make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' -> -> CHK version_gen.h -> -> $ -> -> -> -> Not sure how significant is that, but I report it just in case. -> -> -It's likely because slirp is currently being reworked to become a -> -separate project, so the makefiles have been changed a little bit. I -> -guess the message will go away again once slirp has become a stand-alone -> -library. -Well, we'll still need to ship slirp for the foreseeable future... - -I think the cause of this is that the rule in Makefile for -calling the slirp Makefile is not passing it $(SUBDIR_MAKEFLAGS) -like all the other recursive make invocations. If we do that -then we'll suppress the entering/leaving messages for -non-verbose builds. (Some tweaking will be needed as -it looks like the slirp makefile has picked an incompatible -meaning for $BUILD_DIR, which the SUBDIR_MAKEFLAGS will -also be passing to it.) - -thanks --- PMM - diff --git a/results/classifier/02/semantic/30680944 b/results/classifier/02/semantic/30680944 deleted file mode 100644 index 3e4b4340f..000000000 --- a/results/classifier/02/semantic/30680944 +++ /dev/null @@ -1,596 +0,0 @@ -semantic: 0.953 -other: 0.944 -instruction: 0.919 -boot: 0.840 -mistranslation: 0.799 - -[BUG]QEMU jump into interrupt when single-stepping on aarch64 - -Dear, folks, - -I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 -platform, -the added breakpoint hits but after I type `step`, the gdb always jumps into -interrupt. 
- -My env: - - gdb-10.2 - qemu-6.2.0 - host kernel: 5.10.84 - VM kernel: 5.10.84 - -The steps to reproduce: - # host console: run a VM with only one core, the import arg: <qemu:arg -value='-s'/> - # details can be found here: -https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt -virsh create dev_core0.xml - - # run gdb client - gdb ./vmlinux - - # gdb client on host console - (gdb) dir -./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 - (gdb) target remote localhost:1234 - (gdb) info b - Num Type Disp Enb Address What - 1 breakpoint keep y <MULTIPLE> - 1.1 y 0xffff800010361444 -mm/memory-failure.c:1318 - 1.2 y 0xffff800010361450 in memory_failure - at mm/memory-failure.c:1488 - (gdb) c - Continuing. - - # console in VM, use madvise to inject a hwposion at virtual address -vaddr, - # which will hit the b inmemory_failur: madvise(vaddr, pagesize, -MADV_HWPOISON); - # and the VM pause - ./run_madvise.c - - # gdb client on host console - (gdb) - Continuing. - Breakpoint 1, 0xffff800010361444 in memory_failure () at -mm/memory-failure.c:1318 - 1318 res = -EHWPOISON; - (gdb) n - vectors () at arch/arm64/kernel/entry.S:552 - 552 kernel_ventry 1, irq // IRQ -EL1h - (gdb) n - (gdb) n - (gdb) n - (gdb) n - gic_handle_irq (regs=0xffff8000147c3b80) at -drivers/irqchip/irq-gic-v3.c:721 - # after several step, I got the irqnr - (gdb) p irqnr - $5 = 8262 - -Sometimes, the irqnr is 27ï¼ which is used for arch_timer. - -I was wondering do you have any comments on this? And feedback are welcomed. - -Thank you. - -Best Regards. -Shuai - -On 4/6/22 09:30, Shuai Xue wrote: -Dear, folks, - -I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 -platform, -the added breakpoint hits but after I type `step`, the gdb always jumps into -interrupt. 
- -My env: - - gdb-10.2 - qemu-6.2.0 - host kernel: 5.10.84 - VM kernel: 5.10.84 - -The steps to reproduce: - # host console: run a VM with only one core, the import arg: <qemu:arg -value='-s'/> - # details can be found here: -https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt -virsh create dev_core0.xml - - # run gdb client - gdb ./vmlinux - - # gdb client on host console - (gdb) dir -./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 - (gdb) target remote localhost:1234 - (gdb) info b - Num Type Disp Enb Address What - 1 breakpoint keep y <MULTIPLE> - 1.1 y 0xffff800010361444 -mm/memory-failure.c:1318 - 1.2 y 0xffff800010361450 in memory_failure - at mm/memory-failure.c:1488 - (gdb) c - Continuing. - - # console in VM, use madvise to inject a hwposion at virtual address -vaddr, - # which will hit the b inmemory_failur: madvise(vaddr, pagesize, -MADV_HWPOISON); - # and the VM pause - ./run_madvise.c - - # gdb client on host console - (gdb) - Continuing. - Breakpoint 1, 0xffff800010361444 in memory_failure () at -mm/memory-failure.c:1318 - 1318 res = -EHWPOISON; - (gdb) n - vectors () at arch/arm64/kernel/entry.S:552 - 552 kernel_ventry 1, irq // IRQ -EL1h -The 'n' command is not a single-step: use stepi, which will suppress interrupts. -Anyway, not a bug. - -r~ - -å¨ 2022/4/7 AM12:57, Richard Henderson åé: -> -On 4/6/22 09:30, Shuai Xue wrote: -> -> Dear, folks, -> -> -> -> I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 -> -> platform, -> -> the added breakpoint hits but after I type `step`, the gdb always jumps into -> -> interrupt. 
-> -> -> -> My env: -> -> -> ->     gdb-10.2 -> ->     qemu-6.2.0 -> ->     host kernel: 5.10.84 -> ->     VM kernel: 5.10.84 -> -> -> -> The steps to reproduce: -> ->     # host console: run a VM with only one core, the import arg: <qemu:arg -> -> value='-s'/> -> ->     # details can be found here: -> -> -https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt -> ->     virsh create dev_core0.xml -> ->     -> ->     # run gdb client -> ->     gdb ./vmlinux -> -> -> ->     # gdb client on host console -> ->     (gdb) dir -> -> ./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 -> ->     (gdb) target remote localhost:1234 -> ->     (gdb) info b -> ->     Num    Type          Disp Enb Address           What -> ->     1      breakpoint    keep y  <MULTIPLE> -> ->     1.1                        y  0xffff800010361444 -> -> mm/memory-failure.c:1318 -> ->     1.2                        y  0xffff800010361450 in memory_failure -> ->                                                    at -> -> mm/memory-failure.c:1488 -> ->     (gdb) c -> ->     Continuing. -> -> -> ->     # console in VM, use madvise to inject a hwposion at virtual address -> -> vaddr, -> ->     # which will hit the b inmemory_failur: madvise(vaddr, pagesize, -> -> MADV_HWPOISON); -> ->     # and the VM pause -> ->     ./run_madvise.c -> -> -> ->     # gdb client on host console -> ->     (gdb) -> ->     Continuing. -> ->     Breakpoint 1, 0xffff800010361444 in memory_failure () at -> -> mm/memory-failure.c:1318 -> ->     1318                   res = -EHWPOISON; -> ->     (gdb) n -> ->     vectors () at arch/arm64/kernel/entry.S:552 -> ->     552            kernel_ventry  1, irq                         // IRQ -> -> EL1h -> -> -The 'n' command is not a single-step: use stepi, which will suppress -> -interrupts. -> -Anyway, not a bug. -> -> -r~ -Hi, Richard, - -Thank you for your quick reply, I also try `stepi`, but it does NOT work either. - - (gdb) c - Continuing. 
- - Breakpoint 1, memory_failure (pfn=1273982, flags=1) at -mm/memory-failure.c:1488 - 1488 { - (gdb) stepi - vectors () at arch/arm64/kernel/entry.S:552 - 552 kernel_ventry 1, irq // IRQ -EL1h - -According to QEMU doc[1]: the default single stepping behavior is step with the -IRQs -and timer service routines off. I checked the MASK bits used to control the -single -stepping IE on my machine as bellow: - - # gdb client on host (x86 plafrom) - (gdb) maintenance packet qqemu.sstepbits - sending: "qqemu.sstepbits" - received: "ENABLE=1,NOIRQ=2,NOTIMER=4" - -The sstep MASK looks as expected, but does not work as expected. - -I also try the same kernel and qemu version on X86 platform: -> -> gdb-10.2 -> -> qemu-6.2.0 -> -> host kernel: 5.10.84 -> -> VM kernel: 5.10.84 -The command `n` jumps to the next instruction. - - # gdb client on host (x86 plafrom) - (gdb) b memory-failure.c:1488 - Breakpoint 1, memory_failure (pfn=1128931, flags=1) at -mm/memory-failure.c:1488 - 1488 { - (gdb) n - 1497 if (!sysctl_memory_failure_recovery) - (gdb) stepi - 0xffffffff812efdbc 1497 if -(!sysctl_memory_failure_recovery) - (gdb) stepi - 0xffffffff812efdbe 1497 if -(!sysctl_memory_failure_recovery) - (gdb) n - 1500 p = pfn_to_online_page(pfn); - (gdb) l - 1496 - 1497 if (!sysctl_memory_failure_recovery) - 1498 panic("Memory failure on page %lx", pfn); - 1499 - 1500 p = pfn_to_online_page(pfn); - 1501 if (!p) { - -Best Regrades, -Shuai - - -[1] -https://github.com/qemu/qemu/blob/master/docs/system/gdb.rst - -å¨ 2022/4/7 PM12:10, Shuai Xue åé: -> -å¨ 2022/4/7 AM12:57, Richard Henderson åé: -> -> On 4/6/22 09:30, Shuai Xue wrote: -> ->> Dear, folks, -> ->> -> ->> I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 -> ->> platform, -> ->> the added breakpoint hits but after I type `step`, the gdb always jumps -> ->> into interrupt. 
-> ->> -> ->> My env: -> ->> -> ->>     gdb-10.2 -> ->>     qemu-6.2.0 -> ->>     host kernel: 5.10.84 -> ->>     VM kernel: 5.10.84 -> ->> -> ->> The steps to reproduce: -> ->>     # host console: run a VM with only one core, the import arg: <qemu:arg -> ->> value='-s'/> -> ->>     # details can be found here: -> ->> -https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt -> ->>     virsh create dev_core0.xml -> ->>     -> ->>     # run gdb client -> ->>     gdb ./vmlinux -> ->> -> ->>     # gdb client on host console -> ->>     (gdb) dir -> ->> ./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 -> ->>     (gdb) target remote localhost:1234 -> ->>     (gdb) info b -> ->>     Num    Type          Disp Enb Address           What -> ->>     1      breakpoint    keep y  <MULTIPLE> -> ->>     1.1                        y  0xffff800010361444 -> ->> mm/memory-failure.c:1318 -> ->>     1.2                        y  0xffff800010361450 in memory_failure -> ->>                                                    at -> ->> mm/memory-failure.c:1488 -> ->>     (gdb) c -> ->>     Continuing. -> ->> -> ->>     # console in VM, use madvise to inject a hwposion at virtual address -> ->> vaddr, -> ->>     # which will hit the b inmemory_failur: madvise(vaddr, pagesize, -> ->> MADV_HWPOISON); -> ->>     # and the VM pause -> ->>     ./run_madvise.c -> ->> -> ->>     # gdb client on host console -> ->>     (gdb) -> ->>     Continuing. -> ->>     Breakpoint 1, 0xffff800010361444 in memory_failure () at -> ->> mm/memory-failure.c:1318 -> ->>     1318                   res = -EHWPOISON; -> ->>     (gdb) n -> ->>     vectors () at arch/arm64/kernel/entry.S:552 -> ->>     552            kernel_ventry  1, irq                         // IRQ -> ->> EL1h -> -> -> -> The 'n' command is not a single-step: use stepi, which will suppress -> -> interrupts. -> -> Anyway, not a bug. 
-> -> -> -> r~ -> -> -Hi, Richard, -> -> -Thank you for your quick reply, I also try `stepi`, but it does NOT work -> -either. -> -> -(gdb) c -> -Continuing. -> -> -Breakpoint 1, memory_failure (pfn=1273982, flags=1) at -> -mm/memory-failure.c:1488 -> -1488 { -> -(gdb) stepi -> -vectors () at arch/arm64/kernel/entry.S:552 -> -552 kernel_ventry 1, irq // IRQ -> -EL1h -> -> -According to QEMU doc[1]: the default single stepping behavior is step with -> -the IRQs -> -and timer service routines off. I checked the MASK bits used to control the -> -single -> -stepping IE on my machine as bellow: -> -> -# gdb client on host (x86 plafrom) -> -(gdb) maintenance packet qqemu.sstepbits -> -sending: "qqemu.sstepbits" -> -received: "ENABLE=1,NOIRQ=2,NOTIMER=4" -> -> -The sstep MASK looks as expected, but does not work as expected. -> -> -I also try the same kernel and qemu version on X86 platform: -> ->> gdb-10.2 -> ->> qemu-6.2.0 -> ->> host kernel: 5.10.84 -> ->> VM kernel: 5.10.84 -> -> -> -The command `n` jumps to the next instruction. -> -> -# gdb client on host (x86 plafrom) -> -(gdb) b memory-failure.c:1488 -> -Breakpoint 1, memory_failure (pfn=1128931, flags=1) at -> -mm/memory-failure.c:1488 -> -1488 { -> -(gdb) n -> -1497 if (!sysctl_memory_failure_recovery) -> -(gdb) stepi -> -0xffffffff812efdbc 1497 if -> -(!sysctl_memory_failure_recovery) -> -(gdb) stepi -> -0xffffffff812efdbe 1497 if -> -(!sysctl_memory_failure_recovery) -> -(gdb) n -> -1500 p = pfn_to_online_page(pfn); -> -(gdb) l -> -1496 -> -1497 if (!sysctl_memory_failure_recovery) -> -1498 panic("Memory failure on page %lx", pfn); -> -1499 -> -1500 p = pfn_to_online_page(pfn); -> -1501 if (!p) { -> -> -Best Regrades, -> -Shuai -> -> -> -[1] -https://github.com/qemu/qemu/blob/master/docs/system/gdb.rst -Hi, Richard, - -I was wondering that do you have any comments to this? 
-
-Best Regards,
-Shuai
-
diff --git a/results/classifier/02/semantic/46572227 b/results/classifier/02/semantic/46572227
deleted file mode 100644
index 579703ee5..000000000
--- a/results/classifier/02/semantic/46572227
+++ /dev/null
@@ -1,407 +0,0 @@
-semantic: 0.965
-mistranslation: 0.946
-other: 0.927
-instruction: 0.906
-boot: 0.900
-
-[Qemu-devel] [Bug?] Windows 7's time drift obviously while RTC rate switching frequently between high and low timer rate
-
-Hi,
-
-We tested with the latest QEMU and found that time drifts noticeably (the clock
-runs fast in the guest) in Windows 7 64-bit guests in some cases.
-
-It is easy to reproduce using the following QEMU command line to start Windows
-7:
-
-# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine
-pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp
-4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet
--global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc
-:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device
-piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1
--device usb-kbd,id=input2 -monitor stdio
-
-Adjust the VM's time to the host time, and run a Java application or the
-following program in Windows 7:
-
-#pragma comment(lib, "winmm")
-#include <stdio.h>
-#include <windows.h>
-
-#define SWITCH_PERIOD 13
-
-int main()
-{
-    DWORD count = 0;
-
-    while (1)
-    {
-        count++;
-        /* raise the timer resolution to 1 ms around the sleep */
-        timeBeginPeriod(1);
-        DWORD start = timeGetTime();
-        Sleep(40);
-        timeEndPeriod(1);
-        /* periodically sleep at the default (low) timer rate */
-        if ((count % SWITCH_PERIOD) == 0) {
-            Sleep(1);
-        }
-    }
-    return 0;
-}
-
-After a few minutes, you will find that the time in Windows 7 runs ahead of the
-host time, drifting by several seconds.
-
-I have dug deeper into this problem. 
For Windows systems that use the CMOS timer,
-the base interrupt rate is usually 64 Hz, but running some applications in the VM
-will raise the timer rate to 1024 Hz; running a Java application or the above
-program will raise the timer rate.
-Besides, Windows operating systems generally keep time by counting timer
-interrupts (ticks). But QEMU does not seem to emulate the rate switching correctly.
-
-We update the timer in the function periodic_timer_update():
-static void periodic_timer_update(RTCState *s, int64_t current_time)
-{
-
-    cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, get_ticks_per_sec());
-    next_irq_clock = (cur_clock & ~(period - 1)) + period;
-                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Here we calculate the next interrupt time by aligning the current clock with the
-new period. I'm a little confused about why we care about the *history* time.
-If the VM switches from a high rate to a low rate, the next interrupt may come
-earlier than it is supposed to. We have observed this in our test: we printed the
-interval time between interrupts and the VM's current time (we got the time from the VM).
-
-Here is part of the log:
-... ...
-period=512 irq inject 1534: 15625 us
-Tue Mar 29 04:38:00 2016
-*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696
-us
-... ...
-*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086
-us
-Convert 32 --- > 512: 703: 96578 us
-period=512 irq inject 44391: 12702 us
-Convert 512 --- > 32: 704: 12704 us
-period=32 irq inject 44392: 979 us
-... ...
-32 --- > 512: 705: 24388 us
-period=512 irq inject 44417: 6834 us
-Convert 512 --- > 32: 706: 6830 us
-period=32 irq inject 44418: 978 us
-... ...
-Convert 32 --- > 512: 707: 60525 us
-period=512 irq inject 44480: 1945 us
-Convert 512 --- > 32: 708: 1955 us
-period=32 irq inject 44481: 977 us
-... ...
-Convert 32 --- > 512: 709: 36105 us
-period=512 irq inject 44518: 10741 us
-Convert 512 --- > 32: 710: 10736 us
-period=32 irq inject 44519: 989 us
-... ...
-Convert 32 --- > 512: 711: 123998 us
-period=512 irq inject 44646: 974 us
-period=512 irq inject 44647: 15607 us
-Convert 512 --- > 32: 712: 16560 us
-period=32 irq inject 44648: 980 us
-... ...
-period=32 irq inject 44738: 974 us
-Convert 32 --- > 512: 713: 88828 us
-period=512 irq inject 44739: 4885 us
-Convert 512 --- > 32: 714: 4882 us
-period=32 irq inject 44740: 989 us
-... ...
-period=32 irq inject 44842: 974 us
-Convert 32 --- > 512: 715: 100537 us
-period=512 irq inject 44843: 8788 us
-Convert 512 --- > 32: 716: 8789 us
-period=32 irq inject 44844: 972 us
-... ...
-period=32 irq inject 44941: 979 us
-Convert 32 --- > 512: 717: 95677 us
-period=512 irq inject 44942: 13661 us
-Convert 512 --- > 32: 718: 13657 us
-period=32 irq inject 44943: 987 us
-... ...
-Convert 32 --- > 512: 719: 94690 us
-period=512 irq inject 45040: 14643 us
-Convert 512 --- > 32: 720: 14642 us
-period=32 irq inject 45041: 974 us
-... ...
-Convert 32 --- > 512: 721: 88848 us
-period=512 irq inject 45132: 4892 us
-Convert 512 --- > 32: 722: 4931 us
-period=32 irq inject 45133: 964 us
-... ...
-Tue Mar 29 04:39:19 2016
-*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is
-911520 us
-
-Windows 7 got 835 IRQs injected during the period of 32
-and 11 IRQs injected during the period of 512, so it advanced the
-wall-clock
-time by one second, because it assumed it had counted
-(835*976.5 + 11*15625) = 987252.5 us, but the real interval is 911520 us.
-
-IMHO, we should calculate the next interrupt time based on the time of the last
-injected interrupt; that seems closer to how a hardware CMOS timer
-behaves.
-Maybe someone can tell me the reason why we calculate the interrupt time
-in that way, or is it a bug? ;)
-
-Thanks,
-Hailiang
-
-ping...
-
-It seems that we can eliminate the drift with the following patch.
-(I tested it for two hours and there is no drift; before, the timer
-in Windows 7 drifted about 2 seconds per minute.) 
I'm not sure if it is -the right way to solve the problem. -Any comments are welcomed. Thanks. - -From bd6acd577cbbc9d92d6376c770219470f184f7de Mon Sep 17 00:00:00 2001 -From: zhanghailiang <address@hidden> -Date: Thu, 31 Mar 2016 16:36:15 -0400 -Subject: [PATCH] timer/mc146818rtc: fix timer drift in Windows OS while RTC - rate converting frequently - -Signed-off-by: zhanghailiang <address@hidden> ---- - hw/timer/mc146818rtc.c | 25 ++++++++++++++++++++++--- - 1 file changed, 22 insertions(+), 3 deletions(-) - -diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c -index 2ac0fd3..e39d2da 100644 ---- a/hw/timer/mc146818rtc.c -+++ b/hw/timer/mc146818rtc.c -@@ -79,6 +79,7 @@ typedef struct RTCState { - /* periodic timer */ - QEMUTimer *periodic_timer; - int64_t next_periodic_time; -+ uint64_t last_periodic_time; - /* update-ended timer */ - QEMUTimer *update_timer; - uint64_t next_alarm_time; -@@ -152,7 +153,8 @@ static void rtc_coalesced_timer(void *opaque) - static void periodic_timer_update(RTCState *s, int64_t current_time) - { - int period_code, period; -- int64_t cur_clock, next_irq_clock; -+ int64_t cur_clock, next_irq_clock, pre_irq_clock; -+ bool change = false; - - period_code = s->cmos_data[RTC_REG_A] & 0x0f; - if (period_code != 0 -@@ -165,14 +167,28 @@ static void periodic_timer_update(RTCState *s, int64_t -current_time) - if (period != s->period) { - s->irq_coalesced = (s->irq_coalesced * s->period) / period; - DPRINTF_C("cmos: coalesced irqs scaled to %d\n", s->irq_coalesced); -+ if (s->period && period) { -+ change = true; -+ } - } - s->period = period; - #endif - /* compute 32 khz clock */ - cur_clock = - muldiv64(current_time, RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND); -+ if (change) { -+ int offset = 0; - -- next_irq_clock = (cur_clock & ~(period - 1)) + period; -+ pre_irq_clock = muldiv64(s->last_periodic_time, RTC_CLOCK_RATE, -+ NANOSECONDS_PER_SECOND); -+ if ((cur_clock - pre_irq_clock) > period) { -+ offset = (cur_clock - pre_irq_clock) / 
period; -+ } -+ s->irq_coalesced += offset; -+ next_irq_clock = pre_irq_clock + (offset + 1) * period; -+ } else { -+ next_irq_clock = (cur_clock & ~(period - 1)) + period; -+ } - s->next_periodic_time = muldiv64(next_irq_clock, -NANOSECONDS_PER_SECOND, - RTC_CLOCK_RATE) + 1; - timer_mod(s->periodic_timer, s->next_periodic_time); -@@ -187,7 +203,9 @@ static void periodic_timer_update(RTCState *s, int64_t -current_time) - static void rtc_periodic_timer(void *opaque) - { - RTCState *s = opaque; -- -+ int64_t next_periodic_time; -+ -+ next_periodic_time = s->next_periodic_time; - periodic_timer_update(s, s->next_periodic_time); - s->cmos_data[RTC_REG_C] |= REG_C_PF; - if (s->cmos_data[RTC_REG_B] & REG_B_PIE) { -@@ -204,6 +222,7 @@ static void rtc_periodic_timer(void *opaque) - DPRINTF_C("cmos: coalesced irqs increased to %d\n", - s->irq_coalesced); - } -+ s->last_periodic_time = next_periodic_time; - } else - #endif - qemu_irq_raise(s->irq); --- -1.8.3.1 - - -On 2016/3/29 19:58, Hailiang Zhang wrote: -Hi, - -We tested with the latest QEMU, and found that time drift obviously (clock fast -in guest) -in Windows 7 64 bits guest in some cases. 
- -It is easily to reproduce, using the follow QEMU command line to start windows -7: - -# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine -pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp -4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet --global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc -:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device -piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1 --device usb-kbd,id=input2 -monitor stdio - -Adjust the VM's time to host time, and run java application or run the follow -program -in windows 7: - -#pragma comment(lib, "winmm") -#include <stdio.h> -#include <windows.h> - -#define SWITCH_PEROID 13 - -int main() -{ - DWORD count = 0; - - while (1) - { - count++; - timeBeginPeriod(1); - DWORD start = timeGetTime(); - Sleep(40); - timeEndPeriod(1); - if ((count % SWITCH_PEROID) == 0) { - Sleep(1); - } - } - return 0; -} - -After few minutes, you will find that the time in windows 7 goes ahead of the -host time, drifts about several seconds. - -I have dug deeper in this problem. For windows systems that use the CMOS timer, -the base interrupt rate is usually 64Hz, but running some application in VM -will raise the timer rate to 1024Hz, running java application and or above -program will raise the timer rate. -Besides, Windows operating systems generally keep time by counting timer -interrupts (ticks). But QEMU seems not emulate the rate converting fine. 
- -We update the timer in function periodic_timer_update(): -static void periodic_timer_update(RTCState *s, int64_t current_time) -{ - - cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, -get_ticks_per_sec()); - next_irq_clock = (cur_clock & ~(period - 1)) + period; - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Here we calculate the next interrupt time by align the current clock with the -new period, I'm a little confused that why we care about the *history* time ? -If VM switches from high rate to low rate, the next interrupt time may come -earlier than it supposed to be. We have observed it in our test. we printed the -interval time of interrupts and the VM's current time (We got the time from VM). - -Here is part of the log: -... ... -period=512 irq inject 1534: 15625 us -Tue Mar 29 04:38:00 2016 -*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696 -us -... ... -*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086 -us -Convert 32 --- > 512: 703: 96578 us -period=512 irq inject 44391: 12702 us -Convert 512 --- > 32: 704: 12704 us11 -period=32 irq inject 44392: 979 us -... ... -32 --- > 512: 705: 24388 us -period=512 irq inject 44417: 6834 us -Convert 512 --- > 32: 706: 6830 us -period=32 irq inject 44418: 978 us -... ... -Convert 32 --- > 512: 707: 60525 us -period=512 irq inject 44480: 1945 us -Convert 512 --- > 32: 708: 1955 us -period=32 irq inject 44481: 977 us -... ... -Convert 32 --- > 512: 709: 36105 us -period=512 irq inject 44518: 10741 us -Convert 512 --- > 32: 710: 10736 us -period=32 irq inject 44519: 989 us -... ... -Convert 32 --- > 512: 711: 123998 us -period=512 irq inject 44646: 974 us -period=512 irq inject 44647: 15607 us -Convert 512 --- > 32: 712: 16560 us -period=32 irq inject 44648: 980 us -... ... -period=32 irq inject 44738: 974 us -Convert 32 --- > 512: 713: 88828 us -period=512 irq inject 44739: 4885 us -Convert 512 --- > 32: 714: 4882 us -period=32 irq inject 44740: 989 us -... ... 
-period=32 irq inject 44842: 974 us -Convert 32 --- > 512: 715: 100537 us -period=512 irq inject 44843: 8788 us -Convert 512 --- > 32: 716: 8789 us -period=32 irq inject 44844: 972 us -... ... -period=32 irq inject 44941: 979 us -Convert 32 --- > 512: 717: 95677 us -period=512 irq inject 44942: 13661 us -Convert 512 --- > 32: 718: 13657 us -period=32 irq inject 44943: 987 us -... ... -Convert 32 --- > 512: 719: 94690 us -period=512 irq inject 45040: 14643 us -Convert 512 --- > 32: 720: 14642 us -period=32 irq inject 45041: 974 us -... ... -Convert 32 --- > 512: 721: 88848 us -period=512 irq inject 45132: 4892 us -Convert 512 --- > 32: 722: 4931 us -period=32 irq inject 45133: 964 us -... ... -Tue Mar 29 04:39:19 2016 -*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is -911520 us - -For windows 7, it has got 835 IRQs which injected during the period of 32, -and got 11 IRQs that injected during the period of 512. it updated the -wall-clock -time with one second, because it supposed it has counted -(835*976.5+11*15625)= 987252.5 us, but the real interval time is 911520 us. - -IMHO, we should calculate the next interrupt time based on the time of last -interrupt injected, and it seems to be more similar with hardware CMOS timer -in this way. -Maybe someone can tell me the reason why we calculated the interrupt timer -in that way, or is it a bug ? 
;) - -Thanks, -Hailiang - diff --git a/results/classifier/02/semantic/53568181 b/results/classifier/02/semantic/53568181 deleted file mode 100644 index 2a2342afc..000000000 --- a/results/classifier/02/semantic/53568181 +++ /dev/null @@ -1,79 +0,0 @@ -semantic: 0.943 -instruction: 0.932 -other: 0.921 -boot: 0.876 -mistranslation: 0.854 - -[BUG] x86/PAT handling severely crippled AMD-V SVM KVM performance - -Hi, I maintain an out-of-tree 3D APIs pass-through QEMU device models at -https://github.com/kjliew/qemu-3dfx -that provide 3D acceleration for legacy -32-bit Windows guests (Win98SE, WinME, Win2k and WinXP) with the focus on -playing old legacy games from 1996-2003. It currently supports the now-defunct -3Dfx propriety API called Glide and an alternative OpenGL pass-through based on -MESA implementation. - -The basic concept of both implementations create memory-mapped virtual -interfaces consist of host/guest shared memory with guest-push model instead of -a more common host-pull model for typical QEMU device model implementation. -Guest uses shared memory as FIFOs for drawing commands and data to bulk up the -operations until serialization event that flushes the FIFOs into host. This -achieves extremely good performance since virtual CPUs are fast with hardware -acceleration (Intel VT/AMD-V) and reduces the overhead of frequent VMEXITs to -service the device emulation. Both implementations work on Windows 10 with WHPX -and HAXM accelerators as well as KVM in Linux. - -On Windows 10, QEMU WHPX implementation does not sync MSR_IA32_PAT during -host/guest states sync. There is no visibility into the closed-source WHPX on -how things are managed behind the scene, but from measuring performance figures -I can conclude that it didn't handle the MSR_IA32_PAT correctly for both Intel -and AMD. Call this fair enough, if you will, it didn't flag any concerns, in -fact games such as Quake2 and Quake3 were still within playable frame rate of -40~60FPS on Win2k/XP guest. 
Until the same games were run on Win98/ME guest -and -the frame rate blew off the roof (300~500FPS) on the same CPU and GPU. In fact, -the latter seemed to be more in line with running the games bare-metal with vsync -off. - -On Linux (at the time of writing kernel 5.6.7/Mesa 20.0), the difference -prevailed. Intel CPUs (and it so happened that I was on laptop with Intel GPU), -the VMX-based kvm_intel got it right while SVM-based kvm_amd did not. -To put this in simple exaggeration, an aging Core i3-4010U/HD Graphics 4400 -(Haswell GT2) exhibited an insane performance in Quake2/Quake3 timedemos that -totally crushed more recent AMD Ryzen 2500U APU/Vega 8 Graphics and AMD -FX8300/NVIDIA GT730 on desktop. Simply unbelievable! - -It turned out that there was something to do with AMD-V NPT. By loading kvm_amd -with npt=0, AMD Ryzen APU and FX8300 regained a huge performance leap. However, -AMD NPT issue with KVM was supposedly fixed in 2017 kernel commits. NPT=0 would -actually incur performance loss for VM due to intervention required by -hypervisors to maintain the shadow page tables. Finally, I was able to find the -pointer that pointed to MSR_IA32_PAT register. By updating the MSR_IA32_PAT to -0x0606xxxx0606xxxxULL, AMD CPUs now regain their rightful performance without -taking the hit of NPT=0 for Linux KVM. Taking the same solution into Windows, -both Intel and AMD CPUs no longer require Win98/ME guest to unleash the full -performance potentials and performance figures based on games measured on WHPX -were not very far behind Linux KVM. - -So I guess the problem lies in host/guest shared memory regions mapped as -uncacheable from virtual CPU perspective. As virtual CPUs now completely execute -in hardware context with x86 hardware virtualization extensions, the cacheability -of memory types would severely impact the performance on guests. WHPX didn't -handle it for both Intel EPT and AMD NPT, but KVM seems to do it right for Intel -EPT.
I don't have the correct fix for QEMU. But what I can do for my 3D APIs -pass-through device models is to implement host-side hooks to reprogram and -restore MSR_IA32_PAT upon activation/deactivation of the 3D APIs. Perhaps there -is also a better solution of having the proper kernel drivers for virtual -interfaces to manage the memory types of host/guest shared memory in kernel -space, but to do that and the needs of Microsoft tools/DDKs, I will just forget -it. The guest stubs uses the same kernel drivers included in 3Dfx drivers for -memory mapping and the virtual interfaces remain driver-less from Windows OS -perspective. Considering the current state of halting progress for QEMU native -virgil3D to support Windows OS, I am just being pragmatic. I understand that -QEMU virgil3D will eventually bring 3D acceleration for Windows guests, but I do -not expect anything to support legacy 32-bit Windows OSes which have out-grown -their commercial usefulness. - -Regards, -KJ Liew - diff --git a/results/classifier/02/semantic/80570214 b/results/classifier/02/semantic/80570214 deleted file mode 100644 index 987d0c1ea..000000000 --- a/results/classifier/02/semantic/80570214 +++ /dev/null @@ -1,401 +0,0 @@ -semantic: 0.978 -instruction: 0.978 -other: 0.978 -mistranslation: 0.973 -boot: 0.969 - -[Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user - -Hi, - -We catch a segfault in our project. - -Qemu version is 2.3.0 - -The Stack backtrace is: -(gdb) bt -#0 0x0000000000000000 in ?? 
() -#1 0x00007f7ad9280b2f in qemu_deliver_packet (sender=<optimized out>, flags=<optimized -out>, data=<optimized out>, size=100, opaque= - 0x7f7ad9d6db10) at net/net.c:510 -#2 0x00007f7ad92831fa in qemu_net_queue_deliver (size=<optimized out>, data=<optimized -out>, flags=<optimized out>, - sender=<optimized out>, queue=<optimized out>) at net/queue.c:157 -#3 qemu_net_queue_flush (queue=0x7f7ad9d39630) at net/queue.c:254 -#4 0x00007f7ad9280dac in qemu_flush_or_purge_queued_packets -(nc=0x7f7ad9d6db10, purge=true) at net/net.c:539 -#5 0x00007f7ad9280e76 in net_vm_change_state_handler (opaque=<optimized out>, -running=<optimized out>, state=100) at net/net.c:1214 -#6 0x00007f7ad915612f in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) -at vl.c:1820 -#7 0x00007f7ad906db1a in do_vm_stop (state=<optimized out>) at -/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:631 -#8 vm_stop (state=RUN_STATE_SHUTDOWN) at -/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:1325 -#9 0x00007f7ad915e4a2 in main_loop_should_exit () at vl.c:2080 -#10 main_loop () at vl.c:2131 -#11 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at -vl.c:4721 -(gdb) p *(NetClientState *)0x7f7ad9d6db10 -$1 = {info = 0x7f7ad9824520, link_down = 0, next = {tqe_next = 0x7f7ad0f06d10, -tqe_prev = 0x7f7ad98b1cf0}, peer = 0x7f7ad0f06d10, - incoming_queue = 0x7f7ad9d39630, model = 0x7f7ad9d39590 "vhost_user", name = -0x7f7ad9d39570 "hostnet0", info_str = - "vhost-user to charnet0", '\000' <repeats 233 times>, receive_disabled = 0, -destructor = - 0x7f7ad92821f0 <qemu_net_client_destructor>, queue_index = 0, -rxfilter_notify_enabled = 0} -(gdb) p *(NetClientInfo *)0x7f7ad9824520 -$2 = {type = NET_CLIENT_OPTIONS_KIND_VHOST_USER, size = 360, receive = 0, -receive_raw = 0, receive_iov = 0, can_receive = 0, cleanup = - 0x7f7ad9288850 <vhost_user_cleanup>, link_status_changed = 0, -query_rx_filter = 0, poll = 0, has_ufo = - 0x7f7ad92886d0 <vhost_user_has_ufo>, has_vnet_hdr = 0x7f7ad9288670 
-<vhost_user_has_vnet_hdr>, has_vnet_hdr_len = 0, - using_vnet_hdr = 0, set_offload = 0, set_vnet_hdr_len = 0} -(gdb) - -The corresponding codes where gdb reports error are: (We have added some codes -in net.c) -ssize_t qemu_deliver_packet(NetClientState *sender, - unsigned flags, - const uint8_t *data, - size_t size, - void *opaque) -{ - NetClientState *nc = opaque; - ssize_t ret; - - if (nc->link_down) { - return size; - } - - if (nc->receive_disabled) { - return 0; - } - - if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { - ret = nc->info->receive_raw(nc, data, size); - } else { - ret = nc->info->receive(nc, data, size); ----> Here is 510 line - } - -I'm not quite familiar with vhost-user, but for vhost-user, these two callback -functions seem to be always NULL, -Why we can come here ? -Is it an error to add VM state change handler for vhost-user ? - -Thanks, -zhanghailiang - -Hi - -On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang -<address@hidden> wrote: -> -The corresponding codes where gdb reports error are: (We have added some -> -codes in net.c) -Can you reproduce with unmodified qemu? Could you give instructions to do so? - -> -ssize_t qemu_deliver_packet(NetClientState *sender, -> -unsigned flags, -> -const uint8_t *data, -> -size_t size, -> -void *opaque) -> -{ -> -NetClientState *nc = opaque; -> -ssize_t ret; -> -> -if (nc->link_down) { -> -return size; -> -} -> -> -if (nc->receive_disabled) { -> -return 0; -> -} -> -> -if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { -> -ret = nc->info->receive_raw(nc, data, size); -> -} else { -> -ret = nc->info->receive(nc, data, size); ----> Here is 510 line -> -} -> -> -I'm not quite familiar with vhost-user, but for vhost-user, these two -> -callback functions seem to be always NULL, -> -Why we can come here ? 
-You should not come here, vhost-user has nc->receive_disabled (it -changes in 2.5) - --- -Marc-André Lureau - -On 2015/11/3 22:54, Marc-André Lureau wrote: -Hi - -On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang -<address@hidden> wrote: -The corresponding codes where gdb reports error are: (We have added some -codes in net.c) -Can you reproduce with unmodified qemu? Could you give instructions to do so? -OK, i will try to do it. There is nothing special, we run iperf tool in VM, -and then shutdown or reboot it. There is change you can catch segfault. -ssize_t qemu_deliver_packet(NetClientState *sender, - unsigned flags, - const uint8_t *data, - size_t size, - void *opaque) -{ - NetClientState *nc = opaque; - ssize_t ret; - - if (nc->link_down) { - return size; - } - - if (nc->receive_disabled) { - return 0; - } - - if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { - ret = nc->info->receive_raw(nc, data, size); - } else { - ret = nc->info->receive(nc, data, size); ----> Here is 510 line - } - -I'm not quite familiar with vhost-user, but for vhost-user, these two -callback functions seem to be always NULL, -Why we can come here ? -You should not come here, vhost-user has nc->receive_disabled (it -changes in 2.5) -I have looked at the newest codes, i think we can still have chance to -come here, since we will change nc->receive_disable to false temporarily in -qemu_flush_or_purge_queued_packets(), there is no difference between 2.3 and 2.5 -for this. -Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true -in qemu_net_queue_flush() for vhost-user ? - -i will try to reproduce it by using newest qemu. 
- -Thanks, -zhanghailiang - -On 11/04/2015 10:24 AM, zhanghailiang wrote: -> -On 2015/11/3 22:54, Marc-André Lureau wrote: -> -> Hi -> -> -> -> On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang -> -> <address@hidden> wrote: -> ->> The corresponding codes where gdb reports error are: (We have added -> ->> some -> ->> codes in net.c) -> -> -> -> Can you reproduce with unmodified qemu? Could you give instructions -> -> to do so? -> -> -> -> -OK, i will try to do it. There is nothing special, we run iperf tool -> -in VM, -> -and then shutdown or reboot it. There is change you can catch segfault. -> -> ->> ssize_t qemu_deliver_packet(NetClientState *sender, -> ->> unsigned flags, -> ->> const uint8_t *data, -> ->> size_t size, -> ->> void *opaque) -> ->> { -> ->> NetClientState *nc = opaque; -> ->> ssize_t ret; -> ->> -> ->> if (nc->link_down) { -> ->> return size; -> ->> } -> ->> -> ->> if (nc->receive_disabled) { -> ->> return 0; -> ->> } -> ->> -> ->> if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { -> ->> ret = nc->info->receive_raw(nc, data, size); -> ->> } else { -> ->> ret = nc->info->receive(nc, data, size); ----> Here is -> ->> 510 line -> ->> } -> ->> -> ->> I'm not quite familiar with vhost-user, but for vhost-user, these two -> ->> callback functions seem to be always NULL, -> ->> Why we can come here ? -> -> -> -> You should not come here, vhost-user has nc->receive_disabled (it -> -> changes in 2.5) -> -> -> -> -I have looked at the newest codes, i think we can still have chance to -> -come here, since we will change nc->receive_disable to false -> -temporarily in -> -qemu_flush_or_purge_queued_packets(), there is no difference between -> -2.3 and 2.5 -> -for this. -> -Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true -> -in qemu_net_queue_flush() for vhost-user ? -The only thing I can image is self announcing. Are you trying to do -migration? 2.5 only support sending rarp through this. 
- -And it's better to have a breakpoint to see why a packet was queued for -vhost-user. The stack trace may also help in this case. - -> -> -i will try to reproduce it by using newest qemu. -> -> -Thanks, -> -zhanghailiang -> - -On 2015/11/4 11:19, Jason Wang wrote: -On 11/04/2015 10:24 AM, zhanghailiang wrote: -On 2015/11/3 22:54, Marc-André Lureau wrote: -Hi - -On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang -<address@hidden> wrote: -The corresponding codes where gdb reports error are: (We have added -some -codes in net.c) -Can you reproduce with unmodified qemu? Could you give instructions -to do so? -OK, i will try to do it. There is nothing special, we run iperf tool -in VM, -and then shutdown or reboot it. There is change you can catch segfault. -ssize_t qemu_deliver_packet(NetClientState *sender, - unsigned flags, - const uint8_t *data, - size_t size, - void *opaque) -{ - NetClientState *nc = opaque; - ssize_t ret; - - if (nc->link_down) { - return size; - } - - if (nc->receive_disabled) { - return 0; - } - - if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { - ret = nc->info->receive_raw(nc, data, size); - } else { - ret = nc->info->receive(nc, data, size); ----> Here is -510 line - } - -I'm not quite familiar with vhost-user, but for vhost-user, these two -callback functions seem to be always NULL, -Why we can come here ? -You should not come here, vhost-user has nc->receive_disabled (it -changes in 2.5) -I have looked at the newest codes, i think we can still have chance to -come here, since we will change nc->receive_disable to false -temporarily in -qemu_flush_or_purge_queued_packets(), there is no difference between -2.3 and 2.5 -for this. -Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true -in qemu_net_queue_flush() for vhost-user ? -The only thing I can image is self announcing. Are you trying to do -migration? 2.5 only support sending rarp through this. 
-Hmm, it's not triggered by migration, For qemu-2.5, IMHO, it doesn't have such -problem, -since the callback function 'receive' is not NULL. It is vhost_user_receive(). -And it's better to have a breakpoint to see why a packet was queued for -vhost-user. The stack trace may also help in this case. -OK, i'm trying to reproduce it. - -Thanks, -zhanghailiang -i will try to reproduce it by using newest qemu. - -Thanks, -zhanghailiang -. - diff --git a/results/classifier/02/semantic/96782458 b/results/classifier/02/semantic/96782458 deleted file mode 100644 index 7506ea447..000000000 --- a/results/classifier/02/semantic/96782458 +++ /dev/null @@ -1,1000 +0,0 @@ -semantic: 0.984 -other: 0.982 -boot: 0.980 -instruction: 0.974 -mistranslation: 0.949 - -[Qemu-devel] [BUG] Migrate failes between boards with different PMC counts - -Hi all, - -Recently, I found migration failed when enable vPMU. - -migrate vPMU state was introduced in linux-3.10 + qemu-1.7. - -As long as enable vPMU, qemu will save / load the -vmstate_msr_architectural_pmu(msr_global_ctrl) register during the migration. -But global_ctrl generated based on cpuid(0xA), the number of general-purpose -performance -monitoring counters(PMC) can vary according to Intel SDN. The number of PMC -presented -to vm, does not support configuration currently, it depend on host cpuid, and -enable all pmc -defaultly at KVM. It cause migration to fail between boards with different PMC -counts. - -The return value of cpuid (0xA) is different dur to cpu, according to Intel -SDNï¼18-10 Vol. 3B: - -Note: The number of general-purpose performance monitoring counters (i.e. N in -Figure 18-9) -can vary across processor generations within a processor family, across -processor families, or -could be different depending on the configuration chosen at boot time in the -BIOS regarding -Intel Hyper Threading Technology, (e.g. 
N=2 for 45 nm Intel Atom processors; N -=4 for processors -based on the Nehalem microarchitecture; for processors based on the Sandy Bridge -microarchitecture, N = 4 if Intel Hyper Threading Technology is active and N=8 -if not active). - -Also I found, N=8 if HT is not active based on the broadwellï¼, -such as CPU E7-8890 v4 @ 2.20GHz - -# ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda -/data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming -tcp::8888 -Completed 100 % -qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff -qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: -kvm_put_msrs: -Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. -Aborted - -So make number of pmc configurable to vm ? Any better idea ? - - -Regards, --Zhuang Yanying - -* Zhuangyanying (address@hidden) wrote: -> -Hi all, -> -> -Recently, I found migration failed when enable vPMU. -> -> -migrate vPMU state was introduced in linux-3.10 + qemu-1.7. -> -> -As long as enable vPMU, qemu will save / load the -> -vmstate_msr_architectural_pmu(msr_global_ctrl) register during the migration. -> -But global_ctrl generated based on cpuid(0xA), the number of general-purpose -> -performance -> -monitoring counters(PMC) can vary according to Intel SDN. The number of PMC -> -presented -> -to vm, does not support configuration currently, it depend on host cpuid, and -> -enable all pmc -> -defaultly at KVM. It cause migration to fail between boards with different -> -PMC counts. -> -> -The return value of cpuid (0xA) is different dur to cpu, according to Intel -> -SDNï¼18-10 Vol. 3B: -> -> -Note: The number of general-purpose performance monitoring counters (i.e. N -> -in Figure 18-9) -> -can vary across processor generations within a processor family, across -> -processor families, or -> -could be different depending on the configuration chosen at boot time in the -> -BIOS regarding -> -Intel Hyper Threading Technology, (e.g. 
N=2 for 45 nm Intel Atom processors; -> -N =4 for processors -> -based on the Nehalem microarchitecture; for processors based on the Sandy -> -Bridge -> -microarchitecture, N = 4 if Intel Hyper Threading Technology is active and -> -N=8 if not active). -> -> -Also I found, N=8 if HT is not active based on the broadwellï¼, -> -such as CPU E7-8890 v4 @ 2.20GHz -> -> -# ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda -> -/data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming -> -tcp::8888 -> -Completed 100 % -> -qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff -> -qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: -> -kvm_put_msrs: -> -Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. -> -Aborted -> -> -So make number of pmc configurable to vm ? Any better idea ? -Coincidentally we hit a similar problem a few days ago with -cpu host - it -took me -quite a while to spot the difference between the machines was the source -had hyperthreading disabled. - -An option to set the number of counters makes sense to me; but I wonder -how many other options we need as well. Also, I'm not sure there's any -easy way for libvirt etc to figure out how many counters a host supports - it's -not in /proc/cpuinfo. - -Dave - -> -> -Regards, -> --Zhuang Yanying --- -Dr. David Alan Gilbert / address@hidden / Manchester, UK - -On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: -> -* Zhuangyanying (address@hidden) wrote: -> -> Hi all, -> -> -> -> Recently, I found migration failed when enable vPMU. -> -> -> -> migrate vPMU state was introduced in linux-3.10 + qemu-1.7. -> -> -> -> As long as enable vPMU, qemu will save / load the -> -> vmstate_msr_architectural_pmu(msr_global_ctrl) register during the -> -> migration. -> -> But global_ctrl generated based on cpuid(0xA), the number of -> -> general-purpose performance -> -> monitoring counters(PMC) can vary according to Intel SDN. 
The number of PMC
> > presented to vm does not currently support configuration; it depends on the
> > host cpuid, and KVM enables all PMCs by default. This causes migration to
> > fail between boards with different PMC counts.
> >
> > The return value of cpuid(0xA) differs across CPUs. According to the Intel
> > SDM, 18-10 Vol. 3B:
> >
> > Note: The number of general-purpose performance monitoring counters (i.e. N
> > in Figure 18-9) can vary across processor generations within a processor
> > family, across processor families, or could be different depending on the
> > configuration chosen at boot time in the BIOS regarding Intel Hyper-Threading
> > Technology (e.g. N=2 for 45 nm Intel Atom processors; N=4 for processors
> > based on the Nehalem microarchitecture; for processors based on the Sandy
> > Bridge microarchitecture, N=4 if Intel Hyper-Threading Technology is active
> > and N=8 if not active).
> >
> > Also I found N=8 if HT is not active on Broadwell, such as
> > CPU E7-8890 v4 @ 2.20GHz:
> >
> > # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda
> > /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming
> > tcp::8888
> > Completed 100 %
> > qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff
> > qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833:
> > kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
> > Aborted
> >
> > So make the number of PMCs configurable for the vm? Any better idea?
>
> Coincidentally we hit a similar problem a few days ago with -cpu host - it
> took me quite a while to spot that the difference between the machines was
> the source had hyperthreading disabled.
>
> An option to set the number of counters makes sense to me; but I wonder how
> many other options we need as well. Also, I'm not sure there's any easy way
> for libvirt etc to figure out how many counters a host supports - it's not
> in /proc/cpuinfo.

We actually try to avoid /proc/cpuinfo wherever possible. We do direct
CPUID asm instructions to identify features, and prefer to use
/sys/devices/system/cpu if that has suitable data.

Where do the PMC counts come from originally? CPUID or something else?

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

* Daniel P. Berrange (address@hidden) wrote:
> Where do the PMC counts come from originally ? CPUID or something else ?

Yes, they're bits 8..15 of CPUID leaf 0xa

Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK

On Mon, Apr 24, 2017 at 11:27:16AM +0100, Dr. David Alan Gilbert wrote:
> Yes, they're bits 8..15 of CPUID leaf 0xa

Ok, that's easy enough for libvirt to detect then. More a question of what
libvirt should then do with the info....
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

> -----Original Message-----
> From: Daniel P. Berrange [mailto:address@hidden]
> Sent: Monday, April 24, 2017 6:34 PM
> To: Dr. David Alan Gilbert
> Cc: Zhuangyanying; Zhanghailiang; wangxin (U); address@hidden;
> Gonglei (Arei); Huangzhichao; address@hidden
> Subject: Re: [Qemu-devel] [BUG] Migrate fails between boards with different
> PMC counts
>
> Ok, that's easy enough for libvirt to detect then. More a question of what
> libvirt should then do with the info....

Do you mean to do a validation at the beginning of migration, in
qemuMigrationBakeCookie() & qemuMigrationEatCookie(): if the PMC numbers are
not equal, just quit migration? That may be a good enough first version.
But going further, maybe it's better to support heterogeneous migration, so
we might need to make the PMC number configurable; then we need to modify
KVM/qemu as well.

Regards,
-Zhuang Yanying

* Zhuangyanying (address@hidden) wrote:
> But going further, maybe it's better to support heterogeneous migration, so
> we might need to make the PMC number configurable; then we need to modify
> KVM/qemu as well.

Yes agreed; the only thing I wanted to check was that libvirt would have
enough information to be able to use any feature we added to QEMU.

Dave
--
Dr.
David Alan Gilbert / address@hidden / Manchester, UK

diff --git a/results/classifier/02/semantic/gitlab_semantic_addsubps b/results/classifier/02/semantic/gitlab_semantic_addsubps
deleted file mode 100644
index 4334dc949..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_addsubps
+++ /dev/null
@@ -1,29 +0,0 @@
semantic: 0.974
instruction: 0.931
other: 0.732
boot: 0.465
mistranslation: 0.299

x86 SSE/SSE2/SSE3 instruction semantic bugs with NaN

Description of problem
The result of SSE/SSE2/SSE3 instructions with a NaN operand is different from the CPU. The Intel manual, Volume 1 Appendix D.4.2.2, defines the behavior of such instructions with NaN, but QEMU does not appear to implement this semantic exactly, because the byte result differs.

Steps to reproduce

Compile this code:

void main() {
    asm("mov rax, 0x000000007fffffff; push rax; mov rax, 0x00000000ffffffff; push rax; movdqu XMM1, [rsp];");
    asm("mov rax, 0x2e711de7aa46af1a; push rax; mov rax, 0x7fffffff7fffffff; push rax; movdqu XMM2, [rsp];");
    asm("addsubps xmm1, xmm2");
}

Execute and compare the result with the CPU. This problem happens with other SSE/SSE2/SSE3 instructions specified in the manual, Volume 1 Appendix D.4.2.2.

CPU:  xmm1[3] = 0xffffffff
QEMU: xmm1[3] = 0x7fffffff

Additional information
This bug was discovered by research conducted by KAIST SoftSec.

diff --git a/results/classifier/02/semantic/gitlab_semantic_adox b/results/classifier/02/semantic/gitlab_semantic_adox
deleted file mode 100644
index 5ee748f82..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_adox
+++ /dev/null
@@ -1,42 +0,0 @@
semantic: 0.990
instruction: 0.944
boot: 0.599
mistranslation: 0.452
other: 0.286

x86 ADOX and ADCX semantic bug

Description of problem
The result of instructions ADOX and ADCX is different from the CPU. The value of one of the EFLAGS is different.
Steps to reproduce

Compile this code:

void main() {
    asm("push 512; popfq;");
    asm("mov rax, 0xffffffff84fdbf24");
    asm("mov rbx, 0xb197d26043bec15d");
    asm("adox eax, ebx");
}

Execute and compare the result with the CPU. This problem happens with ADCX, too (with CF).

CPU:  OF = 0
QEMU: OF = 1

Additional information
This bug was discovered by research conducted by KAIST SoftSec.

diff --git a/results/classifier/02/semantic/gitlab_semantic_bextr b/results/classifier/02/semantic/gitlab_semantic_bextr
deleted file mode 100644
index b0b902d9c..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_bextr
+++ /dev/null
@@ -1,31 +0,0 @@
semantic: 0.993
instruction: 0.944
boot: 0.516
mistranslation: 0.337
other: 0.099

x86 BEXTR semantic bug

Description of problem
The result of instruction BEXTR is different from the CPU. The value of the destination register is different. I think QEMU does not consider the operand-size limit.

Steps to reproduce

Compile this code:

void main() {
    asm("mov rax, 0x17b3693f77fb6e9");
    asm("mov rbx, 0x8f635a775ad3b9b4");
    asm("mov rcx, 0xb717b75da9983018");
    asm("bextr eax, ebx, ecx");
}

Execute and compare the result with the CPU.

CPU:  RAX = 0x5a
QEMU: RAX = 0x635a775a

Additional information
This bug was discovered by research conducted by KAIST SoftSec.

diff --git a/results/classifier/02/semantic/gitlab_semantic_blsi b/results/classifier/02/semantic/gitlab_semantic_blsi
deleted file mode 100644
index 892674b33..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_blsi
+++ /dev/null
@@ -1,26 +0,0 @@
semantic: 0.983
instruction: 0.964
boot: 0.678
other: 0.609
mistranslation: 0.606

x86 BLSI and BLSR semantic bug

Description of problem
The result of instructions BLSI and BLSR is different from the CPU. The value of CF is different.
Steps to reproduce

Compile this code:

void main() {
    asm("blsi rax, rbx");
}

Execute and compare the result with the CPU. The value of CF is exactly the opposite. This problem happens with BLSR, too.

Additional information
This bug was discovered by research conducted by KAIST SoftSec.

diff --git a/results/classifier/02/semantic/gitlab_semantic_blsmsk b/results/classifier/02/semantic/gitlab_semantic_blsmsk
deleted file mode 100644
index 245a1326f..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_blsmsk
+++ /dev/null
@@ -1,33 +0,0 @@
semantic: 0.987
instruction: 0.962
mistranslation: 0.603
boot: 0.585
other: 0.269

x86 BLSMSK semantic bug

Description of problem
The result of instruction BLSMSK is different from the CPU. The value of CF is different.

Steps to reproduce

Compile this code:

void main() {
    asm("mov rax, 0x65b2e276ad27c67");
    asm("mov rbx, 0x62f34955226b2b5d");
    asm("blsmsk eax, ebx");
}

Execute and compare the result with the CPU.

CPU:  CF = 0
QEMU: CF = 1

Additional information
This bug was discovered by research conducted by KAIST SoftSec.

diff --git a/results/classifier/02/semantic/gitlab_semantic_bzhi b/results/classifier/02/semantic/gitlab_semantic_bzhi
deleted file mode 100644
index 944cd8148..000000000
--- a/results/classifier/02/semantic/gitlab_semantic_bzhi
+++ /dev/null
@@ -1,44 +0,0 @@
semantic: 0.920
instruction: 0.623
boot: 0.220
mistranslation: 0.171
other: 0.064

x86 BZHI semantic bug

Description of problem
The result of instruction BZHI is different from the CPU. The value of the destination register and SF of EFLAGS are different.
- -CPU - -RAX = 0x0x80000000ffffffff -SF = 1 - - -QEMU - -RAX = 0xffffffff -SF = 0 - - - - - - -Additional information -This bug is discovered by research conducted by KAIST SoftSec. |