diff options
Diffstat (limited to 'classification_output/03/semantic')
| -rw-r--r-- | classification_output/03/semantic/12360755 | 299 | ||||
| -rw-r--r-- | classification_output/03/semantic/28596630 | 116 | ||||
| -rw-r--r-- | classification_output/03/semantic/30680944 | 598 | ||||
| -rw-r--r-- | classification_output/03/semantic/46572227 | 409 | ||||
| -rw-r--r-- | classification_output/03/semantic/53568181 | 81 | ||||
| -rw-r--r-- | classification_output/03/semantic/80570214 | 403 | ||||
| -rw-r--r-- | classification_output/03/semantic/96782458 | 1002 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_addsubps | 31 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_adox | 44 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_bextr | 33 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_blsi | 28 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_blsmsk | 35 | ||||
| -rw-r--r-- | classification_output/03/semantic/gitlab_semantic_bzhi | 46 |
13 files changed, 3125 insertions, 0 deletions
diff --git a/classification_output/03/semantic/12360755 b/classification_output/03/semantic/12360755 new file mode 100644 index 00000000..c0c31442 --- /dev/null +++ b/classification_output/03/semantic/12360755 @@ -0,0 +1,299 @@ +semantic: 0.911 +instruction: 0.894 +other: 0.886 +mistranslation: 0.844 +boot: 0.818 +KVM: 0.770 +network: 0.738 + +[Qemu-devel] [BUG] virtio-net linux driver fails to probe on MIPS Malta since 'hw/virtio-pci: fix virtio behaviour' + +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? + +Cheers +James +signature.asc +Description: +Digital signature + +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +Cheers +James + +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +> +On 03/17/2017 11:57 PM, James Hogan wrote: +> +> Hi, +> +> +> +> I've bisected the following failure of the virtio_net linux v4.10 driver +> +> to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: +> +> +> +> virtio_net virtio0: virtio: device uses modern interface but does not have +> +> VIRTIO_F_VERSION_1 +> +> virtio_net: probe of virtio0 failed with error -22 +> +> +> +> To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). +> +> +> +> It appears that adding ",disable-modern=on,disable-legacy=off" to the +> +> virtio-net -device makes it work again. +> +> +> +> I presume this should really just work out of the box. Any ideas why it +> +> isn't? +> +> +> +> +Hi, +> +> +> +This is strange. This commit changes virtio devices from legacy to virtio +> +"transitional". +> +(your command line changes it to legacy) +> +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +> +side +> +there is nothing new. +> +> +Michael, do you have any idea? +> +> +Thanks, +> +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. + +> +> Cheers +> +> James +> +> + +On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Sure, + +Thanks, +Marcel +Cheers +James + +On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Hi James, + +Can you please check if the below patch fixes the problem? +Please note it is not a solution. + +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +index f9b7244..5b4d429 100644 +--- a/hw/virtio/virtio-pci.c ++++ b/hw/virtio/virtio-pci.c +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +Error **errp) + } + + pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +- PCI_BASE_ADDRESS_SPACE_MEMORY | +- PCI_BASE_ADDRESS_MEM_PREFETCH | +- PCI_BASE_ADDRESS_MEM_TYPE_64, ++ PCI_BASE_ADDRESS_SPACE_MEMORY, + &proxy->modern_bar); + + proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); + + +Thanks, +Marcel + +Hi Marcel, + +On Tue, Mar 21, 2017 at 04:16:58PM +0200, Marcel Apfelbaum wrote: +> +Can you please check if the below patch fixes the problem? +> +Please note it is not a solution. +> +> +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +> +index f9b7244..5b4d429 100644 +> +--- a/hw/virtio/virtio-pci.c +> ++++ b/hw/virtio/virtio-pci.c +> +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +> +Error **errp) +> +} +> +> +pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +> +- PCI_BASE_ADDRESS_SPACE_MEMORY | +> +- PCI_BASE_ADDRESS_MEM_PREFETCH | +> +- PCI_BASE_ADDRESS_MEM_TYPE_64, +> ++ PCI_BASE_ADDRESS_SPACE_MEMORY, +> +&proxy->modern_bar); +> +> +proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); +Sorry for the delay trying this, I was away last week. + +No, it doesn't seem to make any difference. + +Thanks +James +signature.asc +Description: +Digital signature + diff --git a/classification_output/03/semantic/28596630 b/classification_output/03/semantic/28596630 new file mode 100644 index 00000000..a4307c19 --- /dev/null +++ b/classification_output/03/semantic/28596630 @@ -0,0 +1,116 @@ +semantic: 0.814 +mistranslation: 0.813 +network: 0.780 +instruction: 0.748 +other: 0.707 +KVM: 0.649 +boot: 0.609 + +[Qemu-devel] [BUG] [low severity] a strange appearance of message involving slirp while doing "empty" make + +Folks, + +If qemu tree is already fully built, and "make" is attempted, for 3.1, the +outcome is: + +$ make + CHK version_gen.h +$ + +For 4.0-rc0, the outcome seems to be different: + +$ make +make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' +make[1]: Nothing to be done for 'all'. +make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' + CHK version_gen.h +$ + +Not sure how significant is that, but I report it just in case. + +Yours, +Aleksandar + +On 20/03/2019 22.08, Aleksandar Markovic wrote: +> +Folks, +> +> +If qemu tree is already fully built, and "make" is attempted, for 3.1, the +> +outcome is: +> +> +$ make +> +CHK version_gen.h +> +$ +> +> +For 4.0-rc0, the outcome seems to be different: +> +> +$ make +> +make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' +> +make[1]: Nothing to be done for 'all'. +> +make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' +> +CHK version_gen.h +> +$ +> +> +Not sure how significant is that, but I report it just in case. +It's likely because slirp is currently being reworked to become a +separate project, so the makefiles have been changed a little bit. I +guess the message will go away again once slirp has become a stand-alone +library. + + Thomas + +On Fri, 22 Mar 2019 at 04:59, Thomas Huth <address@hidden> wrote: +> +On 20/03/2019 22.08, Aleksandar Markovic wrote: +> +> $ make +> +> make[1]: Entering directory '/home/build/malta-mips64r6/qemu-4.0/slirp' +> +> make[1]: Nothing to be done for 'all'. +> +> make[1]: Leaving directory '/home/build/malta-mips64r6/qemu-4.0/slirp' +> +> CHK version_gen.h +> +> $ +> +> +> +> Not sure how significant is that, but I report it just in case. +> +> +It's likely because slirp is currently being reworked to become a +> +separate project, so the makefiles have been changed a little bit. I +> +guess the message will go away again once slirp has become a stand-alone +> +library. +Well, we'll still need to ship slirp for the foreseeable future... + +I think the cause of this is that the rule in Makefile for +calling the slirp Makefile is not passing it $(SUBDIR_MAKEFLAGS) +like all the other recursive make invocations. If we do that +then we'll suppress the entering/leaving messages for +non-verbose builds. (Some tweaking will be needed as +it looks like the slirp makefile has picked an incompatible +meaning for $BUILD_DIR, which the SUBDIR_MAKEFLAGS will +also be passing to it.) + +thanks +-- PMM + diff --git a/classification_output/03/semantic/30680944 b/classification_output/03/semantic/30680944 new file mode 100644 index 00000000..ee4cea99 --- /dev/null +++ b/classification_output/03/semantic/30680944 @@ -0,0 +1,598 @@ +semantic: 0.953 +other: 0.944 +instruction: 0.919 +boot: 0.840 +network: 0.813 +mistranslation: 0.799 +KVM: 0.701 + +[BUG]QEMU jump into interrupt when single-stepping on aarch64 + +Dear, folks, + +I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 +platform, +the added breakpoint hits but after I type `step`, the gdb always jumps into +interrupt. + +My env: + + gdb-10.2 + qemu-6.2.0 + host kernel: 5.10.84 + VM kernel: 5.10.84 + +The steps to reproduce: + # host console: run a VM with only one core, the import arg: <qemu:arg +value='-s'/> + # details can be found here: +https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt +virsh create dev_core0.xml + + # run gdb client + gdb ./vmlinux + + # gdb client on host console + (gdb) dir +./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 + (gdb) target remote localhost:1234 + (gdb) info b + Num Type Disp Enb Address What + 1 breakpoint keep y <MULTIPLE> + 1.1 y 0xffff800010361444 +mm/memory-failure.c:1318 + 1.2 y 0xffff800010361450 in memory_failure + at mm/memory-failure.c:1488 + (gdb) c + Continuing. + + # console in VM, use madvise to inject a hwposion at virtual address +vaddr, + # which will hit the b inmemory_failur: madvise(vaddr, pagesize, +MADV_HWPOISON); + # and the VM pause + ./run_madvise.c + + # gdb client on host console + (gdb) + Continuing. + Breakpoint 1, 0xffff800010361444 in memory_failure () at +mm/memory-failure.c:1318 + 1318 res = -EHWPOISON; + (gdb) n + vectors () at arch/arm64/kernel/entry.S:552 + 552 kernel_ventry 1, irq // IRQ +EL1h + (gdb) n + (gdb) n + (gdb) n + (gdb) n + gic_handle_irq (regs=0xffff8000147c3b80) at +drivers/irqchip/irq-gic-v3.c:721 + # after several step, I got the irqnr + (gdb) p irqnr + $5 = 8262 + +Sometimes, the irqnr is 27ï¼ which is used for arch_timer. + +I was wondering do you have any comments on this? And feedback are welcomed. + +Thank you. + +Best Regards. +Shuai + +On 4/6/22 09:30, Shuai Xue wrote: +Dear, folks, + +I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 +platform, +the added breakpoint hits but after I type `step`, the gdb always jumps into +interrupt. + +My env: + + gdb-10.2 + qemu-6.2.0 + host kernel: 5.10.84 + VM kernel: 5.10.84 + +The steps to reproduce: + # host console: run a VM with only one core, the import arg: <qemu:arg +value='-s'/> + # details can be found here: +https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt +virsh create dev_core0.xml + + # run gdb client + gdb ./vmlinux + + # gdb client on host console + (gdb) dir +./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 + (gdb) target remote localhost:1234 + (gdb) info b + Num Type Disp Enb Address What + 1 breakpoint keep y <MULTIPLE> + 1.1 y 0xffff800010361444 +mm/memory-failure.c:1318 + 1.2 y 0xffff800010361450 in memory_failure + at mm/memory-failure.c:1488 + (gdb) c + Continuing. + + # console in VM, use madvise to inject a hwposion at virtual address +vaddr, + # which will hit the b inmemory_failur: madvise(vaddr, pagesize, +MADV_HWPOISON); + # and the VM pause + ./run_madvise.c + + # gdb client on host console + (gdb) + Continuing. + Breakpoint 1, 0xffff800010361444 in memory_failure () at +mm/memory-failure.c:1318 + 1318 res = -EHWPOISON; + (gdb) n + vectors () at arch/arm64/kernel/entry.S:552 + 552 kernel_ventry 1, irq // IRQ +EL1h +The 'n' command is not a single-step: use stepi, which will suppress interrupts. +Anyway, not a bug. + +r~ + +å¨ 2022/4/7 AM12:57, Richard Henderson åé: +> +On 4/6/22 09:30, Shuai Xue wrote: +> +> Dear, folks, +> +> +> +> I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 +> +> platform, +> +> the added breakpoint hits but after I type `step`, the gdb always jumps into +> +> interrupt. +> +> +> +> My env: +> +> +> +>     gdb-10.2 +> +>     qemu-6.2.0 +> +>     host kernel: 5.10.84 +> +>     VM kernel: 5.10.84 +> +> +> +> The steps to reproduce: +> +>     # host console: run a VM with only one core, the import arg: <qemu:arg +> +> value='-s'/> +> +>     # details can be found here: +> +> +https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt +> +>     virsh create dev_core0.xml +> +>     +> +>     # run gdb client +> +>     gdb ./vmlinux +> +> +> +>     # gdb client on host console +> +>     (gdb) dir +> +> ./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 +> +>     (gdb) target remote localhost:1234 +> +>     (gdb) info b +> +>     Num    Type          Disp Enb Address           What +> +>     1      breakpoint    keep y  <MULTIPLE> +> +>     1.1                        y  0xffff800010361444 +> +> mm/memory-failure.c:1318 +> +>     1.2                        y  0xffff800010361450 in memory_failure +> +>                                                    at +> +> mm/memory-failure.c:1488 +> +>     (gdb) c +> +>     Continuing. +> +> +> +>     # console in VM, use madvise to inject a hwposion at virtual address +> +> vaddr, +> +>     # which will hit the b inmemory_failur: madvise(vaddr, pagesize, +> +> MADV_HWPOISON); +> +>     # and the VM pause +> +>     ./run_madvise.c +> +> +> +>     # gdb client on host console +> +>     (gdb) +> +>     Continuing. +> +>     Breakpoint 1, 0xffff800010361444 in memory_failure () at +> +> mm/memory-failure.c:1318 +> +>     1318                   res = -EHWPOISON; +> +>     (gdb) n +> +>     vectors () at arch/arm64/kernel/entry.S:552 +> +>     552            kernel_ventry  1, irq                         // IRQ +> +> EL1h +> +> +The 'n' command is not a single-step: use stepi, which will suppress +> +interrupts. +> +Anyway, not a bug. +> +> +r~ +Hi, Richard, + +Thank you for your quick reply, I also try `stepi`, but it does NOT work either. + + (gdb) c + Continuing. + + Breakpoint 1, memory_failure (pfn=1273982, flags=1) at +mm/memory-failure.c:1488 + 1488 { + (gdb) stepi + vectors () at arch/arm64/kernel/entry.S:552 + 552 kernel_ventry 1, irq // IRQ +EL1h + +According to QEMU doc[1]: the default single stepping behavior is step with the +IRQs +and timer service routines off. I checked the MASK bits used to control the +single +stepping IE on my machine as bellow: + + # gdb client on host (x86 plafrom) + (gdb) maintenance packet qqemu.sstepbits + sending: "qqemu.sstepbits" + received: "ENABLE=1,NOIRQ=2,NOTIMER=4" + +The sstep MASK looks as expected, but does not work as expected. + +I also try the same kernel and qemu version on X86 platform: +> +> gdb-10.2 +> +> qemu-6.2.0 +> +> host kernel: 5.10.84 +> +> VM kernel: 5.10.84 +The command `n` jumps to the next instruction. + + # gdb client on host (x86 plafrom) + (gdb) b memory-failure.c:1488 + Breakpoint 1, memory_failure (pfn=1128931, flags=1) at +mm/memory-failure.c:1488 + 1488 { + (gdb) n + 1497 if (!sysctl_memory_failure_recovery) + (gdb) stepi + 0xffffffff812efdbc 1497 if +(!sysctl_memory_failure_recovery) + (gdb) stepi + 0xffffffff812efdbe 1497 if +(!sysctl_memory_failure_recovery) + (gdb) n + 1500 p = pfn_to_online_page(pfn); + (gdb) l + 1496 + 1497 if (!sysctl_memory_failure_recovery) + 1498 panic("Memory failure on page %lx", pfn); + 1499 + 1500 p = pfn_to_online_page(pfn); + 1501 if (!p) { + +Best Regrades, +Shuai + + +[1] +https://github.com/qemu/qemu/blob/master/docs/system/gdb.rst + +å¨ 2022/4/7 PM12:10, Shuai Xue åé: +> +å¨ 2022/4/7 AM12:57, Richard Henderson åé: +> +> On 4/6/22 09:30, Shuai Xue wrote: +> +>> Dear, folks, +> +>> +> +>> I try to debug Linux kernel with QEMU in single-stepping mode on aarch64 +> +>> platform, +> +>> the added breakpoint hits but after I type `step`, the gdb always jumps +> +>> into interrupt. +> +>> +> +>> My env: +> +>> +> +>>     gdb-10.2 +> +>>     qemu-6.2.0 +> +>>     host kernel: 5.10.84 +> +>>     VM kernel: 5.10.84 +> +>> +> +>> The steps to reproduce: +> +>>     # host console: run a VM with only one core, the import arg: <qemu:arg +> +>> value='-s'/> +> +>>     # details can be found here: +> +>> +https://www.redhat.com/en/blog/debugging-kernel-qemulibvirt +> +>>     virsh create dev_core0.xml +> +>>     +> +>>     # run gdb client +> +>>     gdb ./vmlinux +> +>> +> +>>     # gdb client on host console +> +>>     (gdb) dir +> +>> ./usr/src/debug/kernel-5.10.84/linux-5.10.84-004.alpha.ali5000.alios7.aarch64 +> +>>     (gdb) target remote localhost:1234 +> +>>     (gdb) info b +> +>>     Num    Type          Disp Enb Address           What +> +>>     1      breakpoint    keep y  <MULTIPLE> +> +>>     1.1                        y  0xffff800010361444 +> +>> mm/memory-failure.c:1318 +> +>>     1.2                        y  0xffff800010361450 in memory_failure +> +>>                                                    at +> +>> mm/memory-failure.c:1488 +> +>>     (gdb) c +> +>>     Continuing. +> +>> +> +>>     # console in VM, use madvise to inject a hwposion at virtual address +> +>> vaddr, +> +>>     # which will hit the b inmemory_failur: madvise(vaddr, pagesize, +> +>> MADV_HWPOISON); +> +>>     # and the VM pause +> +>>     ./run_madvise.c +> +>> +> +>>     # gdb client on host console +> +>>     (gdb) +> +>>     Continuing. +> +>>     Breakpoint 1, 0xffff800010361444 in memory_failure () at +> +>> mm/memory-failure.c:1318 +> +>>     1318                   res = -EHWPOISON; +> +>>     (gdb) n +> +>>     vectors () at arch/arm64/kernel/entry.S:552 +> +>>     552            kernel_ventry  1, irq                         // IRQ +> +>> EL1h +> +> +> +> The 'n' command is not a single-step: use stepi, which will suppress +> +> interrupts. +> +> Anyway, not a bug. +> +> +> +> r~ +> +> +Hi, Richard, +> +> +Thank you for your quick reply, I also try `stepi`, but it does NOT work +> +either. +> +> +(gdb) c +> +Continuing. +> +> +Breakpoint 1, memory_failure (pfn=1273982, flags=1) at +> +mm/memory-failure.c:1488 +> +1488 { +> +(gdb) stepi +> +vectors () at arch/arm64/kernel/entry.S:552 +> +552 kernel_ventry 1, irq // IRQ +> +EL1h +> +> +According to QEMU doc[1]: the default single stepping behavior is step with +> +the IRQs +> +and timer service routines off. I checked the MASK bits used to control the +> +single +> +stepping IE on my machine as bellow: +> +> +# gdb client on host (x86 plafrom) +> +(gdb) maintenance packet qqemu.sstepbits +> +sending: "qqemu.sstepbits" +> +received: "ENABLE=1,NOIRQ=2,NOTIMER=4" +> +> +The sstep MASK looks as expected, but does not work as expected. +> +> +I also try the same kernel and qemu version on X86 platform: +> +>> gdb-10.2 +> +>> qemu-6.2.0 +> +>> host kernel: 5.10.84 +> +>> VM kernel: 5.10.84 +> +> +> +The command `n` jumps to the next instruction. +> +> +# gdb client on host (x86 plafrom) +> +(gdb) b memory-failure.c:1488 +> +Breakpoint 1, memory_failure (pfn=1128931, flags=1) at +> +mm/memory-failure.c:1488 +> +1488 { +> +(gdb) n +> +1497 if (!sysctl_memory_failure_recovery) +> +(gdb) stepi +> +0xffffffff812efdbc 1497 if +> +(!sysctl_memory_failure_recovery) +> +(gdb) stepi +> +0xffffffff812efdbe 1497 if +> +(!sysctl_memory_failure_recovery) +> +(gdb) n +> +1500 p = pfn_to_online_page(pfn); +> +(gdb) l +> +1496 +> +1497 if (!sysctl_memory_failure_recovery) +> +1498 panic("Memory failure on page %lx", pfn); +> +1499 +> +1500 p = pfn_to_online_page(pfn); +> +1501 if (!p) { +> +> +Best Regrades, +> +Shuai +> +> +> +[1] +https://github.com/qemu/qemu/blob/master/docs/system/gdb.rst +Hi, Richard, + +I was wondering that do you have any comments to this? + +Best Regrades, +Shuai + diff --git a/classification_output/03/semantic/46572227 b/classification_output/03/semantic/46572227 new file mode 100644 index 00000000..49046d2a --- /dev/null +++ b/classification_output/03/semantic/46572227 @@ -0,0 +1,409 @@ +semantic: 0.965 +mistranslation: 0.946 +other: 0.927 +instruction: 0.906 +boot: 0.900 +KVM: 0.857 +network: 0.841 + +[Qemu-devel] [Bug?] Windows 7's time drift obviously while RTC rate switching frequently between high and low timer rate + +Hi, + +We tested with the latest QEMU, and found that time drift obviously (clock fast +in guest) +in Windows 7 64 bits guest in some cases. + +It is easily to reproduce, using the follow QEMU command line to start windows +7: + +# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine +pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp +4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet +-global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc +:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device +piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1 +-device usb-kbd,id=input2 -monitor stdio + +Adjust the VM's time to host time, and run java application or run the follow +program +in windows 7: + +#pragma comment(lib, "winmm") +#include <stdio.h> +#include <windows.h> + +#define SWITCH_PEROID 13 + +int main() +{ + DWORD count = 0; + + while (1) + { + count++; + timeBeginPeriod(1); + DWORD start = timeGetTime(); + Sleep(40); + timeEndPeriod(1); + if ((count % SWITCH_PEROID) == 0) { + Sleep(1); + } + } + return 0; +} + +After few minutes, you will find that the time in windows 7 goes ahead of the +host time, drifts about several seconds. + +I have dug deeper in this problem. For windows systems that use the CMOS timer, +the base interrupt rate is usually 64Hz, but running some application in VM +will raise the timer rate to 1024Hz, running java application and or above +program will raise the timer rate. +Besides, Windows operating systems generally keep time by counting timer +interrupts (ticks). But QEMU seems not emulate the rate converting fine. + +We update the timer in function periodic_timer_update(): +static void periodic_timer_update(RTCState *s, int64_t current_time) +{ + + cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, get_ticks_per_sec()); + next_irq_clock = (cur_clock & ~(period - 1)) + period; + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Here we calculate the next interrupt time by align the current clock with the +new period, I'm a little confused that why we care about the *history* time ? +If VM switches from high rate to low rate, the next interrupt time may come +earlier than it supposed to be. We have observed it in our test. we printed the +interval time of interrupts and the VM's current time (We got the time from VM). + +Here is part of the log: +... ... +period=512 irq inject 1534: 15625 us +Tue Mar 29 04:38:00 2016 +*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696 +us +... ... +*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086 +us +Convert 32 --- > 512: 703: 96578 us +period=512 irq inject 44391: 12702 us +Convert 512 --- > 32: 704: 12704 us11 +period=32 irq inject 44392: 979 us +... ... +32 --- > 512: 705: 24388 us +period=512 irq inject 44417: 6834 us +Convert 512 --- > 32: 706: 6830 us +period=32 irq inject 44418: 978 us +... ... +Convert 32 --- > 512: 707: 60525 us +period=512 irq inject 44480: 1945 us +Convert 512 --- > 32: 708: 1955 us +period=32 irq inject 44481: 977 us +... ... +Convert 32 --- > 512: 709: 36105 us +period=512 irq inject 44518: 10741 us +Convert 512 --- > 32: 710: 10736 us +period=32 irq inject 44519: 989 us +... ... +Convert 32 --- > 512: 711: 123998 us +period=512 irq inject 44646: 974 us +period=512 irq inject 44647: 15607 us +Convert 512 --- > 32: 712: 16560 us +period=32 irq inject 44648: 980 us +... ... +period=32 irq inject 44738: 974 us +Convert 32 --- > 512: 713: 88828 us +period=512 irq inject 44739: 4885 us +Convert 512 --- > 32: 714: 4882 us +period=32 irq inject 44740: 989 us +... ... +period=32 irq inject 44842: 974 us +Convert 32 --- > 512: 715: 100537 us +period=512 irq inject 44843: 8788 us +Convert 512 --- > 32: 716: 8789 us +period=32 irq inject 44844: 972 us +... ... +period=32 irq inject 44941: 979 us +Convert 32 --- > 512: 717: 95677 us +period=512 irq inject 44942: 13661 us +Convert 512 --- > 32: 718: 13657 us +period=32 irq inject 44943: 987 us +... ... +Convert 32 --- > 512: 719: 94690 us +period=512 irq inject 45040: 14643 us +Convert 512 --- > 32: 720: 14642 us +period=32 irq inject 45041: 974 us +... ... +Convert 32 --- > 512: 721: 88848 us +period=512 irq inject 45132: 4892 us +Convert 512 --- > 32: 722: 4931 us +period=32 irq inject 45133: 964 us +... ... +Tue Mar 29 04:39:19 2016 +*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is +911520 us + +For windows 7, it has got 835 IRQs which injected during the period of 32, +and got 11 IRQs that injected during the period of 512. it updated the +wall-clock +time with one second, because it supposed it has counted +(835*976.5+11*15625)= 987252.5 us, but the real interval time is 911520 us. + +IMHO, we should calculate the next interrupt time based on the time of last +interrupt injected, and it seems to be more similar with hardware CMOS timer +in this way. +Maybe someone can tell me the reason why we calculated the interrupt timer +in that way, or is it a bug ? ;) + +Thanks, +Hailiang + +ping... + +It seems that we can eliminate the drift by the following patch. +(I tested it for two hours, and there is no drift, before, the timer +in Windows 7 drifts about 2 seconds per minute.) I'm not sure if it is +the right way to solve the problem. +Any comments are welcomed. Thanks. + +From bd6acd577cbbc9d92d6376c770219470f184f7de Mon Sep 17 00:00:00 2001 +From: zhanghailiang <address@hidden> +Date: Thu, 31 Mar 2016 16:36:15 -0400 +Subject: [PATCH] timer/mc146818rtc: fix timer drift in Windows OS while RTC + rate converting frequently + +Signed-off-by: zhanghailiang <address@hidden> +--- + hw/timer/mc146818rtc.c | 25 ++++++++++++++++++++++--- + 1 file changed, 22 insertions(+), 3 deletions(-) + +diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c +index 2ac0fd3..e39d2da 100644 +--- a/hw/timer/mc146818rtc.c ++++ b/hw/timer/mc146818rtc.c +@@ -79,6 +79,7 @@ typedef struct RTCState { + /* periodic timer */ + QEMUTimer *periodic_timer; + int64_t next_periodic_time; ++ uint64_t last_periodic_time; + /* update-ended timer */ + QEMUTimer *update_timer; + uint64_t next_alarm_time; +@@ -152,7 +153,8 @@ static void rtc_coalesced_timer(void *opaque) + static void periodic_timer_update(RTCState *s, int64_t current_time) + { + int period_code, period; +- int64_t cur_clock, next_irq_clock; ++ int64_t cur_clock, next_irq_clock, pre_irq_clock; ++ bool change = false; + + period_code = s->cmos_data[RTC_REG_A] & 0x0f; + if (period_code != 0 +@@ -165,14 +167,28 @@ static void periodic_timer_update(RTCState *s, int64_t +current_time) + if (period != s->period) { + s->irq_coalesced = (s->irq_coalesced * s->period) / period; + DPRINTF_C("cmos: coalesced irqs scaled to %d\n", s->irq_coalesced); ++ if (s->period && period) { ++ change = true; ++ } + } + s->period = period; + #endif + /* compute 32 khz clock */ + cur_clock = + muldiv64(current_time, RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND); ++ if (change) { ++ int offset = 0; + +- next_irq_clock = (cur_clock & ~(period - 1)) + period; ++ pre_irq_clock = muldiv64(s->last_periodic_time, RTC_CLOCK_RATE, ++ NANOSECONDS_PER_SECOND); ++ if ((cur_clock - pre_irq_clock) > period) { ++ offset = (cur_clock - pre_irq_clock) / period; ++ } ++ s->irq_coalesced += offset; ++ next_irq_clock = pre_irq_clock + (offset + 1) * period; ++ } else { ++ next_irq_clock = (cur_clock & ~(period - 1)) + period; ++ } + s->next_periodic_time = muldiv64(next_irq_clock, +NANOSECONDS_PER_SECOND, + RTC_CLOCK_RATE) + 1; + timer_mod(s->periodic_timer, s->next_periodic_time); +@@ -187,7 +203,9 @@ static void periodic_timer_update(RTCState *s, int64_t +current_time) + static void rtc_periodic_timer(void *opaque) + { + RTCState *s = opaque; +- ++ int64_t next_periodic_time; ++ ++ next_periodic_time = s->next_periodic_time; + periodic_timer_update(s, s->next_periodic_time); + s->cmos_data[RTC_REG_C] |= REG_C_PF; + if (s->cmos_data[RTC_REG_B] & REG_B_PIE) { +@@ -204,6 +222,7 @@ static void rtc_periodic_timer(void *opaque) + DPRINTF_C("cmos: coalesced irqs increased to %d\n", + s->irq_coalesced); + } ++ s->last_periodic_time = next_periodic_time; + } else + #endif + qemu_irq_raise(s->irq); +-- +1.8.3.1 + + +On 2016/3/29 19:58, Hailiang Zhang wrote: +Hi, + +We tested with the latest QEMU, and found that time drift obviously (clock fast +in guest) +in Windows 7 64 bits guest in some cases. + +It is easily to reproduce, using the follow QEMU command line to start windows +7: + +# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine +pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp +4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet +-global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc +:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device +piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1 +-device usb-kbd,id=input2 -monitor stdio + +Adjust the VM's time to host time, and run java application or run the follow +program +in windows 7: + +#pragma comment(lib, "winmm") +#include <stdio.h> +#include <windows.h> + +#define SWITCH_PEROID 13 + +int main() +{ + DWORD count = 0; + + while (1) + { + count++; + timeBeginPeriod(1); + DWORD start = timeGetTime(); + Sleep(40); + timeEndPeriod(1); + if ((count % SWITCH_PEROID) == 0) { + Sleep(1); + } + } + return 0; +} + +After few minutes, you will find that the time in windows 7 goes ahead of the +host time, drifts about several seconds. + +I have dug deeper in this problem. For windows systems that use the CMOS timer, +the base interrupt rate is usually 64Hz, but running some application in VM +will raise the timer rate to 1024Hz, running java application and or above +program will raise the timer rate. +Besides, Windows operating systems generally keep time by counting timer +interrupts (ticks). But QEMU seems not emulate the rate converting fine. + +We update the timer in function periodic_timer_update(): +static void periodic_timer_update(RTCState *s, int64_t current_time) +{ + + cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, +get_ticks_per_sec()); + next_irq_clock = (cur_clock & ~(period - 1)) + period; + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Here we calculate the next interrupt time by align the current clock with the +new period, I'm a little confused that why we care about the *history* time ? +If VM switches from high rate to low rate, the next interrupt time may come +earlier than it supposed to be. We have observed it in our test. we printed the +interval time of interrupts and the VM's current time (We got the time from VM). + +Here is part of the log: +... ... +period=512 irq inject 1534: 15625 us +Tue Mar 29 04:38:00 2016 +*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696 +us +... ... +*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086 +us +Convert 32 --- > 512: 703: 96578 us +period=512 irq inject 44391: 12702 us +Convert 512 --- > 32: 704: 12704 us11 +period=32 irq inject 44392: 979 us +... ... +32 --- > 512: 705: 24388 us +period=512 irq inject 44417: 6834 us +Convert 512 --- > 32: 706: 6830 us +period=32 irq inject 44418: 978 us +... ... +Convert 32 --- > 512: 707: 60525 us +period=512 irq inject 44480: 1945 us +Convert 512 --- > 32: 708: 1955 us +period=32 irq inject 44481: 977 us +... ... +Convert 32 --- > 512: 709: 36105 us +period=512 irq inject 44518: 10741 us +Convert 512 --- > 32: 710: 10736 us +period=32 irq inject 44519: 989 us +... ... +Convert 32 --- > 512: 711: 123998 us +period=512 irq inject 44646: 974 us +period=512 irq inject 44647: 15607 us +Convert 512 --- > 32: 712: 16560 us +period=32 irq inject 44648: 980 us +... ... +period=32 irq inject 44738: 974 us +Convert 32 --- > 512: 713: 88828 us +period=512 irq inject 44739: 4885 us +Convert 512 --- > 32: 714: 4882 us +period=32 irq inject 44740: 989 us +... ... +period=32 irq inject 44842: 974 us +Convert 32 --- > 512: 715: 100537 us +period=512 irq inject 44843: 8788 us +Convert 512 --- > 32: 716: 8789 us +period=32 irq inject 44844: 972 us +... ... +period=32 irq inject 44941: 979 us +Convert 32 --- > 512: 717: 95677 us +period=512 irq inject 44942: 13661 us +Convert 512 --- > 32: 718: 13657 us +period=32 irq inject 44943: 987 us +... ... +Convert 32 --- > 512: 719: 94690 us +period=512 irq inject 45040: 14643 us +Convert 512 --- > 32: 720: 14642 us +period=32 irq inject 45041: 974 us +... ... +Convert 32 --- > 512: 721: 88848 us +period=512 irq inject 45132: 4892 us +Convert 512 --- > 32: 722: 4931 us +period=32 irq inject 45133: 964 us +... ... +Tue Mar 29 04:39:19 2016 +*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is +911520 us + +For windows 7, it has got 835 IRQs which injected during the period of 32, +and got 11 IRQs that injected during the period of 512. it updated the +wall-clock +time with one second, because it supposed it has counted +(835*976.5+11*15625)= 987252.5 us, but the real interval time is 911520 us. + +IMHO, we should calculate the next interrupt time based on the time of last +interrupt injected, and it seems to be more similar with hardware CMOS timer +in this way. +Maybe someone can tell me the reason why we calculated the interrupt timer +in that way, or is it a bug ? ;) + +Thanks, +Hailiang + diff --git a/classification_output/03/semantic/53568181 b/classification_output/03/semantic/53568181 new file mode 100644 index 00000000..a78ad434 --- /dev/null +++ b/classification_output/03/semantic/53568181 @@ -0,0 +1,81 @@ +semantic: 0.943 +instruction: 0.932 +network: 0.925 +other: 0.921 +KVM: 0.917 +boot: 0.876 +mistranslation: 0.854 + +[BUG] x86/PAT handling severely crippled AMD-V SVM KVM performance + +Hi, I maintain an out-of-tree 3D APIs pass-through QEMU device models at +https://github.com/kjliew/qemu-3dfx +that provide 3D acceleration for legacy +32-bit Windows guests (Win98SE, WinME, Win2k and WinXP) with the focus on +playing old legacy games from 1996-2003. It currently supports the now-defunct +3Dfx propriety API called Glide and an alternative OpenGL pass-through based on +MESA implementation. + +The basic concept of both implementations create memory-mapped virtual +interfaces consist of host/guest shared memory with guest-push model instead of +a more common host-pull model for typical QEMU device model implementation. +Guest uses shared memory as FIFOs for drawing commands and data to bulk up the +operations until serialization event that flushes the FIFOs into host. This +achieves extremely good performance since virtual CPUs are fast with hardware +acceleration (Intel VT/AMD-V) and reduces the overhead of frequent VMEXITs to +service the device emulation. Both implementations work on Windows 10 with WHPX +and HAXM accelerators as well as KVM in Linux. + +On Windows 10, QEMU WHPX implementation does not sync MSR_IA32_PAT during +host/guest states sync. There is no visibility into the closed-source WHPX on +how things are managed behind the scene, but from measuring performance figures +I can conclude that it didn't handle the MSR_IA32_PAT correctly for both Intel +and AMD. Call this fair enough, if you will, it didn't flag any concerns, in +fact games such as Quake2 and Quake3 were still within playable frame rate of +40~60FPS on Win2k/XP guest. Until the same games were run on Win98/ME guest and +the frame rate blew off the roof (300~500FPS) on the same CPU and GPU. In fact, +the later seemed to be more inlined with runnng the games bare-metal with vsync +off. + +On Linux (at the time of writing kernel 5.6.7/Mesa 20.0), the difference +prevailed. Intel CPUs (and it so happened that I was on laptop with Intel GPU), +the VMX-based kvm_intel got it right while SVM-based kvm_amd did not. +To put this in simple exaggeration, an aging Core i3-4010U/HD Graphics 4400 +(Haswell GT2) exhibited an insane performance in Quake2/Quake3 timedemos that +totally crushed more recent AMD Ryzen 2500U APU/Vega 8 Graphics and AMD +FX8300/NVIDIA GT730 on desktop. Simply unbelievable! + +It turned out that there was something to do with AMD-V NPT. By loading kvm_amd +with npt=0, AMD Ryzen APU and FX8300 regained a huge performance leap. However, +AMD NPT issue with KVM was supposedly fixed in 2017 kernel commits. NPT=0 would +actually incur performance loss for VM due to intervention required by +hypervisors to maintain the shadow page tables. Finally, I was able to find the +pointer that pointed to MSR_IA32_PAT register. By updating the MSR_IA32_PAT to +0x0606xxxx0606xxxxULL, AMD CPUs now regain their rightful performance without +taking the hit of NPT=0 for Linux KVM. Taking the same solution into Windows, +both Intel and AMD CPUs no longer require Win98/ME guest to unleash the full +performance potentials and performance figures based on games measured on WHPX +were not very far behind Linux KVM. + +So I guess the problem lies in host/guest shared memory regions mapped as +uncacheable from virtual CPU perspective. As virtual CPUs now completely execute +in hardware context with x86 hardware virtualiztion extensions, the cacheability +of memory types would severely impact the performance on guests. WHPX didn't +handle it for both Intel EPT and AMD NPT, but KVM seems to do it right for Intel +EPT. I don't have the correct fix for QEMU. But what I can do for my 3D APIs +pass-through device models is to implement host-side hooks to reprogram and +restore MSR_IA32_PAT upon activation/deactivation of the 3D APIs. Perhaps there +is also a better solution of having the proper kernel drivers for virtual +interfaces to manage the memory types of host/guest shared memory in kernel +space, but to do that and the needs of Microsoft tools/DDKs, I will just forget +it. The guest stubs uses the same kernel drivers included in 3Dfx drivers for +memory mapping and the virtual interfaces remain driver-less from Windows OS +perspective. Considering the current state of halting progress for QEMU native +virgil3D to support Windows OS, I am just being pragmatic. I understand that +QEMU virgil3D will eventually bring 3D acceleration for Windows guests, but I do +not expect anything to support legacy 32-bit Windows OSes which have out-grown +their commercial usefulness. + +Regards, +KJ Liew + diff --git a/classification_output/03/semantic/80570214 b/classification_output/03/semantic/80570214 new file mode 100644 index 00000000..19371726 --- /dev/null +++ b/classification_output/03/semantic/80570214 @@ -0,0 +1,403 @@ +semantic: 0.978 +instruction: 0.978 +other: 0.978 +network: 0.975 +mistranslation: 0.973 +KVM: 0.971 +boot: 0.969 + +[Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user + +Hi, + +We catch a segfault in our project. + +Qemu version is 2.3.0 + +The Stack backtrace is: +(gdb) bt +#0 0x0000000000000000 in ?? () +#1 0x00007f7ad9280b2f in qemu_deliver_packet (sender=<optimized out>, flags=<optimized +out>, data=<optimized out>, size=100, opaque= + 0x7f7ad9d6db10) at net/net.c:510 +#2 0x00007f7ad92831fa in qemu_net_queue_deliver (size=<optimized out>, data=<optimized +out>, flags=<optimized out>, + sender=<optimized out>, queue=<optimized out>) at net/queue.c:157 +#3 qemu_net_queue_flush (queue=0x7f7ad9d39630) at net/queue.c:254 +#4 0x00007f7ad9280dac in qemu_flush_or_purge_queued_packets +(nc=0x7f7ad9d6db10, purge=true) at net/net.c:539 +#5 0x00007f7ad9280e76 in net_vm_change_state_handler (opaque=<optimized out>, +running=<optimized out>, state=100) at net/net.c:1214 +#6 0x00007f7ad915612f in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) +at vl.c:1820 +#7 0x00007f7ad906db1a in do_vm_stop (state=<optimized out>) at +/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:631 +#8 vm_stop (state=RUN_STATE_SHUTDOWN) at +/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:1325 +#9 0x00007f7ad915e4a2 in main_loop_should_exit () at vl.c:2080 +#10 main_loop () at vl.c:2131 +#11 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at +vl.c:4721 +(gdb) p *(NetClientState *)0x7f7ad9d6db10 +$1 = {info = 0x7f7ad9824520, link_down = 0, next = {tqe_next = 0x7f7ad0f06d10, +tqe_prev = 0x7f7ad98b1cf0}, peer = 0x7f7ad0f06d10, + incoming_queue = 0x7f7ad9d39630, model = 0x7f7ad9d39590 "vhost_user", name = +0x7f7ad9d39570 "hostnet0", info_str = + "vhost-user to charnet0", '\000' <repeats 233 times>, receive_disabled = 0, +destructor = + 0x7f7ad92821f0 <qemu_net_client_destructor>, queue_index = 0, +rxfilter_notify_enabled = 0} +(gdb) p *(NetClientInfo *)0x7f7ad9824520 +$2 = {type = NET_CLIENT_OPTIONS_KIND_VHOST_USER, size = 360, receive = 0, +receive_raw = 0, receive_iov = 0, can_receive = 0, cleanup = + 0x7f7ad9288850 <vhost_user_cleanup>, link_status_changed = 0, +query_rx_filter = 0, poll = 0, has_ufo = + 0x7f7ad92886d0 <vhost_user_has_ufo>, has_vnet_hdr = 0x7f7ad9288670 +<vhost_user_has_vnet_hdr>, has_vnet_hdr_len = 0, + using_vnet_hdr = 0, set_offload = 0, set_vnet_hdr_len = 0} +(gdb) + +The corresponding codes where gdb reports error are: (We have added some codes +in net.c) +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is 510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two callback +functions seem to be always NULL, +Why we can come here ? +Is it an error to add VM state change handler for vhost-user ? + +Thanks, +zhanghailiang + +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +> +The corresponding codes where gdb reports error are: (We have added some +> +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions to do so? + +> +ssize_t qemu_deliver_packet(NetClientState *sender, +> +unsigned flags, +> +const uint8_t *data, +> +size_t size, +> +void *opaque) +> +{ +> +NetClientState *nc = opaque; +> +ssize_t ret; +> +> +if (nc->link_down) { +> +return size; +> +} +> +> +if (nc->receive_disabled) { +> +return 0; +> +} +> +> +if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { +> +ret = nc->info->receive_raw(nc, data, size); +> +} else { +> +ret = nc->info->receive(nc, data, size); ----> Here is 510 line +> +} +> +> +I'm not quite familiar with vhost-user, but for vhost-user, these two +> +callback functions seem to be always NULL, +> +Why we can come here ? +You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) + +-- +Marc-André Lureau + +On 2015/11/3 22:54, Marc-André Lureau wrote: +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +The corresponding codes where gdb reports error are: (We have added some +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions to do so? +OK, i will try to do it. There is nothing special, we run iperf tool in VM, +and then shutdown or reboot it. There is change you can catch segfault. +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is 510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two +callback functions seem to be always NULL, +Why we can come here ? +You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) +I have looked at the newest codes, i think we can still have chance to +come here, since we will change nc->receive_disable to false temporarily in +qemu_flush_or_purge_queued_packets(), there is no difference between 2.3 and 2.5 +for this. +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +in qemu_net_queue_flush() for vhost-user ? + +i will try to reproduce it by using newest qemu. + +Thanks, +zhanghailiang + +On 11/04/2015 10:24 AM, zhanghailiang wrote: +> +On 2015/11/3 22:54, Marc-André Lureau wrote: +> +> Hi +> +> +> +> On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +> +> <address@hidden> wrote: +> +>> The corresponding codes where gdb reports error are: (We have added +> +>> some +> +>> codes in net.c) +> +> +> +> Can you reproduce with unmodified qemu? Could you give instructions +> +> to do so? +> +> +> +> +OK, i will try to do it. There is nothing special, we run iperf tool +> +in VM, +> +and then shutdown or reboot it. There is change you can catch segfault. +> +> +>> ssize_t qemu_deliver_packet(NetClientState *sender, +> +>> unsigned flags, +> +>> const uint8_t *data, +> +>> size_t size, +> +>> void *opaque) +> +>> { +> +>> NetClientState *nc = opaque; +> +>> ssize_t ret; +> +>> +> +>> if (nc->link_down) { +> +>> return size; +> +>> } +> +>> +> +>> if (nc->receive_disabled) { +> +>> return 0; +> +>> } +> +>> +> +>> if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { +> +>> ret = nc->info->receive_raw(nc, data, size); +> +>> } else { +> +>> ret = nc->info->receive(nc, data, size); ----> Here is +> +>> 510 line +> +>> } +> +>> +> +>> I'm not quite familiar with vhost-user, but for vhost-user, these two +> +>> callback functions seem to be always NULL, +> +>> Why we can come here ? +> +> +> +> You should not come here, vhost-user has nc->receive_disabled (it +> +> changes in 2.5) +> +> +> +> +I have looked at the newest codes, i think we can still have chance to +> +come here, since we will change nc->receive_disable to false +> +temporarily in +> +qemu_flush_or_purge_queued_packets(), there is no difference between +> +2.3 and 2.5 +> +for this. +> +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +> +in qemu_net_queue_flush() for vhost-user ? +The only thing I can image is self announcing. Are you trying to do +migration? 2.5 only support sending rarp through this. + +And it's better to have a breakpoint to see why a packet was queued for +vhost-user. The stack trace may also help in this case. + +> +> +i will try to reproduce it by using newest qemu. +> +> +Thanks, +> +zhanghailiang +> + +On 2015/11/4 11:19, Jason Wang wrote: +On 11/04/2015 10:24 AM, zhanghailiang wrote: +On 2015/11/3 22:54, Marc-André Lureau wrote: +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +The corresponding codes where gdb reports error are: (We have added +some +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions +to do so? +OK, i will try to do it. There is nothing special, we run iperf tool +in VM, +and then shutdown or reboot it. There is change you can catch segfault. +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is +510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two +callback functions seem to be always NULL, +Why we can come here ? +You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) +I have looked at the newest codes, i think we can still have chance to +come here, since we will change nc->receive_disable to false +temporarily in +qemu_flush_or_purge_queued_packets(), there is no difference between +2.3 and 2.5 +for this. +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +in qemu_net_queue_flush() for vhost-user ? +The only thing I can image is self announcing. Are you trying to do +migration? 2.5 only support sending rarp through this. +Hmm, it's not triggered by migration, For qemu-2.5, IMHO, it doesn't have such +problem, +since the callback function 'receive' is not NULL. It is vhost_user_receive(). +And it's better to have a breakpoint to see why a packet was queued for +vhost-user. The stack trace may also help in this case. +OK, i'm trying to reproduce it. + +Thanks, +zhanghailiang +i will try to reproduce it by using newest qemu. + +Thanks, +zhanghailiang +. + diff --git a/classification_output/03/semantic/96782458 b/classification_output/03/semantic/96782458 new file mode 100644 index 00000000..7b450a0a --- /dev/null +++ b/classification_output/03/semantic/96782458 @@ -0,0 +1,1002 @@ +semantic: 0.984 +other: 0.982 +boot: 0.980 +instruction: 0.974 +network: 0.967 +KVM: 0.963 +mistranslation: 0.949 + +[Qemu-devel] [BUG] Migrate failes between boards with different PMC counts + +Hi all, + +Recently, I found migration failed when enable vPMU. + +migrate vPMU state was introduced in linux-3.10 + qemu-1.7. + +As long as enable vPMU, qemu will save / load the +vmstate_msr_architectural_pmu(msr_global_ctrl) register during the migration. +But global_ctrl generated based on cpuid(0xA), the number of general-purpose +performance +monitoring counters(PMC) can vary according to Intel SDN. The number of PMC +presented +to vm, does not support configuration currently, it depend on host cpuid, and +enable all pmc +defaultly at KVM. It cause migration to fail between boards with different PMC +counts. + +The return value of cpuid (0xA) is different dur to cpu, according to Intel +SDNï¼18-10 Vol. 3B: + +Note: The number of general-purpose performance monitoring counters (i.e. N in +Figure 18-9) +can vary across processor generations within a processor family, across +processor families, or +could be different depending on the configuration chosen at boot time in the +BIOS regarding +Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom processors; N +=4 for processors +based on the Nehalem microarchitecture; for processors based on the Sandy Bridge +microarchitecture, N = 4 if Intel Hyper Threading Technology is active and N=8 +if not active). + +Also I found, N=8 if HT is not active based on the broadwellï¼, +such as CPU E7-8890 v4 @ 2.20GHz + +# ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda +/data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming +tcp::8888 +Completed 100 % +qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff +qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +kvm_put_msrs: +Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +Aborted + +So make number of pmc configurable to vm ? Any better idea ? + + +Regards, +-Zhuang Yanying + +* Zhuangyanying (address@hidden) wrote: +> +Hi all, +> +> +Recently, I found migration failed when enable vPMU. +> +> +migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> +As long as enable vPMU, qemu will save / load the +> +vmstate_msr_architectural_pmu(msr_global_ctrl) register during the migration. +> +But global_ctrl generated based on cpuid(0xA), the number of general-purpose +> +performance +> +monitoring counters(PMC) can vary according to Intel SDN. The number of PMC +> +presented +> +to vm, does not support configuration currently, it depend on host cpuid, and +> +enable all pmc +> +defaultly at KVM. It cause migration to fail between boards with different +> +PMC counts. +> +> +The return value of cpuid (0xA) is different dur to cpu, according to Intel +> +SDNï¼18-10 Vol. 3B: +> +> +Note: The number of general-purpose performance monitoring counters (i.e. N +> +in Figure 18-9) +> +can vary across processor generations within a processor family, across +> +processor families, or +> +could be different depending on the configuration chosen at boot time in the +> +BIOS regarding +> +Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom processors; +> +N =4 for processors +> +based on the Nehalem microarchitecture; for processors based on the Sandy +> +Bridge +> +microarchitecture, N = 4 if Intel Hyper Threading Technology is active and +> +N=8 if not active). +> +> +Also I found, N=8 if HT is not active based on the broadwellï¼, +> +such as CPU E7-8890 v4 @ 2.20GHz +> +> +# ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda +> +/data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming +> +tcp::8888 +> +Completed 100 % +> +qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff +> +qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +kvm_put_msrs: +> +Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +Aborted +> +> +So make number of pmc configurable to vm ? Any better idea ? +Coincidentally we hit a similar problem a few days ago with -cpu host - it +took me +quite a while to spot the difference between the machines was the source +had hyperthreading disabled. + +An option to set the number of counters makes sense to me; but I wonder +how many other options we need as well. Also, I'm not sure there's any +easy way for libvirt etc to figure out how many counters a host supports - it's +not in /proc/cpuinfo. + +Dave + +> +> +Regards, +> +-Zhuang Yanying +-- +Dr. David Alan Gilbert / address@hidden / Manchester, UK + +On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: +> +* Zhuangyanying (address@hidden) wrote: +> +> Hi all, +> +> +> +> Recently, I found migration failed when enable vPMU. +> +> +> +> migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> +> +> As long as enable vPMU, qemu will save / load the +> +> vmstate_msr_architectural_pmu(msr_global_ctrl) register during the +> +> migration. +> +> But global_ctrl generated based on cpuid(0xA), the number of +> +> general-purpose performance +> +> monitoring counters(PMC) can vary according to Intel SDN. The number of PMC +> +> presented +> +> to vm, does not support configuration currently, it depend on host cpuid, +> +> and enable all pmc +> +> defaultly at KVM. It cause migration to fail between boards with different +> +> PMC counts. +> +> +> +> The return value of cpuid (0xA) is different dur to cpu, according to Intel +> +> SDNï¼18-10 Vol. 3B: +> +> +> +> Note: The number of general-purpose performance monitoring counters (i.e. N +> +> in Figure 18-9) +> +> can vary across processor generations within a processor family, across +> +> processor families, or +> +> could be different depending on the configuration chosen at boot time in +> +> the BIOS regarding +> +> Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom +> +> processors; N =4 for processors +> +> based on the Nehalem microarchitecture; for processors based on the Sandy +> +> Bridge +> +> microarchitecture, N = 4 if Intel Hyper Threading Technology is active and +> +> N=8 if not active). +> +> +> +> Also I found, N=8 if HT is not active based on the broadwellï¼, +> +> such as CPU E7-8890 v4 @ 2.20GHz +> +> +> +> # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda +> +> /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming +> +> tcp::8888 +> +> Completed 100 % +> +> qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff +> +> qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +> kvm_put_msrs: +> +> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +> Aborted +> +> +> +> So make number of pmc configurable to vm ? Any better idea ? +> +> +Coincidentally we hit a similar problem a few days ago with -cpu host - it +> +took me +> +quite a while to spot the difference between the machines was the source +> +had hyperthreading disabled. +> +> +An option to set the number of counters makes sense to me; but I wonder +> +how many other options we need as well. Also, I'm not sure there's any +> +easy way for libvirt etc to figure out how many counters a host supports - +> +it's not in /proc/cpuinfo. +We actually try to avoid /proc/cpuinfo whereever possible. We do direct +CPUID asm instructions to identify features, and prefer to use +/sys/devices/system/cpu if that has suitable data + +Where do the PMC counts come from originally ? CPUID or something else ? + +Regards, +Daniel +-- +|: +https://berrange.com +-o- +https://www.flickr.com/photos/dberrange +:| +|: +https://libvirt.org +-o- +https://fstop138.berrange.com +:| +|: +https://entangle-photo.org +-o- +https://www.instagram.com/dberrange +:| + +* Daniel P. Berrange (address@hidden) wrote: +> +On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: +> +> * Zhuangyanying (address@hidden) wrote: +> +> > Hi all, +> +> > +> +> > Recently, I found migration failed when enable vPMU. +> +> > +> +> > migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> > +> +> > As long as enable vPMU, qemu will save / load the +> +> > vmstate_msr_architectural_pmu(msr_global_ctrl) register during the +> +> > migration. +> +> > But global_ctrl generated based on cpuid(0xA), the number of +> +> > general-purpose performance +> +> > monitoring counters(PMC) can vary according to Intel SDN. The number of +> +> > PMC presented +> +> > to vm, does not support configuration currently, it depend on host cpuid, +> +> > and enable all pmc +> +> > defaultly at KVM. It cause migration to fail between boards with +> +> > different PMC counts. +> +> > +> +> > The return value of cpuid (0xA) is different dur to cpu, according to +> +> > Intel SDNï¼18-10 Vol. 3B: +> +> > +> +> > Note: The number of general-purpose performance monitoring counters (i.e. +> +> > N in Figure 18-9) +> +> > can vary across processor generations within a processor family, across +> +> > processor families, or +> +> > could be different depending on the configuration chosen at boot time in +> +> > the BIOS regarding +> +> > Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom +> +> > processors; N =4 for processors +> +> > based on the Nehalem microarchitecture; for processors based on the Sandy +> +> > Bridge +> +> > microarchitecture, N = 4 if Intel Hyper Threading Technology is active +> +> > and N=8 if not active). +> +> > +> +> > Also I found, N=8 if HT is not active based on the broadwellï¼, +> +> > such as CPU E7-8890 v4 @ 2.20GHz +> +> > +> +> > # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda +> +> > /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true -incoming +> +> > tcp::8888 +> +> > Completed 100 % +> +> > qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff +> +> > qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +> > kvm_put_msrs: +> +> > Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +> > Aborted +> +> > +> +> > So make number of pmc configurable to vm ? Any better idea ? +> +> +> +> Coincidentally we hit a similar problem a few days ago with -cpu host - it +> +> took me +> +> quite a while to spot the difference between the machines was the source +> +> had hyperthreading disabled. +> +> +> +> An option to set the number of counters makes sense to me; but I wonder +> +> how many other options we need as well. Also, I'm not sure there's any +> +> easy way for libvirt etc to figure out how many counters a host supports - +> +> it's not in /proc/cpuinfo. +> +> +We actually try to avoid /proc/cpuinfo whereever possible. We do direct +> +CPUID asm instructions to identify features, and prefer to use +> +/sys/devices/system/cpu if that has suitable data +> +> +Where do the PMC counts come from originally ? CPUID or something else ? +Yes, they're bits 8..15 of CPUID leaf 0xa + +Dave + +> +Regards, +> +Daniel +> +-- +> +|: +https://berrange.com +-o- +https://www.flickr.com/photos/dberrange +:| +> +|: +https://libvirt.org +-o- +https://fstop138.berrange.com +:| +> +|: +https://entangle-photo.org +-o- +https://www.instagram.com/dberrange +:| +-- +Dr. David Alan Gilbert / address@hidden / Manchester, UK + +On Mon, Apr 24, 2017 at 11:27:16AM +0100, Dr. David Alan Gilbert wrote: +> +* Daniel P. Berrange (address@hidden) wrote: +> +> On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: +> +> > * Zhuangyanying (address@hidden) wrote: +> +> > > Hi all, +> +> > > +> +> > > Recently, I found migration failed when enable vPMU. +> +> > > +> +> > > migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> > > +> +> > > As long as enable vPMU, qemu will save / load the +> +> > > vmstate_msr_architectural_pmu(msr_global_ctrl) register during the +> +> > > migration. +> +> > > But global_ctrl generated based on cpuid(0xA), the number of +> +> > > general-purpose performance +> +> > > monitoring counters(PMC) can vary according to Intel SDN. The number of +> +> > > PMC presented +> +> > > to vm, does not support configuration currently, it depend on host +> +> > > cpuid, and enable all pmc +> +> > > defaultly at KVM. It cause migration to fail between boards with +> +> > > different PMC counts. +> +> > > +> +> > > The return value of cpuid (0xA) is different dur to cpu, according to +> +> > > Intel SDNï¼18-10 Vol. 3B: +> +> > > +> +> > > Note: The number of general-purpose performance monitoring counters +> +> > > (i.e. N in Figure 18-9) +> +> > > can vary across processor generations within a processor family, across +> +> > > processor families, or +> +> > > could be different depending on the configuration chosen at boot time +> +> > > in the BIOS regarding +> +> > > Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom +> +> > > processors; N =4 for processors +> +> > > based on the Nehalem microarchitecture; for processors based on the +> +> > > Sandy Bridge +> +> > > microarchitecture, N = 4 if Intel Hyper Threading Technology is active +> +> > > and N=8 if not active). +> +> > > +> +> > > Also I found, N=8 if HT is not active based on the broadwellï¼, +> +> > > such as CPU E7-8890 v4 @ 2.20GHz +> +> > > +> +> > > # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -hda +> +> > > /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true +> +> > > -incoming tcp::8888 +> +> > > Completed 100 % +> +> > > qemu-system-x86_64: error: failed to set MSR 0x38f to 0x7000000ff +> +> > > qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +> > > kvm_put_msrs: +> +> > > Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +> > > Aborted +> +> > > +> +> > > So make number of pmc configurable to vm ? Any better idea ? +> +> > +> +> > Coincidentally we hit a similar problem a few days ago with -cpu host - +> +> > it took me +> +> > quite a while to spot the difference between the machines was the source +> +> > had hyperthreading disabled. +> +> > +> +> > An option to set the number of counters makes sense to me; but I wonder +> +> > how many other options we need as well. Also, I'm not sure there's any +> +> > easy way for libvirt etc to figure out how many counters a host supports - +> +> > it's not in /proc/cpuinfo. +> +> +> +> We actually try to avoid /proc/cpuinfo whereever possible. We do direct +> +> CPUID asm instructions to identify features, and prefer to use +> +> /sys/devices/system/cpu if that has suitable data +> +> +> +> Where do the PMC counts come from originally ? CPUID or something else ? +> +> +Yes, they're bits 8..15 of CPUID leaf 0xa +Ok, that's easy enough for libvirt to detect then. More a question of what +libvirt should then do this with the info.... + +Regards, +Daniel +-- +|: +https://berrange.com +-o- +https://www.flickr.com/photos/dberrange +:| +|: +https://libvirt.org +-o- +https://fstop138.berrange.com +:| +|: +https://entangle-photo.org +-o- +https://www.instagram.com/dberrange +:| + +> +-----Original Message----- +> +From: Daniel P. Berrange [ +mailto:address@hidden +> +Sent: Monday, April 24, 2017 6:34 PM +> +To: Dr. David Alan Gilbert +> +Cc: Zhuangyanying; Zhanghailiang; wangxin (U); address@hidden; +> +Gonglei (Arei); Huangzhichao; address@hidden +> +Subject: Re: [Qemu-devel] [BUG] Migrate failes between boards with different +> +PMC counts +> +> +On Mon, Apr 24, 2017 at 11:27:16AM +0100, Dr. David Alan Gilbert wrote: +> +> * Daniel P. Berrange (address@hidden) wrote: +> +> > On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: +> +> > > * Zhuangyanying (address@hidden) wrote: +> +> > > > Hi all, +> +> > > > +> +> > > > Recently, I found migration failed when enable vPMU. +> +> > > > +> +> > > > migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> > > > +> +> > > > As long as enable vPMU, qemu will save / load the +> +> > > > vmstate_msr_architectural_pmu(msr_global_ctrl) register during the +> +migration. +> +> > > > But global_ctrl generated based on cpuid(0xA), the number of +> +> > > > general-purpose performance monitoring counters(PMC) can vary +> +> > > > according to Intel SDN. The number of PMC presented to vm, does +> +> > > > not support configuration currently, it depend on host cpuid, and +> +> > > > enable +> +all pmc defaultly at KVM. It cause migration to fail between boards with +> +different PMC counts. +> +> > > > +> +> > > > The return value of cpuid (0xA) is different dur to cpu, according to +> +> > > > Intel +> +SDNï¼18-10 Vol. 3B: +> +> > > > +> +> > > > Note: The number of general-purpose performance monitoring +> +> > > > counters (i.e. N in Figure 18-9) can vary across processor +> +> > > > generations within a processor family, across processor +> +> > > > families, or could be different depending on the configuration +> +> > > > chosen at boot time in the BIOS regarding Intel Hyper Threading +> +> > > > Technology, (e.g. N=2 for 45 nm Intel Atom processors; N =4 for +> +processors based on the Nehalem microarchitecture; for processors based on +> +the Sandy Bridge microarchitecture, N = 4 if Intel Hyper Threading Technology +> +is active and N=8 if not active). +> +> > > > +> +> > > > Also I found, N=8 if HT is not active based on the broadwellï¼, +> +> > > > such as CPU E7-8890 v4 @ 2.20GHz +> +> > > > +> +> > > > # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m +> +> > > > 4096 -hda +> +> > > > /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true +> +> > > > -incoming tcp::8888 Completed 100 % +> +> > > > qemu-system-x86_64: error: failed to set MSR 0x38f to +> +> > > > 0x7000000ff +> +> > > > qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +kvm_put_msrs: +> +> > > > Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +> > > > Aborted +> +> > > > +> +> > > > So make number of pmc configurable to vm ? Any better idea ? +> +> > > +> +> > > Coincidentally we hit a similar problem a few days ago with -cpu +> +> > > host - it took me quite a while to spot the difference between +> +> > > the machines was the source had hyperthreading disabled. +> +> > > +> +> > > An option to set the number of counters makes sense to me; but I +> +> > > wonder how many other options we need as well. Also, I'm not sure +> +> > > there's any easy way for libvirt etc to figure out how many +> +> > > counters a host supports - it's not in /proc/cpuinfo. +> +> > +> +> > We actually try to avoid /proc/cpuinfo whereever possible. We do +> +> > direct CPUID asm instructions to identify features, and prefer to +> +> > use /sys/devices/system/cpu if that has suitable data +> +> > +> +> > Where do the PMC counts come from originally ? CPUID or something +> +else ? +> +> +> +> Yes, they're bits 8..15 of CPUID leaf 0xa +> +> +Ok, that's easy enough for libvirt to detect then. More a question of what +> +libvirt +> +should then do this with the info.... +> +Do you mean to do a validation at the begining of migration? in +qemuMigrationBakeCookie() & qemuMigrationEatCookie(), if the PMC numbers are +not equal, just quit migration? +It maybe a good enough first edition. +But for a further better edition, maybe it's better to support Heterogeneous +migration I think, so we might need to make PMC number configrable, then we +need to modify KVM/qemu as well. + +Regards, +-Zhuang Yanying + +* Zhuangyanying (address@hidden) wrote: +> +> +> +> -----Original Message----- +> +> From: Daniel P. Berrange [ +mailto:address@hidden +> +> Sent: Monday, April 24, 2017 6:34 PM +> +> To: Dr. David Alan Gilbert +> +> Cc: Zhuangyanying; Zhanghailiang; wangxin (U); address@hidden; +> +> Gonglei (Arei); Huangzhichao; address@hidden +> +> Subject: Re: [Qemu-devel] [BUG] Migrate failes between boards with different +> +> PMC counts +> +> +> +> On Mon, Apr 24, 2017 at 11:27:16AM +0100, Dr. David Alan Gilbert wrote: +> +> > * Daniel P. Berrange (address@hidden) wrote: +> +> > > On Mon, Apr 24, 2017 at 10:23:21AM +0100, Dr. David Alan Gilbert wrote: +> +> > > > * Zhuangyanying (address@hidden) wrote: +> +> > > > > Hi all, +> +> > > > > +> +> > > > > Recently, I found migration failed when enable vPMU. +> +> > > > > +> +> > > > > migrate vPMU state was introduced in linux-3.10 + qemu-1.7. +> +> > > > > +> +> > > > > As long as enable vPMU, qemu will save / load the +> +> > > > > vmstate_msr_architectural_pmu(msr_global_ctrl) register during the +> +> migration. +> +> > > > > But global_ctrl generated based on cpuid(0xA), the number of +> +> > > > > general-purpose performance monitoring counters(PMC) can vary +> +> > > > > according to Intel SDN. The number of PMC presented to vm, does +> +> > > > > not support configuration currently, it depend on host cpuid, and +> +> > > > > enable +> +> all pmc defaultly at KVM. It cause migration to fail between boards with +> +> different PMC counts. +> +> > > > > +> +> > > > > The return value of cpuid (0xA) is different dur to cpu, according +> +> > > > > to Intel +> +> SDNï¼18-10 Vol. 3B: +> +> > > > > +> +> > > > > Note: The number of general-purpose performance monitoring +> +> > > > > counters (i.e. N in Figure 18-9) can vary across processor +> +> > > > > generations within a processor family, across processor +> +> > > > > families, or could be different depending on the configuration +> +> > > > > chosen at boot time in the BIOS regarding Intel Hyper Threading +> +> > > > > Technology, (e.g. N=2 for 45 nm Intel Atom processors; N =4 for +> +> processors based on the Nehalem microarchitecture; for processors based on +> +> the Sandy Bridge microarchitecture, N = 4 if Intel Hyper Threading +> +> Technology +> +> is active and N=8 if not active). +> +> > > > > +> +> > > > > Also I found, N=8 if HT is not active based on the broadwellï¼, +> +> > > > > such as CPU E7-8890 v4 @ 2.20GHz +> +> > > > > +> +> > > > > # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m +> +> > > > > 4096 -hda +> +> > > > > /data/zyy/test_qemu.img.sles12sp1 -vnc :99 -cpu kvm64,pmu=true +> +> > > > > -incoming tcp::8888 Completed 100 % +> +> > > > > qemu-system-x86_64: error: failed to set MSR 0x38f to +> +> > > > > 0x7000000ff +> +> > > > > qemu-system-x86_64: /data/zyy/git/test/qemu/target/i386/kvm.c:1833: +> +> kvm_put_msrs: +> +> > > > > Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. +> +> > > > > Aborted +> +> > > > > +> +> > > > > So make number of pmc configurable to vm ? Any better idea ? +> +> > > > +> +> > > > Coincidentally we hit a similar problem a few days ago with -cpu +> +> > > > host - it took me quite a while to spot the difference between +> +> > > > the machines was the source had hyperthreading disabled. +> +> > > > +> +> > > > An option to set the number of counters makes sense to me; but I +> +> > > > wonder how many other options we need as well. Also, I'm not sure +> +> > > > there's any easy way for libvirt etc to figure out how many +> +> > > > counters a host supports - it's not in /proc/cpuinfo. +> +> > > +> +> > > We actually try to avoid /proc/cpuinfo whereever possible. We do +> +> > > direct CPUID asm instructions to identify features, and prefer to +> +> > > use /sys/devices/system/cpu if that has suitable data +> +> > > +> +> > > Where do the PMC counts come from originally ? CPUID or something +> +> else ? +> +> > +> +> > Yes, they're bits 8..15 of CPUID leaf 0xa +> +> +> +> Ok, that's easy enough for libvirt to detect then. More a question of what +> +> libvirt +> +> should then do this with the info.... +> +> +> +> +Do you mean to do a validation at the begining of migration? in +> +qemuMigrationBakeCookie() & qemuMigrationEatCookie(), if the PMC numbers are +> +not equal, just quit migration? +> +It maybe a good enough first edition. +> +But for a further better edition, maybe it's better to support Heterogeneous +> +migration I think, so we might need to make PMC number configrable, then we +> +need to modify KVM/qemu as well. +Yes agreed; the only thing I wanted to check was that libvirt would have enough +information to be able to use any feature we added to QEMU. + +Dave + +> +Regards, +> +-Zhuang Yanying +-- +Dr. David Alan Gilbert / address@hidden / Manchester, UK + diff --git a/classification_output/03/semantic/gitlab_semantic_addsubps b/classification_output/03/semantic/gitlab_semantic_addsubps new file mode 100644 index 00000000..4866461e --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_addsubps @@ -0,0 +1,31 @@ +semantic: 0.974 +instruction: 0.931 +other: 0.732 +boot: 0.465 +network: 0.393 +mistranslation: 0.299 +KVM: 0.192 + +x86 SSE/SSE2/SSE3 instruction semantic bugs with NaN + +Description of problem +The result of SSE/SSE2/SSE3 instructions with NaN is different from the CPU. From Intel manual Volume 1 Appendix D.4.2.2, they defined the behavior of such instructions with NaN. But I think QEMU did not implement this semantic exactly because the byte result is different. + +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x000000007fffffff; push rax; mov rax, 0x00000000ffffffff; push rax; movdqu XMM1, [rsp];"); + asm("mov rax, 0x2e711de7aa46af1a; push rax; mov rax, 0x7fffffff7fffffff; push rax; movdqu XMM2, [rsp];"); + asm("addsubps xmm1, xmm2"); +} + +Execute and compare the result with the CPU. This problem happens with other SSE/SSE2/SSE3 instructions specified in the manual, Volume 1 Appendix D.4.2.2. + +CPU xmm1[3] = 0xffffffff + +QEMU xmm1[3] = 0x7fffffff + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/classification_output/03/semantic/gitlab_semantic_adox b/classification_output/03/semantic/gitlab_semantic_adox new file mode 100644 index 00000000..30fb645a --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_adox @@ -0,0 +1,44 @@ +semantic: 0.990 +instruction: 0.944 +boot: 0.599 +mistranslation: 0.452 +network: 0.426 +other: 0.286 +KVM: 0.240 + +x86 ADOX and ADCX semantic bug +Description of problem +The result of instruction ADOX and ADCX are different from the CPU. The value of one of EFLAGS is different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("push 512; popfq;"); + asm("mov rax, 0xffffffff84fdbf24"); + asm("mov rbx, 0xb197d26043bec15d"); + asm("adox eax, ebx"); +} + + + +Execute and compare the result with the CPU. This problem happens with ADCX, too (with CF). + +CPU + +OF = 0 + + +QEMU + +OF = 1 + + + + + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/classification_output/03/semantic/gitlab_semantic_bextr b/classification_output/03/semantic/gitlab_semantic_bextr new file mode 100644 index 00000000..ba7f4513 --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_bextr @@ -0,0 +1,33 @@ +semantic: 0.993 +instruction: 0.944 +boot: 0.516 +mistranslation: 0.337 +network: 0.219 +other: 0.099 +KVM: 0.091 + +x86 BEXTR semantic bug +Description of problem +The result of instruction BEXTR is different with from the CPU. The value of destination register is different. I think QEMU does not consider the operand size limit. + +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x17b3693f77fb6e9"); + asm("mov rbx, 0x8f635a775ad3b9b4"); + asm("mov rcx, 0xb717b75da9983018"); + asm("bextr eax, ebx, ecx"); +} + +Execute and compare the result with the CPU. + +CPU +RAX = 0x5a + +QEMU +RAX = 0x635a775a + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/classification_output/03/semantic/gitlab_semantic_blsi b/classification_output/03/semantic/gitlab_semantic_blsi new file mode 100644 index 00000000..38f775a0 --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_blsi @@ -0,0 +1,28 @@ +semantic: 0.983 +instruction: 0.964 +boot: 0.678 +network: 0.672 +other: 0.609 +mistranslation: 0.606 +KVM: 0.412 + +x86 BLSI and BLSR semantic bug +Description of problem +The result of instruction BLSI and BLSR is different from the CPU. The value of CF is different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("blsi rax, rbx"); +} + + + +Execute and compare the result with the CPU. The value of CF is exactly the opposite. This problem happens with BLSR, too. + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/classification_output/03/semantic/gitlab_semantic_blsmsk b/classification_output/03/semantic/gitlab_semantic_blsmsk new file mode 100644 index 00000000..52dd92e0 --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_blsmsk @@ -0,0 +1,35 @@ +semantic: 0.987 +instruction: 0.962 +mistranslation: 0.603 +boot: 0.585 +network: 0.366 +other: 0.269 +KVM: 0.163 + +x86 BLSMSK semantic bug +Description of problem +The result of instruction BLSMSK is different with from the CPU. The value of CF is different. + +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x65b2e276ad27c67"); + asm("mov rbx, 0x62f34955226b2b5d"); + asm("blsmsk eax, ebx"); +} + +Execute and compare the result with the CPU. + +CPU + +CF = 0 + + +QEMU + +CF = 1 + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/classification_output/03/semantic/gitlab_semantic_bzhi b/classification_output/03/semantic/gitlab_semantic_bzhi new file mode 100644 index 00000000..8cbf1d76 --- /dev/null +++ b/classification_output/03/semantic/gitlab_semantic_bzhi @@ -0,0 +1,46 @@ +semantic: 0.920 +instruction: 0.623 +boot: 0.220 +network: 0.203 +mistranslation: 0.171 +other: 0.064 +KVM: 0.064 + +x86 BZHI semantic bug +Description of problem +The result of instruction BZHI is different from the CPU. The value of destination register and SF of EFLAGS are different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("mov rax, 0xb1aa9da2fe33fe3"); + asm("mov rbx, 0x80000000ffffffff"); + asm("mov rcx, 0xf3fce8829b99a5c6"); + asm("bzhi rax, rbx, rcx"); +} + + + +Execute and compare the result with the CPU. + +CPU + +RAX = 0x0x80000000ffffffff +SF = 1 + + +QEMU + +RAX = 0xffffffff +SF = 0 + + + + + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. |