Diffstat (limited to 'results/classifier/zero-shot/105/semantic')
177 files changed, 20886 insertions, 0 deletions
diff --git a/results/classifier/zero-shot/105/semantic/1013714 b/results/classifier/zero-shot/105/semantic/1013714 new file mode 100644 index 000000000..edc451ca0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1013714 @@ -0,0 +1,66 @@ +semantic: 0.859 +other: 0.828 +graphic: 0.817 +device: 0.801 +instruction: 0.799 +vnc: 0.793 +KVM: 0.790 +assembly: 0.765 +network: 0.748 +boot: 0.741 +mistranslation: 0.658 +socket: 0.538 + +Data corruption after block migration (LV->LV) + +We quite frequently use the live block migration feature to move VM's between nodes without downtime. These sometimes result in data corruption on the receiving end. It only happens if the VM is actually doing I/O (doesn't have to be all that much to trigger the issue). + +We use logical volumes and each VM has two disks. We use cache=none for all VM disks. + +All guests use virtio (a mix of various Linux distro's and Windows 2008R2). + +We currently have two stacks in use and have seen the issue on both of them: + +Fedora - qemu-kvm 0.13 +Scientific Linux 6.2 (RHEL derived) - qemu-kvm package 0.12.1.2 + +Even though we don't run the most recent versions of KVM I highly suspect this issue is still unreported and that filing a bug is therefore appropriate. (There doesn't seem to be any similar bug report in launchpad or RedHat's bugzilla and nothing related in change logs, release notes and git commit logs.) + +I have no idea where to look or where to start debugging this issue, but if there is any way I can provide useful debug information please let me know. + +Hi, I suggest that you try a newer version. There were several fixes that I think went only in 0.14, in particular commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e, commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e, commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e. RHEL6.2 doesn't have them. With the fixes, it's quite less likely that live block migration will eat your data. 
+ +However, we were also thinking of deprecating block migration, so we are interesting of hearing about your setup. The replacement would be more powerful (it would allow migrating storage separately from the VM), more efficient (storage and RAM streams would run in parallel on different TCP ports), and easier for us to test and maintain. + +However, it would be more complicated to set the new mechanism up for migration without shared storage. This is what live block migration does, and it sounds like your usecase requires migration without shared storage. Likely, a true replacement of live block migration would not be ready in time for the next release (1.2), hence its removal would also be delayed. + +Hello Paolo, + +Thank you for your quick response! + +Did you intend to mention 3 different commits or did you accidentally paste the same commit thrice? ;) I came across that commit but somehow thought it was already included in 0.13. Thanks! + +We're of course in no position to ask, but I'll do it anyway: Would you be in a position to add patches for these commits to the qemu-kvm package for RHEL6 (assuming they apply at all)? Or perhaps ask one of the RH package maintainers to do so? We'd be very grateful! + +A little bit of background (our use case for using live block migration): We are an ISP and provide virtual private servers on KVM. + +The way we see it traditional centralized shared storage introduces one big, expensive and complicated SPOF into a VM platform. + +We actually have no problems dealing with the limitations of local storage. For example, we have automated (offline) VM migrations to other hosts when customers need to upgrade and the current host doesn't have enough resources. It would be great if live block migration would be stable enough to do this online to reduce downtime for customers. + +We sometimes use live block migration to reduce the server load by migrating off a busy VM. 
It doesn't really matter if the migration takes a while to complete. We also use it to migrate all VM's off a host in case the hardware is being retired or we need to reinstall. + +Live block migration is just not very useful for generic system maintenance, like a reboot for a kernel or firmware update. In that case we simply reboot the host (and most customers don't mind that once in a while). + +We would appreciate it if live block migration would not be removed until its superior replacement is ready. We don't mind if it's more complex to work with, as long as it's well documented ;) + +Hello Paolo, + +We backported most of the block migration fixes from upstream to the RHEL6 qemu-kvm package ourselves and are now unable to reproduce the disk corruption problem anymore. So I guess the issues are (mostly) fixed upstream. + +You can close this bug, but I would still appreciate it if you could fix this in the RHEL6 package (other RH customers might appreciate that as well ;). We could even provide the patches if you like. 
+ + + +Closing according to comment #3 + diff --git a/results/classifier/zero-shot/105/semantic/1013888 b/results/classifier/zero-shot/105/semantic/1013888 new file mode 100644 index 000000000..6d46b9a0d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1013888 @@ -0,0 +1,117 @@ +semantic: 0.919 +instruction: 0.910 +other: 0.909 +assembly: 0.904 +graphic: 0.904 +device: 0.892 +socket: 0.867 +mistranslation: 0.847 +vnc: 0.846 +network: 0.826 +boot: 0.826 +KVM: 0.816 + +windows xp sp3 setup blank screen on boot + +When attempting to run Windows XP SP3 setup in qemu on a Lubuntu host with the following kernel: + +Linux michael-XPS-M1530 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux + +Qemu does not get past a blank screen after "Setup is inspecting your computer's hardware configuration" + +Qemu 1.0.1 - Doesn't have a problem +Qemu 1.1.0 - has the problem +Qemu master commit eb2aeacf983a2a88a2b31e8fee067c38bd10abd3 - has the problem + +qemu-system-x86_64 -L ../path/to/bios -cdrom winxp.iso -hda winxp.img -boot d + +where ../path/to/bios is the location of the pc-bios files from that version of qemu + +hi, +same problem on centos 6.2 with vanilla kernel 3.4.2. +I compiled qemu 1.0.1 from source and qemu 1.1.0 from source. 
+ +/opt/qemu-1.0.1/bin/qemu-system-i386 -m 2048 -cdrom Win_XP_Pro_SP3.iso -hda test.winXP.qcow2 : works + +/opt/qemu-1.1.0/bin/qemu-system-i386 -m 2048 -cdrom Win_XP_Pro_SP3.iso -hda test.winXP.qcow2 : hangs + +/opt/qemu-1.1.0/bin/qemu-system-i386 -m 2048 -cdrom Win_XP_Pro_SP3.iso -hda test.winXP.qcow2 -L /opt/qemu-1.0.1/data/ : hangs and on stderr give: Could not open option rom 'kvmvapic.bin': No such file or directory + +/opt/qemu-1.1.0/bin/qemu-system-i386 -m 2048 -cdrom Win_XP_Pro_SP3.iso -hda test.winXP.qcow2 -L /opt/qemu-1.0.1/data/ -cpu qemu32,-apic : hangs + + +regards +Luigi + +On Fri, Jun 15, 2012 at 11:49:36PM -0000, Michael Sabino wrote: +> Qemu 1.0.1 - Doesn't have a problem +> Qemu 1.1.0 - has the problem +> Qemu master commit eb2aeacf983a2a88a2b31e8fee067c38bd10abd3 - has the problem + +I was also able to reproduce with commit: + +eb2aeacf983a2a88a2b31e8fee067c38bd10abd3 + +The problem appears to have been fixed upstream though. A reverse bisect +points to this patch being the fix: + +commit c52acf60b6c12ff5eb58eb6ac568c159ae0c8737 +Author: Pavel Hrdina <email address hidden> +Date: Wed Jun 13 15:43:11 2012 +0200 + + fdc: fix implied seek while there is no media in drive + + The Windows uses 'READ' command at the start of an instalation + without checking the 'dir' register. We have to abort the transfer + with an abnormal termination if there is no media in the drive. + + Signed-off-by: Pavel Hrdina <email address hidden> + Signed-off-by: Kevin Wolf <email address hidden> + +Please try your scenario again using that commit, and if all if it does the +trick we'll get it included in the next stable-1.1 release. + +> +> -- +> You received this bug notification because you are a member of qemu- +> devel-ml, which is subscribed to QEMU. 
+> https://bugs.launchpad.net/bugs/1013888 +> +> Title: +> windows xp sp3 setup blank screen on boot +> +> Status in QEMU: +> New +> +> Bug description: +> When attempting to run Windows XP SP3 setup in qemu on a Lubuntu host +> with the following kernel: +> +> Linux michael-XPS-M1530 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 +> 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux +> +> Qemu does not get past a blank screen after "Setup is inspecting your +> computer's hardware configuration" +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1013888/+subscriptions +> + + +I confirm it works. +just compiled from commit c52acf60b6c12ff5eb58eb6ac568c159ae0c8737. +Windows XP SP3 installation iso boot and start installation process. + +I tested both i368-softmmu and x86_64-softmmu targets. + +thanks +Luigi + +The bug also applies to Debian Qemu 1.1.0 + +Adding the changes of commit c52acf60b6c12ff5eb58eb6ac568c159ae0c8737 on top of the 1.1.0 Debian package fixes the issue. + +Which debian package do you mean? The fix is included is current debian qemu-kvm 1.1.0+dfsg-3 release. qemu package in debian does not have this fix however. 
+ +Marking this bug as fixed, according to comment #4 and #5 + diff --git a/results/classifier/zero-shot/105/semantic/1034 b/results/classifier/zero-shot/105/semantic/1034 new file mode 100644 index 000000000..699036a6c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1034 @@ -0,0 +1,30 @@ +semantic: 0.805 +instruction: 0.793 +graphic: 0.658 +device: 0.564 +other: 0.491 +mistranslation: 0.410 +network: 0.323 +boot: 0.322 +socket: 0.317 +vnc: 0.235 +assembly: 0.214 +KVM: 0.157 + +Erlang/OTP 25 JIT on AArch64 fails in user mode emulation +Description of problem: +Compiling Erlang/OTP 25 fails with segfault when using user mode (but works in system mode), the Erlang devs have tracked it down in [ErlangForums](https://erlangforums.com/t/otp-25-0-rc3-release-candidate-3-is-released/1317/24) and give this explanation: + +> Thanks, I’ve found a bug in QEMU that explains this. The gist of it is: +> +> Instead of interpreting guest code, QEMU dynamically translates it to the host architecture. When the guest overwrites code for one reason or another, the translation is invalidated and redone if needed. +> +> Our JIT:ed code is mapped in two regions to work in the face of W^X restrictions: one executable but not writable, and one writable but not executable. Both of these regions point to the same physical memory and writes to the writable region are “magically” reflected in the executable one. +> +> I would’ve expected QEMU to honor the IC IVAU / ISB instructions we use to tell the processor that we’ve altered code at a particular address, but for some reason QEMU just ignores them 3 and relies entirely on trapping writes to previously translated code. +> +> In system mode QEMU emulates the MMU and sees that these two regions point at the same memory, and has no problem invalidating the executable region after writing to the writable region. 
+> +> In user mode it instead calls mprotect(..., PROT_READ) on all code regions it has translated, and invalidates translations in the signal handler. The problem is that we never write to the executable region – just the writable one – so the code doesn’t get invalidated. + +There doesn't seem to a open or closed QEMU bug that actually describes this problem. diff --git a/results/classifier/zero-shot/105/semantic/1034423 b/results/classifier/zero-shot/105/semantic/1034423 new file mode 100644 index 000000000..a44d8629a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1034423 @@ -0,0 +1,367 @@ +semantic: 0.905 +graphic: 0.902 +mistranslation: 0.902 +assembly: 0.897 +other: 0.893 +network: 0.885 +device: 0.873 +KVM: 0.845 +vnc: 0.825 +socket: 0.822 +instruction: 0.816 +boot: 0.816 + +Guests running OpenIndiana (and relatives) fail to boot on AMD hardware + +First observed with OpenSolaris 2009.06, and also applies to the latest OpenIndiana release. + +Version: qemu-kvm 1.1.1 + +Hardware: + +2 x AMD Opteron 6128 8-core processors, 64GB RAM. + +These guests boot on equivalent Intel hardware. + +To reproduce: + +qemu-kvm -nodefaults -m 512 -cpu host -vga cirrus -usbdevice tablet -vnc :99 -monitor stdio -hda drive.img -cdrom oi-dev-151a5-live-x86.iso -boot order=dc + +I've tested with "-vga std" and various different emulated CPU types, to no effect. + +What happens: + +GRUB loads, and offers multiple boot options, but none work. 
Some kind of kernel panic flies by very fast before restarting the VM, and careful use of the screenshot button reveals that it reads as follows: + +panic[cpu0]/thread=fec22de0: BAD TRAP: type=8 (#df Double fault) rp=fec2b48c add r=0 + +#df Double fault +pid=0, pc=0xault +pid=0, pc=0xfe800377, sp=0xfec40090, eflags=0x202 +cr0: 80050011<pg,wp,et,pe> cr4:b8<pge,pae,pse,de> +cr2: 0cr3: ae2f000 + gs: 1b0 fs: 0 es: 160 ds: 160 + edi: 0 esi: 0 ebp: 0 esp: fec2b4c4 + ebx: c0010015 edx: 0 ecx: 0 eax: fec40400 + trp: 8 err: 0 eip: fe800377 cs: 158 + efl: 202 usp: fec40090 ss: 160 +tss.tss_link: 0x0 +tss.tss_esp0: 0x0 +tss.tss_ss0: 0x160 +tss.tss_esp1: 0x0 +tss.tss_ss1: 0x0 +tss.tss esp2: 0x0 +tss.tss_ss2: 0x0 +tss.tss_cr3: 0xae2f000 +tss.tss_eip: 0xfec40400 +tss.tss_eflags: 0x202 +tss.tss_eax: 0xfec40400 +tss.tss_ebx: 0xc0010015 +tss.tss_ecx: 0xc0010000 +tss.tss_edx: 0x0 +tss.tss_esp: 0xfec40090 + +Warning - stack not written to the dumpbuf +fec2b3c8 unix:due+e4 (8, fec2b48c, 0, 0) +fec2b478 unix:trap+12fa (fec2b48c, 0, 0) +fec2b48c unix:_cmntrap+7c (1b0, 0, 160, 160, 0) + +If there's any more, I haven't managed to catch it. + +Solaris 11 does not seem to suffer from the same issue, although the first message that appears at boot (after the version info) is "trap: Unkown trap type 8 in user mode". Could be related? + +As always, thanks in advance and please let me know if I can help to test, or provide any more information. + +Triaging old bug tickets ... can you still reproduce this issue with the latest version of QEMU (currently v2.9)? + +This is an old ticket! I had completely forgotten about it, but will test +when I get a chance and let you know. + +Cheers, + +Owen + +On Fri, May 19, 2017 at 11:25 AM, Thomas Huth <email address hidden> +wrote: + +> Triaging old bug tickets ... can you still reproduce this issue with the +> latest version of QEMU (currently v2.9)? 
+> +> ** Changed in: qemu +> Status: New => Incomplete +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1034423 +> +> Title: +> Guests running OpenIndiana (and relatives) fail to boot on AMD +> hardware +> +> Status in QEMU: +> Incomplete +> +> Bug description: +> First observed with OpenSolaris 2009.06, and also applies to the +> latest OpenIndiana release. +> +> Version: qemu-kvm 1.1.1 +> +> Hardware: +> +> 2 x AMD Opteron 6128 8-core processors, 64GB RAM. +> +> These guests boot on equivalent Intel hardware. +> +> To reproduce: +> +> qemu-kvm -nodefaults -m 512 -cpu host -vga cirrus -usbdevice tablet +> -vnc :99 -monitor stdio -hda drive.img -cdrom oi-dev- +> 151a5-live-x86.iso -boot order=dc +> +> I've tested with "-vga std" and various different emulated CPU types, +> to no effect. +> +> What happens: +> +> GRUB loads, and offers multiple boot options, but none work. Some kind +> of kernel panic flies by very fast before restarting the VM, and +> careful use of the screenshot button reveals that it reads as follows: +> +> panic[cpu0]/thread=fec22de0: BAD TRAP: type=8 (#df Double fault) +> rp=fec2b48c add r=0 +> +> #df Double fault +> pid=0, pc=0xault +> pid=0, pc=0xfe800377, sp=0xfec40090, eflags=0x202 +> cr0: 80050011<pg,wp,et,pe> cr4:b8<pge,pae,pse,de> +> cr2: 0cr3: ae2f000 +> gs: 1b0 fs: 0 es: +> 160 ds: 160 +> edi: 0 esi: 0 ebp: +> 0 esp: fec2b4c4 +> ebx: c0010015 edx: 0 ecx: 0 eax: +> fec40400 +> trp: 8 err: 0 eip: fe800377 +> cs: 158 +> efl: 202 usp: fec40090 ss: 160 +> tss.tss_link: 0x0 +> tss.tss_esp0: 0x0 +> tss.tss_ss0: 0x160 +> tss.tss_esp1: 0x0 +> tss.tss_ss1: 0x0 +> tss.tss esp2: 0x0 +> tss.tss_ss2: 0x0 +> tss.tss_cr3: 0xae2f000 +> tss.tss_eip: 0xfec40400 +> tss.tss_eflags: 0x202 +> tss.tss_eax: 0xfec40400 +> tss.tss_ebx: 0xc0010015 +> tss.tss_ecx: 0xc0010000 +> tss.tss_edx: 0x0 +> tss.tss_esp: 0xfec40090 +> +> Warning - stack not written to the dumpbuf +> fec2b3c8 
unix:due+e4 (8, fec2b48c, 0, 0) +> fec2b478 unix:trap+12fa (fec2b48c, 0, 0) +> fec2b48c unix:_cmntrap+7c (1b0, 0, 160, 160, 0) +> +> If there's any more, I haven't managed to catch it. +> +> Solaris 11 does not seem to suffer from the same issue, although the +> first message that appears at boot (after the version info) is "trap: +> Unkown trap type 8 in user mode". Could be related? +> +> As always, thanks in advance and please let me know if I can help to +> test, or provide any more information. +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1034423/+subscriptions +> + + +[Expired for QEMU because there has been no activity for 60 days.] + +Despite the age of the report, I am also reproducing the issue. + +I am using Qemu 6.2.0 with KVM on Linux kernel 6.0.5 under Linux Mint 21. + +The guest is OpenIndiana Hipster 2021.10. + +A guest console capture is attached. + +The guest is managed using libvirt 8.0.0 + +The dump of the libvirt domain configuration is as follows: + +<domain type='kvm' id='62'> + <name>openindiana</name> + <uuid>7a7adcc0-889c-4daf-a73b-21a3fac3d8e7</uuid> + <metadata> + <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0"> + <libosinfo:os id="http://libosinfo.org/linux/2020"/> + </libosinfo:libosinfo> + </metadata> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static'>4</vcpu> + <resource> + <partition>/machine</partition> + </resource> + <os> + <type arch='x86_64' machine='pc-i440fx-jammy'>hvm</type> + <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.fd</loader> + <nvram template='/usr/share/OVMF/OVMF_VARS_4M.fd'>/var/lib/libvirt/qemu/nvram/openindiana_VARS.fd</nvram> + </os> + <features> + <acpi/> + <apic/> + <vmport state='off'/> + </features> + <cpu mode='host-passthrough' check='none' migratable='on'/> + <clock offset='utc'> + <timer name='rtc' tickpolicy='catchup'/> + <timer 
name='pit' tickpolicy='delay'/> + <timer name='hpet' present='no'/> + </clock> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <pm> + <suspend-to-mem enabled='no'/> + <suspend-to-disk enabled='no'/> + </pm> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/srv/vhd/openindiana.qcow2' index='2'/> + <backingStore/> + <target dev='sda' bus='sata'/> + <alias name='sata0-0-0'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='qemu' type='raw'/> + <source file='/srv/img/OI-hipster-minimal-20211031.iso' index='1'/> + <backingStore/> + <target dev='sdb' bus='sata'/> + <readonly/> + <boot order='1'/> + <alias name='sata0-0-1'/> + <address type='drive' controller='0' bus='0' target='0' unit='1'/> + </disk> + <controller type='usb' index='0' model='ich9-ehci1'> + <alias name='usb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci1'> + <alias name='usb'/> + <master startport='0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci2'> + <alias name='usb'/> + <master startport='2'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci3'> + <alias name='usb'/> + <master startport='4'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'> + <alias name='pci.0'/> + </controller> + <controller type='ide' index='0'> + <alias name='ide'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller 
type='virtio-serial' index='0'> + <alias name='virtio-serial0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </controller> + <controller type='sata' index='0'> + <alias name='sata0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> + </controller> + <interface type='network'> + <mac address='52:54:00:20:40:9c'/> + <source network='default' portid='77a38fbb-c35e-4f78-b377-e823963eb30e' bridge='virbr0'/> + <target dev='vnet61'/> + <model type='e1000'/> + <alias name='net0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <serial type='pty'> + <source path='/dev/pts/3'/> + <target type='isa-serial' port='0'> + <model name='isa-serial'/> + </target> + <alias name='serial0'/> + </serial> + <console type='pty' tty='/dev/pts/3'> + <source path='/dev/pts/3'/> + <target type='serial' port='0'/> + <alias name='serial0'/> + </console> + <channel type='spicevmc'> + <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> + <alias name='channel0'/> + <address type='virtio-serial' controller='0' bus='0' port='1'/> + </channel> + <input type='tablet' bus='usb'> + <alias name='input0'/> + <address type='usb' bus='0' port='1'/> + </input> + <input type='mouse' bus='ps2'> + <alias name='input1'/> + </input> + <input type='keyboard' bus='ps2'> + <alias name='input2'/> + </input> + <graphics type='spice'> + <listen type='none'/> + <image compression='off'/> + </graphics> + <sound model='ac97'> + <alias name='sound0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </sound> + <audio id='1' type='spice'/> + <video> + <model type='vga' vram='16384' heads='1' primary='yes'/> + <alias name='video0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </video> + <redirdev bus='usb' type='spicevmc'> + <alias name='redir0'/> + <address type='usb' bus='0' port='2'/> + </redirdev> + <redirdev bus='usb' 
type='spicevmc'> + <alias name='redir1'/> + <address type='usb' bus='0' port='3'/> + </redirdev> + <memballoon model='virtio'> + <alias name='balloon0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> + </memballoon> + </devices> + <seclabel type='dynamic' model='apparmor' relabel='yes'> + <label>libvirt-7a7adcc0-889c-4daf-a73b-21a3fac3d8e7</label> + <imagelabel>libvirt-7a7adcc0-889c-4daf-a73b-21a3fac3d8e7</imagelabel> + </seclabel> + <seclabel type='dynamic' model='dac' relabel='yes'> + <label>+64055:+130</label> + <imagelabel>+64055:+130</imagelabel> + </seclabel> +</domain> + + +This bug tracker here is not used anymore. Could you please open a new ticket here: + +https://gitlab.com/qemu-project/qemu/-/issues + +Thanks! + diff --git a/results/classifier/zero-shot/105/semantic/1038 b/results/classifier/zero-shot/105/semantic/1038 new file mode 100644 index 000000000..df38ca8bf --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1038 @@ -0,0 +1,24 @@ +semantic: 0.944 +KVM: 0.827 +device: 0.819 +graphic: 0.780 +mistranslation: 0.736 +other: 0.709 +instruction: 0.675 +vnc: 0.617 +network: 0.606 +socket: 0.430 +boot: 0.429 +assembly: 0.306 + +ppc 'max' CPU model is unlike the other targets 'max' CPU model +Description of problem: +On most targets the 'max' CPU model is either equivalent to 'host' (for KVM) or equivalent to all available CPU features (for TCG). + +On PPC64, however, this is not the case. Instead the 'max' model is an alias of the old '7400_v2.9' and simply doesn't work. + +This is confusing. At the very least the 'max' model alias should be deleted. Ideally a proper replacement would be introduced that matches semantics on other arches. +Steps to reproduce: +1. qemu-system-ppc64 -cpu max + +should be equiv to '-cpu host' or should expose all TCG features. 
diff --git a/results/classifier/zero-shot/105/semantic/1042654 b/results/classifier/zero-shot/105/semantic/1042654 new file mode 100644 index 000000000..c5445b116 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1042654 @@ -0,0 +1,103 @@ +semantic: 0.587 +device: 0.551 +other: 0.543 +network: 0.541 +assembly: 0.524 +instruction: 0.484 +socket: 0.442 +vnc: 0.404 +graphic: 0.379 +mistranslation: 0.335 +boot: 0.314 +KVM: 0.310 + +Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1 + +When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is running on Qemu 1.2 rc, they are not recognized and the network is not working even though I set it correctly (especially the AMD PCnet adapter) +Here's some screenshot of the floppy error: +http://i49.tinypic.com/j77wcw.png + +On Tue, Aug 28, 2012 at 10:29 AM, TC1988 <email address hidden> wrote: +> Public bug reported: +> +> When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is running on Qemu 1.2 rc, they are not recognized and the network is not working even though I set it correctly (especially the AMD PCnet adapter) +> Here's some screenshot of the floppy error: +> http://i49.tinypic.com/j77wcw.png + +Thanks for testing qemu 1.2-rc! + +Can you confirm that both floppy and AMD PCnet worked in QEMU 1.1? + +If so, could you please try running git-bisect(1) to identify which +commit introduced the breakage? + +$ git clone git://git.qemu.org/qemu.git +$ cd qemu +$ git bisect start v1.2-rc1 v1.1.0 + +For more info on git-bisect(1): +http://git-scm.com/book/en/Git-Tools-Debugging-with-Git#Binary-Search +http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html + +Stefan + + +they worked in Qemu 1.1 and also in the previous versions and, by the way, I can't compile, I used a third party Win32 build of Qemu 1.2 rc1. 
:( + +On Wed, Aug 29, 2012 at 1:53 PM, TC1988 <email address hidden> wrote: +> they worked in Qemu 1.1 and also in the previous versions and, by the +> way, I can't compile, I used a third party Win32 build of Qemu 1.2 rc1. +> :( + +Have you tried any other guest operating systems? If there is a more +readily available guest OS that shows the same bug it would allow +others to reproduce and debug this. + +Stefan + +> https://bugs.launchpad.net/bugs/1042654 +> +> Title: +> Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1 +> +> Status in QEMU: +> New +> +> Bug description: +> When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is running on Qemu 1.2 rc, they are not recognized and the network is not working even though I set it correctly (especially the AMD PCnet adapter) +> Here's some screenshot of the floppy error: +> http://i49.tinypic.com/j77wcw.png + + +it does not happen on NT 3.5, 3.51 or 4.0, only on 3.1. + +Found someone who had a copy of NT 3.1 handy and he bisected it to: + +commit 2fee00885a9ea4db69bbfc1ba8ccf95f2ae9aec6 +Author: Pavel Hrdina <email address hidden> +Date: Fri Jun 22 12:33:55 2012 +0200 + + fdc: fix interrupt handling + + If you call the SENSE INTERRUPT STATUS command while there is no interrupt + waiting you get as result unknown command. + + Fixed status0 register handling for read/write/format commands. + + Signed-off-by: Pavel Hrdina <email address hidden> + Signed-off-by: Kevin Wolf <email address hidden> + +nice :) but what about the network? + +It appears that the fdc issue were addressed in this patch series: + + http://thread.gmane.org/gmane.comp.emulators.qemu/168836 + +Unfortunately the URL from comment #7 is dead nowadays ... has this fix been committed to the QEMU repository? + +The floppy fix appears to be commit 34abf9a7 (contained in qemu 1.3). + +I don't think anyone ever looked into the networking problem. + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/1077116 b/results/classifier/zero-shot/105/semantic/1077116 new file mode 100644 index 000000000..111b3e8aa --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1077116 @@ -0,0 +1,113 @@ +semantic: 0.887 +graphic: 0.882 +instruction: 0.881 +device: 0.862 +socket: 0.847 +assembly: 0.822 +network: 0.812 +KVM: 0.798 +other: 0.794 +boot: 0.779 +vnc: 0.718 +mistranslation: 0.696 + +automoc4 segfaults when building in an armhf pbuilder on an amd64 host + +When trying to build kde4libs in an armhf pbuilder created with the pbuilder-scripts running on an amd64 host automoc4 recieves a segmentation fault and I can't get any useful information out of it: + +root@yofel-thinkpad:/tmp/kde4libs-4.9.3/build/kdeui# /usr/bin/automoc4 kdeui_automoc.cpp ../../kdeui/ . moc-qt4 cmake +unable to dump 00102c00 +Segmentation fault (core dumped) +root@yofel-thinkpad:/tmp/kde4libs-4.9.3/build/kdeui# gdb /usr/bin/automoc4 qemu_automoc4_20121108-211818_15839.core +GNU gdb (GDB) 7.5-ubuntu +Copyright (C) 2012 Free Software Foundation, Inc. +License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> +This is free software: you are free to change and redistribute it. +There is NO WARRANTY, to the extent permitted by law. Type "show copying" +and "show warranty" for details. +This GDB was configured as "arm-linux-gnueabihf". +For bug reporting instructions, please see: +<http://www.gnu.org/software/gdb/bugs/>... +Reading symbols from /usr/bin/automoc4...done. +BFD: Warning: /tmp/kde4libs-4.9.3/build/kdeui/qemu_automoc4_20121108-211818_15839.core is truncated: expected core file size >= 5150720, found: 974848. +[New LWP 15839] +[New LWP 15866] +Cannot access memory at address 0xf67fe954 +Cannot access memory at address 0xf67fe950 +(gdb) bt +#0 0xf6630306 in ?? () +#1 0xf6415ff8 in ?? () +#2 0xf6415ff8 in ?? () +Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
+(gdb) + +automoc4 runs fine when building on a nexus7 so this sounds like an issue in qemu. +Tested in quantal and raring. + +ProblemType: Bug +DistroRelease: Ubuntu 13.04 +Package: qemu-user-static 1.2.0-2012.09-0ubuntu1 +Uname: Linux 3.6.2-030602-generic x86_64 +NonfreeKernelModules: nvidia +ApportVersion: 2.6.2-0ubuntu3 +Architecture: amd64 +Date: Fri Nov 9 19:29:28 2012 +EcryptfsInUse: Yes +InstallationDate: Installed on 2011-10-08 (398 days ago) +InstallationMedia: Kubuntu 11.10 "Oneiric Ocelot" - Beta amd64 (20111007) +MarkForUpload: True +ProcEnviron: + SHELL=/bin/bash + TERM=xterm + PATH=(custom, user) + LANG=en_US.UTF-8 + LANGUAGE=en_US.UTF-8 +SourcePackage: qemu-linaro +UpgradeStatus: No upgrade log present (probably fresh install) + + + +This still applies to raring's qemu with the linaro patches. + +Thanks for reporting this bug. There seem to be a few bugs in the armhf qemu-user-static right now. I'll test against bleeding edge upstream. + +Buildlog from an armfh PPA build as reference. 
+ +Same for me + +make[2]: Entering directory `/builddir/build/BUILD/kdelibs-4.10.5/build' + +cd /builddir/build/BUILD/kdelibs-4.10.5/build/kdeui && /usr/bin/automoc4 /builddir/build/BUILD/kdelibs-4.10.5/build/kdeui/kdeui_automoc.cpp /builddir/build/BUILD/kdelibs-4.10.5/kdeui /builddir/build/BUILD/kdelibs-4.10.5/build/kdeui /usr/lib/qt4/bin/moc /usr/bin/cmake + +Unable to load library icui18n "Cannot load library icui18n: (icui18n: cannot open shared object file: No such file or directory)" + +qemu: uncaught target signal 11 (Segmentation fault) - core dumped + +/bin/sh: line 1: 8056 Segmentation fault (core dumped) /usr/bin/automoc4 /builddir/build/BUILD/kdelibs-4.10.5/build/kdeui/kdeui_automoc.cpp /builddir/build/BUILD/kdelibs-4.10.5/kdeui /builddir/build/BUILD/kdelibs-4.10.5/build/kdeui /usr/lib/qt4/bin/moc /usr/bin/cmake + +make[2]: *** [kdeui/CMakeFiles/kdeui_automoc] Error 139 + +make[2]: Leaving directory `/builddir/build/BUILD/kdelibs-4.10.5/build' + +make[1]: *** [kdeui/CMakeFiles/kdeui_automoc.dir/all] Error 2 + +make[1]: Leaving directory `/builddir/build/BUILD/kdelibs-4.10.5/build' + +make: *** [all] Error 2 + +make: Leaving directory `/builddir/build/BUILD/kdelibs-4.10.5/build' + +error: Bad exit status from /var/tmp/rpm-tmp.50015 (%install) + +RPM build errors: + + Bad exit status from /var/tmp/rpm-tmp.50015 (%install) + +I was able to reproduce this failure with QEMU 2.5, and the code runs OK under QEMU current master, so I think this is fixed by the threading/signal handling bugfixes we've done between then and now. I'm going to close this as will-be-fixed-in-2.11 (though it's quite possible it's already fixed in 2.10). + + +We have had a few more issues around armhf qemu-static that mostly resolved in Artful (qemu 2.10) and finally one that was good in Bionic (qemu 2.11). +This also included some updates to other components but should be good now. 
+ +If the issue here really still applies to a newer version please reopen and state an updated test and version that it ran on. + diff --git a/results/classifier/zero-shot/105/semantic/1077838 b/results/classifier/zero-shot/105/semantic/1077838 new file mode 100644 index 000000000..f0de6f9cc --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1077838 @@ -0,0 +1,199 @@ +semantic: 0.713 +other: 0.693 +assembly: 0.614 +instruction: 0.611 +KVM: 0.592 +graphic: 0.565 +device: 0.563 +mistranslation: 0.559 +vnc: 0.483 +boot: 0.445 +socket: 0.325 +network: 0.299 + +qemu-nbd -r -c taints device for subsequent usage, even after -d + +Something about qemu-nbd -r -c /dev/nbd0 someimg leaves cruft behind - subsequent connections get marked readonly. + +This is on quantal, haven't checked precise or raring. + +To demonstrate: +# use one image +qemu-img create -f qcow2 /tmp/1.qcow2 100M +sudo qemu-nbd -c /dev/nbd2 /tmp/1.qcow2 +sudo mkfs -t ext4 /dev/nbd2 +sudo qemu-nbd -d /dev/nbd2 +# use a second one on the same nbd device, shows that reuse works: +qemu-img create -f qcow2 /tmp/2.qcow2 100M +sudo qemu-nbd -c /dev/nbd2 /tmp/2.qcow2 +sudo mkfs -t ext4 /dev/nbd2 +sudo qemu-nbd -d /dev/nbd2 +# connect an image in read only mode +sudo qemu-nbd -r -c /dev/nbd2 /tmp/2.qcow2 +sudo dumpe2fs /dev/nbd2 | head -n 3 +sudo qemu-nbd -d /dev/nbd2 +# now try to reuse in read-write mode again: +qemu-img create -f qcow2 /tmp/3.qcow2 100M +sudo qemu-nbd -c /dev/nbd2 /tmp/3.qcow2 +sudo mkfs -t ext4 /dev/nbd2 +# here it goes boom: +mke2fs 1.42.5 (29-Jul-2012) +/dev/nbd2: Operation not permitted while setting up superblock +# still need to cleanup +sudo qemu-nbd -d /dev/nbd2 + +Happens on Precise as well. 
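For anyone reproducing this, here is a minimal sketch for checking whether the kernel still flags the device as read-only after `qemu-nbd -d`. It assumes the flag is visible at /sys/block/<name>/ro (1 means read-only). `nbd_is_readonly` and the `SYSFS_ROOT` override are hypothetical names introduced only for this illustration, and whether a stale flag is really what makes mkfs fail here is an assumption at this point in the thread.

```shell
#!/bin/sh
# Sketch only: report whether a block device is currently flagged read-only.
# Assumption: the kernel exposes the flag at /sys/block/<name>/ro (1 = read-only).
# SYSFS_ROOT is a hypothetical override so the helper can be exercised
# against a scratch directory instead of the real sysfs.
nbd_is_readonly() {
    name=$(basename "$1")
    [ "$(cat "${SYSFS_ROOT:-/sys}/block/$name/ro" 2>/dev/null)" = "1" ]
}

# Example use after disconnecting (needs the nbd module loaded):
#   if nbd_is_readonly /dev/nbd2; then echo "nbd2 is still marked read-only"; fi
```

If the helper reports read-only on a disconnected device, `sudo blockdev --setrw /dev/nbd2` (util-linux) may clear the flag as a stopgap. That workaround has not been verified in this report.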
+ +Quick code read - I think that this block: + if (flags & NBD_FLAG_READ_ONLY) { + int read_only = 1; + TRACE("Setting readonly attribute"); + + if (ioctl(fd, BLKROSET, (unsigned long) &read_only) < 0) { + int serrno = errno; + LOG("Failed setting read-only attribute"); + return -serrno; + } + } + +in nbd.c should be + { + int read_only = 0; + if (flags & NBD_FLAG_READ_ONLY) + read_only = 1; + TRACE("Setting readonly attribute"); + if (ioctl(fd, BLKROSET, (unsigned long) &read_only) < 0) { + int serrno = errno; + LOG("Failed setting read-only attribute"); + return -serrno; + } + } + + +http://paste.ubuntu.com/1352684/ is a debdiff, uploading the source format 3 patch as well + + + + + +Fixed patch - I had my sense inverted... http://paste.ubuntu.com/1352711/ + +Thanks, this still applies upstream as well. + +To some extent it is a bug in the upstream kernel, which doesn't reset state properly. However, the qemu patch is also good. Thanks! + +Thanks, Paul, I'll cherrypick commit c8969eded252058e90e91f12f75f32aceae46ec9 into the ubuntu packages + +This bug was fixed in the package qemu-kvm - 1.2.0+noroms-0ubuntu4 + +--------------- +qemu-kvm (1.2.0+noroms-0ubuntu4) raring; urgency=low + + [ Serge Hallyn ] + * debian/qemu-kvm.postinst: use udevadm trigger to change /dev/kvm perms as + recommended by Steve Langasek (LP: #1057024) + * apply debian/patches/nbd-fixes-to-read-only-handling.patch from upstream to + make read-write mount after read-only mount work. (LP: #1077838) + + [ Robert Collins ] + * Fix upstart job to succeed if ksm settings can't be altered in the same way + other settings are handled. (LP: #1078530) + -- Serge Hallyn <email address hidden> Wed, 14 Nov 2012 11:30:14 -0600 + +Hello Robert, or anyone else affected, + +Accepted qemu-kvm into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu-kvm/1.0+noroms-0ubuntu14.4 in a few hours, and then in the -proposed repository. 
+ +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. + +If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Hello Robert, or anyone else affected, + +Accepted qemu-kvm into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu-kvm/1.0+noroms-0ubuntu14.5 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. + +If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Hello Robert, or anyone else affected, + +Accepted qemu-kvm into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu-kvm/1.2.0+noroms-0ubuntu2.12.10.1 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. 
+ +If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Verified on precise. + +Verified on quantal. + +Hello Robert, or anyone else affected, + +Accepted qemu-kvm into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu-kvm/1.0+noroms-0ubuntu14.6 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. + +If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Re-verified in precise. + +The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions. 
+ +This bug was fixed in the package qemu-kvm - 1.2.0+noroms-0ubuntu2.12.10.1 + +--------------- +qemu-kvm (1.2.0+noroms-0ubuntu2.12.10.1) quantal-proposed; urgency=low + + [ Serge Hallyn ] + * debian/qemu-kvm.postinst: use udevadm trigger to change /dev/kvm perms as + recommended by Steve Langasek (LP: #1057024) + * apply debian/patches/nbd-fixes-to-read-only-handling.patch from upstream to + make read-write mount after read-only mount work. (LP: #1077838) + * make qemu-kvm depend on udev (LP: #1080912) + + [ Robert Collins ] + * Fix upstart job to succeed if ksm settings can't be altered in the same way + other settings are handled. (LP: #1078530) + -- Serge Hallyn <email address hidden> Mon, 19 Nov 2012 09:15:42 -0600 + +This bug was fixed in the package qemu-kvm - 1.0+noroms-0ubuntu14.6 + +--------------- +qemu-kvm (1.0+noroms-0ubuntu14.6) precise-proposed; urgency=low + + * Fix qemu-kvm.upstart: just don't run in a container. Otherwise we'll + still try to load/unload kernel modules. Also undo the || true after + sysfs writes. Since setting those is a part of configuring qemu-kvm + on the host, failing when they fail makes sense. + +qemu-kvm (1.0+noroms-0ubuntu14.5) precise-proposed; urgency=low + + * add udev to qemu-kvm Depends to ensure that postinst succeeds. + (LP: #1080912) + +qemu-kvm (1.0+noroms-0ubuntu14.4) precise-proposed; urgency=low + + [ Serge Hallyn ] + * debian/qemu-kvm.postinst: use udevadm trigger to change /dev/kvm perms as + recommended by Steve Langasek (LP: #1057024) + * apply debian/patches/nbd-fixes-to-read-only-handling.patch from upstream to + make read-write mount after read-only mount work. (LP: #1077838) + + [ Robert Collins ] + * Fix upstart job to succeed if ksm settings can't be altered in the same way + other settings are handled. 
(LP: #1078530) + -- Serge Hallyn <email address hidden> Thu, 20 Dec 2012 12:34:52 -0600 + +According to comment #9 this bug has been fixed by this commit here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=c8969eded252058 +... so I think it should be OK to close this bug now. + diff --git a/results/classifier/zero-shot/105/semantic/1094564 b/results/classifier/zero-shot/105/semantic/1094564 new file mode 100644 index 000000000..d200f02b3 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1094564 @@ -0,0 +1,367 @@ +semantic: 0.818 +graphic: 0.806 +instruction: 0.790 +boot: 0.784 +assembly: 0.783 +mistranslation: 0.782 +other: 0.778 +network: 0.775 +device: 0.766 +vnc: 0.712 +socket: 0.665 +KVM: 0.521 + +images used as scsi disks not readable (qemu-system-arm, macos 10.8) + +Using a arm1176 kernel and the raspbian image (10-28 or 12-16) as my disk, I get as far as mounting root and then get SCSI errors with 1.3.0 and the current origin/master. git bisect says the issue is + +commit f563a5d7a820424756f358e747238f03e866838a +Merge: a273652 aee0bf7 +Author: Paolo Bonzini <email address hidden> +Date: Wed Oct 31 10:42:51 2012 +0100 + + Merge remote-tracking branch 'origin/master' into threadpool + + Signed-off-by: Paolo Bonzini <email address hidden> + + +I am using: +qemu-system-arm -no-reboot -M versatilepb -cpu arm1176 -m 256 -hda 2012-12-16-wheezy-raspbian.img -kernel kernel-qemu -append "root=/dev/sda2 rootfstype=ext4 elevator=deadline rootwait panic=1" -serial stdio -usbdevice tablet -net nic -net user,hostfwd=tcp::40022-:22 + +Configured on MacOS 10.8.2 with current Xcode and MacPorts installed, thus: +CPATH=/opt/local/include CFLAGS="-pipe -O2 -arch x86_64" CPPFLAGS="-I/opt/local/include" CXXFLAGS="-pipe -O2 -arch x86_64" LIBRARY_PATH="/opt/local/lib" MACOSX_DEPLOYMENT_TARGET="10.8" CXX="/usr/bin/clang++" LDFLAGS="-L/opt/local/lib -arch x86_64" OBJC=/usr/bin/clang FCFLAGS="-pipe -O2 -m64" INSTALL="/usr/bin/install -c" OBJCFLAGS="-pipe -O2 -arch 
x86_64" CC="/usr/bin/clang" ./configure --prefix=/opt/local --cpu=x86_64 --cc=/usr/bin/clang --objcc=/usr/bin/clang --host-cc=/usr/bin/clang --python=/opt/local/bin/python2.7 --target-list=arm-softmmu + +I duplicated this as well. I tried both the qemu-system-arm available in macports and also from homebrew with the same results. Host system is also 10.8 "mountain lion". + +My boot command: qemu-system-arm -kernel kernel/zImage -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=1" -hda pifi-4g.img -redir tcp:5022::22 + +I'm running QEMU emulator version 1.4.1 on OS X 10.8.3 +Interestingly, this exact same boot command works perfectly on my Ubuntu virtual (in virtualbox) running QEMU emulator version 1.4.0 (Debian 1.4.0+dfsg-1expubuntu4) + +So the bug would seem to either be specific to OS X or to the 1.4.1 release. + +See the attached screenshot to see what happens in the boot process. + +I managed to capture a little more info about this bug by passing -drive file='myharddrive.img'. The kernel panic is happening in the sym53c8xx driver. See the attached screenshot for detail. + +I can also attach the kernel that I'm using if needed. Just let me know. + + +I suspect this may be because we were defaulting to a broken coroutine backend (a bug fixed with commit 7c2acc7). Can you retry with the current 1.5 release candidate? (source download available at http://wiki.qemu.org/Download) + + +Hi Peter, + +Thanks, that made an improvement. Now I'm just stuck in a loop of the kernel resetting the scsi bus :) + +(see attachment) + +And the same QEMU/kernel/image works fine on a Linux host? + +If you can provide the files I need to reproduce I might be able to take a look at it. (If it did the same thing on linux host that would be higher priority for me, so if you can cross-check that would be helpful.) + + +I just compiled 1.5.0-rc1 on my Linux host with the same configure/compiler flags and duplicated the error (see screenshot). 
The configure flags are: +./configure --disable-guest-agent --disable-bsd-user --enable-sdl --target-list="arm-linux-user armeb-linux-user arm-softmmu" + +As before, it goes into an infinite loop of resetting the scsi controller for each (emulated) channel. Note that the Ubuntu-provided qemu-system-arm works perfectly. They are using 1.4.0 with a rather large number of patches (I did a `dpkg source qemu` to examine the Ubuntu build setup). + +Looking through the Ubuntu patches, this one looks like a likely fix: +dpkg-source: info: applying patches-arm-1.4.0/0002-hw-sd-Expose-sd_reset-as-public-function.patch + +Please find a zip of my raspberrypi image (hda) and the kernel that I built from https://github.com/raspberrypi/linux.git +at https://www.dropbox.com/sh/mbz8jh4fcjvdj4m/Gh3bKFyJyC + +I included the boot command in a txt file in the tarball. + +cheers, +Joss + + + +It's very unlikely to be the patch you mention, since that's for SD card emulation and you're not using SD card emulation. It's probably just a regression between 1.4 and 1.5, and I'm fairly sure it's in some changes I made to the versatilepb PCI controller model -- I will investigate. + + +Ah, I interpreted it to mean "scsi disk" instead of SD card :) + +I'll leave this to the experts. Thanks so much for looking into this and please let me know if I can be of further assistance. + +-Joss + +Hi Peter, + +Thanks so much for the patch and including me on the thread. I can confirm that it did fix the problem running on a Linux host, but the OS X bug cited by myself and the OP still remains elusive. It's rather puzzling as I pulled from HEAD and built using the same options on both. I've gotten a bit better with the qemu options now, so I will paste the console output here instead of doing yet another screenshot :) As you can see, it's still getting a fatal exception in the interrupt code. 
Do you know of a kernel version that would be better behaved than the 3.6.11+ from the "raspberrypi/linux" repo on github? Could I provide a core file that would help? + +Thanks again for your efforts. +Joss + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +phoenix:RaspberryPi mysfitt$ qemu-system-arm -kernel kernel/zImage -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=1 console=ttyAMA0" -redir tcp:5022::22 -bt hci,null -global versatile_pci.broken-irq-mapping=1 pifi-4g.img +Uncompressing Linux... done, booting the kernel. +Booting Linux on physical CPU 0 +Linux version 3.6.11+ (root@jossibox) (gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1) ) #1 Fri May 10 16:46:40 EDT 2013 +CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d +CPU: VIPT aliasing data cache, unknown instruction cache +Machine: ARM-Versatile PB +Memory policy: ECC disabled, Data cache writeback +sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956ms +Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 65024 +Kernel command line: root=/dev/sda2 panic=1 console=ttyAMA0 +PID hash table entries: 1024 (order: 0, 4096 bytes) +Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) +Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) +Memory: 256MB = 256MB total +Memory: 255388k/255388k available, 6756k reserved, 0K highmem +Virtual kernel memory layout: + vector : 0xffff0000 - 0xffff1000 ( 4 kB) + fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB) + vmalloc : 0xd0800000 - 0xff000000 ( 744 MB) + lowmem : 0xc0000000 - 0xd0000000 ( 256 MB) + modules : 0xbf000000 - 0xc0000000 ( 16 MB) + .text : 0xc0008000 - 0xc03f6458 (4026 kB) + .init : 0xc03f7000 - 0xc04162bc ( 125 kB) + .data : 0xc0418000 - 0xc043fb60 ( 159 kB) + .bss : 0xc043fb84 - 0xc045abb0 ( 109 kB) +NR_IRQS:192 +VIC @f1140000: id 0x00041190, vendor 0x41 +FPGA IRQ chip 0 "SIC" @ f1003000, 21 irqs +Console: colour dummy device 80x30 +Calibrating delay loop... 626.68 BogoMIPS (lpj=3133440) +pid_max: default: 32768 minimum: 301 +Mount-cache hash table entries: 512 +CPU: Testing write buffer coherency: ok +Setting up static identity map for 0x305ce0 - 0x305d3c +devtmpfs: initialized +NET: Registered protocol family 16 +DMA: preallocated 256 KiB pool for atomic coherent allocations +Serial: AMBA PL011 UART driver +dev:f1: ttyAMA0 at MMIO 0x101f1000 (irq = 12) is a PL011 rev1 +console [ttyAMA0] enabled +dev:f2: ttyAMA1 at MMIO 0x101f2000 (irq = 13) is a PL011 rev1 +dev:f3: ttyAMA2 at MMIO 0x101f3000 (irq = 14) is a PL011 rev1 +fpga:09: ttyAMA3 at MMIO 0x10009000 (irq = 38) is a PL011 rev1 +PCI core found (slot 11) +PCI host bridge to bus 0000:00 +pci_bus 0000:00: root bus resource [io 0x0000-0xffff] +pci_bus 0000:00: root bus resource [mem 0x50000000-0x5fffffff] +pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff pref] +pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff] +PCI: bus0: Fast back to back transfers disabled +pci 0000:00:0c.0: BAR 2: assigned [mem 
0x50000000-0x50001fff] +pci 0000:00:0c.0: BAR 1: assigned [mem 0x50002000-0x500023ff] +pci 0000:00:0c.0: BAR 0: can't assign io (size 0x100) +bio: create slab <bio-0> at 0 +vgaarb: loaded +SCSI subsystem initialized +Switching to clocksource timer3 +NET: Registered protocol family 2 +TCP established hash table entries: 8192 (order: 4, 65536 bytes) +TCP bind hash table entries: 8192 (order: 3, 32768 bytes) +TCP: Hash tables configured (established 8192 bind 8192) +TCP: reno registered +UDP hash table entries: 256 (order: 0, 4096 bytes) +UDP-Lite hash table entries: 256 (order: 0, 4096 bytes) +NET: Registered protocol family 1 +RPC: Registered named UNIX socket transport module. +RPC: Registered udp transport module. +RPC: Registered tcp transport module. +RPC: Registered tcp NFSv4.1 backchannel transport module. +NetWinder Floating Point Emulator V0.97 (double precision) +Installing knfsd (copyright (C) 1996 <email address hidden>). +jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc. +ROMFS MTD (C) 2007 Red Hat, Inc. +fuse init (API version 7.20) +msgmni has been set to 498 +Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) +io scheduler noop registered +io scheduler deadline registered +io scheduler cfq registered (default) +clcd-pl11x dev:20: PL110 rev0 at 0x10120000 +clcd-pl11x dev:20: Versatile hardware, VGA display +Console: switching to colour frame buffer device 80x30 +brd: module loaded +PCI: enabling device 0000:00:0c.0 (0100 -> 0102) +sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 27 +sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking +sym0: SCSI BUS has been reset. +scsi0 : sym-2.2.3 +sym0: unknown interrupt(s) ignored, ISTAT=0x5 DSTAT=0x80 SIST=0x0 +scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 1.4. PQ: 0 ANSI: 5 +scsi target0:0:0: tagged command queuing enabled, command queue depth 16. 
+scsi target0:0:0: Beginning Domain Validation +scsi target0:0:0: Domain Validation skipping write tests +scsi target0:0:0: Ending Domain Validation +scsi 0:0:2:0: CD-ROM QEMU QEMU CD-ROM 1.4. PQ: 0 ANSI: 5 +scsi target0:0:2: tagged command queuing enabled, command queue depth 16. +scsi target0:0:2: Beginning Domain Validation +scsi target0:0:2: Domain Validation skipping write tests +scsi target0:0:2: Ending Domain Validation +sr0: scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray +cdrom: Uniform CD-ROM driver Revision: 3.20 +sd 0:0:0:0: [sda] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB) +sd 0:0:0:0: [sda] Write Protect is off +sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA +physmap platform flash device: 04000000 at 34000000 +physmap-flash.0: Found 1 x32 devices at 0x0 in 32-bit bank. Manufacturer ID 0x000000 Chip ID 0x000000 +Intel/Sharp Extended Query Table at 0x0031 +Using buffer write method + sda: sda1 sda2 sda3 sda4 +smc91x.c: v1.1, sep 22 2004 by Nicolas Pitre <email address hidden> +eth0: SMC91C11xFD (rev 1) at d09ca000 IRQ 25 [nowait] +eth0: Ethernet addr: 52:54:00:12:34:56 +sd 0:0:0:0: [sda] Attached SCSI disk +mousedev: PS/2 mouse device common for all mice +TCP: cubic registered +NET: Registered protocol family 17 +VFP support v0.3: implementor 41 architecture 1 part 20 variant b rev 5 +input: AT Raw Set 2 keyboard as /devices/fpga:06/serio0/input/input0 +input: ImExPS/2 Generic Explorer Mouse as /devices/fpga:07/serio1/input/input1 +EXT3-fs (sda2): error: couldn't mount because of unsupported optional features (240) +EXT2-fs (sda2): error: couldn't mount because of unsupported optional features (240) +EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null) +VFS: Mounted root (ext4 filesystem) readonly on device 8:2. +devtmpfs: mounted +Freeing init memory: 124K +sd 0:0:0:0: [sda] ABORT operation started +scsi target0:0:0: control msgout: + 80 20 51 d. 
+sd 0:0:0:0: ABORT operation complete. +Unable to handle kernel NULL pointer dereference at virtual address 00000358 +pgd = c0004000 +[00000358] *pgd=00000000 +Internal error: Oops: 5 [#1] ARM +Modules linked in: +CPU: 0 Not tainted (3.6.11+ #1) +PC is at sym_interrupt+0x7c8/0x1b88 +LR is at sym53c8xx_intr+0x40/0x7c +pc : [<c02193a0>] lr : [<c0214e0c>] psr: 80000193 +sp : c0419e30 ip : cf844800 fp : 00000001 +r10: cf935400 r9 : c043fb00 r8 : d0804084 +r7 : 00000012 r6 : c045588c r5 : 00000000 r4 : d0804000 +r3 : 00000008 r2 : 0000000d r1 : 00000000 r0 : 00000000 +Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel +Control: 00c5387d Table: 0fb40008 DAC: 00000017 +Process swapper (pid: 0, stack limit = 0xc0418268) +Stack: (0xc0419e30 to 0xc041a000) +9e20: 1be06241 00000000 00000001 00200200 +9e40: cf997af0 c004256c cf997ac0 cf844800 c0427898 c043fae4 c0418000 00000100 +9e60: c0419e7c cf9cbca0 c045587c 00000000 00000000 0000001b c043fb00 c0428e54 +9e80: 00000001 c0214e0c 00000001 00000080 0000001b cf9cbca0 0000001b c0054620 +9ea0: c0445420 c0419ec0 00000000 c0428e54 0000001b 00000000 c043fee8 00000000 +9ec0: cf997ac0 c0307e44 c0419f74 c00547a8 c0428e54 c0056860 c04321e8 c0053fe4 +9ee0: 000000c0 c001470c c043fee8 c0419f10 00000000 c00084f8 c003f840 20000013 +9f00: ffffffff c0419f44 c04223d8 c00134c0 00000000 00000000 00000002 cfb20c8c +9f20: cfb20c60 cf997ac0 00000001 c0427898 c04223d8 cf997ac0 c0307e44 c0419f74 +9f40: 00000000 c0419f58 c030536c c003f840 20000013 ffffffff 00000000 00000000 +9f60: c0427898 c0418000 c0427898 c04223d8 c0419fa4 c030536c c04230c0 cfb20c60 +9f80: c0423348 c0418000 c0418000 c043fc68 c0418000 c04230c0 410fb767 c0423348 +9fa0: 00000000 c0014a18 c0425fbc c04200d0 ffffffff c04123fc c065b880 00004008 +9fc0: 00410e7c c03f771c ffffffff ffffffff c03f728c 00000000 00000000 c04123fc +9fe0: 00000000 00c5387d c042004c c04123f8 c04230b4 00008040 00000000 00000000 +[<c02193a0>] (sym_interrupt+0x7c8/0x1b88) from [<c0214e0c>] 
(sym53c8xx_intr+0x40/0x7c) +[<c0214e0c>] (sym53c8xx_intr+0x40/0x7c) from [<c0054620>] (handle_irq_event_percpu+0x50/0x1b0) +[<c0054620>] (handle_irq_event_percpu+0x50/0x1b0) from [<c00547a8>] (handle_irq_event+0x28/0x38) +[<c00547a8>] (handle_irq_event+0x28/0x38) from [<c0056860>] (handle_level_irq+0x80/0xd4) +[<c0056860>] (handle_level_irq+0x80/0xd4) from [<c0053fe4>] (generic_handle_irq+0x24/0x38) +[<c0053fe4>] (generic_handle_irq+0x24/0x38) from [<c001470c>] (handle_IRQ+0x30/0x84) +[<c001470c>] (handle_IRQ+0x30/0x84) from [<c00084f8>] (vic_handle_irq+0x58/0x98) +[<c00084f8>] (vic_handle_irq+0x58/0x98) from [<c00134c0>] (__irq_svc+0x40/0x54) +Exception stack(0xc0419f10 to 0xc0419f58) +9f00: 00000000 00000000 00000002 cfb20c8c +9f20: cfb20c60 cf997ac0 00000001 c0427898 c04223d8 cf997ac0 c0307e44 c0419f74 +9f40: 00000000 c0419f58 c030536c c003f840 20000013 ffffffff +[<c00134c0>] (__irq_svc+0x40/0x54) from [<c003f840>] (finish_task_switch.constprop.68+0x78/0xec) +[<c003f840>] (finish_task_switch.constprop.68+0x78/0xec) from [<c030536c>] (__schedule+0x1a0/0x3bc) +[<c030536c>] (__schedule+0x1a0/0x3bc) from [<c0014a18>] (cpu_idle+0xa4/0xc0) +[<c0014a18>] (cpu_idle+0xa4/0xc0) from [<c03f771c>] (start_kernel+0x26c/0x2bc) +Code: e5d42540 e3a03008 e5c43540 e5842550 (e5951358) +---[ end trace 25ce2cfc77dea57b ]--- +Kernel panic - not syncing: Fatal exception in interrupt + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On May 14, 2013, at 6:58 AM, Peter Maydell <email address hidden> wrote: + +> It's very unlikely to be the patch you mention, since that's for SD card +> emulation and you're not using SD card emulation. It's probably just a +> regression between 1.4 and 1.5, and I'm fairly sure it's in some changes +> I made to the versatilepb PCI controller model -- I will investigate. +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. 
+> https://bugs.launchpad.net/bugs/1094564 +> +> Title: +> images used as scsi disks not readable (qemu-system-arm, macos 10.8) +> +> Status in The MacPorts Project: +> New +> Status in QEMU: +> New +> +> Bug description: +> Using a arm1176 kernel and the raspbian image (10-28 or 12-16) as my +> disk, I get as far as mounting root and then get SCSI errors with +> 1.3.0 and the current origin/master. git bisect says the issue is +> +> commit f563a5d7a820424756f358e747238f03e866838a +> Merge: a273652 aee0bf7 +> Author: Paolo Bonzini <email address hidden> +> Date: Wed Oct 31 10:42:51 2012 +0100 +> +> Merge remote-tracking branch 'origin/master' into threadpool +> +> Signed-off-by: Paolo Bonzini <email address hidden> +> +> +> I am using: +> qemu-system-arm -no-reboot -M versatilepb -cpu arm1176 -m 256 -hda 2012-12-16-wheezy-raspbian.img -kernel kernel-qemu -append "root=/dev/sda2 rootfstype=ext4 elevator=deadline rootwait panic=1" -serial stdio -usbdevice tablet -net nic -net user,hostfwd=tcp::40022-:22 +> +> Configured on MacOS 10.8.2 with current Xcode and MacPorts installed, thus: +> CPATH=/opt/local/include CFLAGS="-pipe -O2 -arch x86_64" CPPFLAGS="-I/opt/local/include" CXXFLAGS="-pipe -O2 -arch x86_64" LIBRARY_PATH="/opt/local/lib" MACOSX_DEPLOYMENT_TARGET="10.8" CXX="/usr/bin/clang++" LDFLAGS="-L/opt/local/lib -arch x86_64" OBJC=/usr/bin/clang FCFLAGS="-pipe -O2 -m64" INSTALL="/usr/bin/install -c" OBJCFLAGS="-pipe -O2 -arch x86_64" CC="/usr/bin/clang" ./configure --prefix=/opt/local --cpu=x86_64 --cc=/usr/bin/clang --objcc=/usr/bin/clang --host-cc=/usr/bin/clang --python=/opt/local/bin/python2.7 --target-list=arm-softmmu +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/macports/+bug/1094564/+subscriptions + + + +On 15 May 2013 19:02, Joss Reeves <email address hidden> wrote: +> Thanks so much for the patch and including me on the thread. 
I can +> confirm that it did fix the problem running on a Linux host, but the OS +> X bug cited by myself and the OP still remains elusive. It's rather +> puzzling as I pulled from HEAD and built using the same options on both. + +QEMU itself actually hangs in my tests (the main loop is waiting +to lock the iothread but it never does; the cpu thread seems to +be stuck trying to do a bdrv_aio_cancel for the scsi device model). +This should never happen, regardless of what the guest does... + +I suspect that if you configure on linux with --with-coroutine=sigaltstack +you might then find they both behave the same (MacOSX can't do the +ucontext coroutines we default to on linux). OTOH it might also +involve some of MacOSX's slightly different signal behaviour. + +I'll continue to prod, though past experience is that MacOSX +gdb is weirdly broken and things behave differently when run +under it, which doesn't help :-( + +-- PMM + + +On 15 May 2013 21:18, Peter Maydell <email address hidden> wrote: +> I suspect that if you configure on linux with --with-coroutine=sigaltstack +> you might then find they both behave the same (MacOSX can't do the +> ucontext coroutines we default to on linux). + +They don't, so it's a MacOSX specific issue of some kind. +PS: you don't need "-global versatile_pci.broken-irq-mapping=1" +in the command line because we do correctly autodetect and +handle your kernel now. + +thanks +-- PMM + + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? 
+ diff --git a/results/classifier/zero-shot/105/semantic/1094786 b/results/classifier/zero-shot/105/semantic/1094786 new file mode 100644 index 000000000..590343d40 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1094786 @@ -0,0 +1,88 @@ +semantic: 0.924 +instruction: 0.918 +graphic: 0.915 +assembly: 0.901 +device: 0.897 +other: 0.883 +vnc: 0.878 +socket: 0.860 +network: 0.847 +KVM: 0.830 +boot: 0.830 +mistranslation: 0.810 + +static build with curses fails if requires -ltinfo + +On my system (amd64 Debian wheezy/sid) a static ncurses build requires -ltinfo: +$ pkg-config --libs --static ncurses +-lncurses -ltinfo + +$ ../../configure --enable-curses --static +# Actually this fails on the line + if compile_prog "" "$curses_lib" ; then +# with +ERROR +ERROR: User requested feature curses +ERROR: configure was not able to find it +ERROR +# but if we add -ltinfo to this line the check succeeds +... +static build yes +... + +$ make +... +... + CC i386-softmmu/hw/i386/../kvm/pci-assign.o + LINK i386-softmmu/qemu-system-i386 +../os-posix.o: In function `change_process_uid': +/home/vadim/soft/qemu/os-posix.c:205: warning: Using 'initgroups' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking # and many similar warnings +... +../ui/curses.o: In function `curses_cursor_position': +/home/vadim/soft/qemu/ui/curses.c:137: undefined reference to `COLS' +/home/vadim/soft/qemu/ui/curses.c:137: undefined reference to `LINES' +/home/vadim/soft/qemu/ui/curses.c:138: undefined reference to `stdscr' +/home/vadim/soft/qemu/ui/curses.c:139: undefined reference to `curs_set' +../ui/curses.o: In function `curses_calc_pad': +/home/vadim/soft/qemu/ui/curses.c:68: undefined reference to `stdscr' +/home/vadim/soft/qemu/ui/curses.c:69: undefined reference to `stdscr' +... and so on + +I tried to build the very minimal static qemu executable. 
The actual configure line I tried first was +../../configure --target-list=i386-softmmu --disable-sdl --disable-virtfs --disable-vnc --disable-xen --disable-brlapi --disable-bluez --disable-slirp --disable-kvm --disable-user --disable-vde --disable-vhost-net --disable-spice --disable-libiscsi --disable-smartcard --disable-usb-redir --disable-guest-agent --audio-drv-list= --audio-card-list= --enable-curses --static + +and the errors were the same. + +I can reproduce this issue. + +I tried + +./configure --static --target-list="x86_64-softmmu" --enable-curse + +I get + +ERROR +ERROR: User requested feature curses +ERROR: configure was not able to find it +ERROR + +Please try qemu.git/master. + +If the error still occurs, please attach config.log. + +The problem may have to do with the way ./configure compile_prog and pkg_config interact with the --static option. The --static option is supposed to set up LDFLAGS -static and pkg-config --static. + +The curses probing code tries building -lncurses, -lcurses, and finally pkg-config ncurses. Try the following change: +curses_list="$($pkg_config --libs ncurses 2>/dev/null):-lncurses:-lcurses" + +That will probe pkg-config ncurses first. + +I ran into the same issue on FreeBSD, and just posted my patch to the qemu-devel list. It's the same solution stefanha describes above. + +(On FreeBSD we have an additional issue; we don't ship the .pc file with the ncurses port right now. I just hacked one together to include -ltinfo in Libs.private.) + + +The patch has been included here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=cfeda5f4b8710b6ba14 +So I think we can now mark this ticket as "Fix released". 
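For reference, a small sketch of the probing order suggested in this thread, with the pkg-config answer tried before the bare -lncurses and -lcurses fallbacks. This is not the actual configure code: `curses_candidates` and `list_candidates` are hypothetical helper names, and `pkg_config` is assumed to hold the pkg-config command as in the quoted `curses_list` line.

```shell
#!/bin/sh
# Sketch only, not taken from QEMU's configure script.
# pkg_config holds the command used to query pkg-config (assumed default below).
pkg_config=${pkg_config:-pkg-config}

# Build a colon-separated candidate list with the pkg-config answer first,
# mirroring the curses_list assignment quoted in the thread.
curses_candidates() {
    printf '%s:-lncurses:-lcurses\n' "$($pkg_config --libs ncurses 2>/dev/null)"
}

# Print one candidate per line in probing order. An empty first entry
# (pkg-config missing or ncurses.pc not installed) is skipped.
list_candidates() {
    old_ifs=$IFS
    IFS=:
    for lib in $(curses_candidates); do
        [ -n "$lib" ] && printf '%s\n' "$lib"
    done
    IFS=$old_ifs
}
```

Each printed candidate would then be handed to the link test in turn, so a system whose pkg-config reports `-lncurses -ltinfo` gets that combination tried first.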
+ diff --git a/results/classifier/zero-shot/105/semantic/1102027 b/results/classifier/zero-shot/105/semantic/1102027 new file mode 100644 index 000000000..030e7b2fe --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1102027 @@ -0,0 +1,125 @@ +semantic: 0.697 +device: 0.668 +assembly: 0.664 +graphic: 0.659 +other: 0.655 +instruction: 0.646 +network: 0.638 +socket: 0.626 +vnc: 0.607 +KVM: 0.603 +boot: 0.595 +mistranslation: 0.576 + +QED Time travel + +This night after a reboot of a VM, it was back to 8 Oct. 2012, i've lost all data between 8 Oct 2012 and now. I've check the QED file and mount on another VM, all seems OK. + +This QED has a raw backfile with the base OS (debian) shared with many others QED. It has NO snapshot. + +QEMU emulator version 1.1.2 + +Does anyone have a hint ? + +On Sun, Jan 20, 2013 at 11:54:33AM -0000, Mekza wrote: +> Public bug reported: +> +> This night after a reboot of a VM, it was back to 8 Oct. 2012, i've lost +> all data between 8 Oct 2012 and now. I've check the QED file and mount +> on another VM, all seems OK. + +Hi Mekza, +Are you able to reproduce this issue or did this happen once only? + +Does "all seems OK" mean that you still only saw the files from 8 Oct +2012 when you attached the image to another VM? + +> Does anyone have a hint ? + +There's not a lot of information here to go by. If the issue is +reproducible it should be possible to collect more information, starting +with the steps to reproduce the issue. + +Stefan + + +On Tue, Jan 29, 2013 at 1:46 PM, Stefan Hajnoczi <<email address hidden> +> wrote: + +> On Sun, Jan 20, 2013 at 11:54:33AM -0000, Mekza wrote: +> > Public bug reported: +> > +> > This night after a reboot of a VM, it was back to 8 Oct. 2012, i've lost +> > all data between 8 Oct 2012 and now. I've check the QED file and mount +> > on another VM, all seems OK. +> +> Hi Mekza, +> Are you able to reproduce this issue or did this happen once only? 
+> + +Hi Stefan, + +I already had this bug once and after a unmeasurable period (days or even +weeks) and a reboot of the VM, the FS was back. + + +> +> Does "all seems OK" mean that you still only saw the files from 8 Oct +> 2012 when you attached the image to another VM? +> + +"all seems OK" means the QED file is not corrupted and consistent. I still +have files from 8 Oct 2012. + + +> +> > Does anyone have a hint ? +> +> There's not a lot of information here to go by. If the issue is +> reproducible it should be possible to collect more information, starting +> with the steps to reproduce the issue. +> + +I know, i'm gonna copy the QED and then convert to RAW and attach it to a +new VM. I'll keep you in touch. + +> +> Stefan +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1102027 +> +> Title: +> QED Time travel +> +> Status in QEMU: +> New +> +> Bug description: +> This night after a reboot of a VM, it was back to 8 Oct. 2012, i've +> lost all data between 8 Oct 2012 and now. I've check the QED file and +> mount on another VM, all seems OK. +> +> This QED has a raw backfile with the base OS (debian) shared with many +> others QED. It has NO snapshot. +> +> QEMU emulator version 1.1.2 +> +> Does anyone have a hint ? +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1102027/+subscriptions +> + + + +-- +Martin-Zack Mekkaoui + + +Have you ever been able to reproduce this issue? + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/1152 b/results/classifier/zero-shot/105/semantic/1152 new file mode 100644 index 000000000..73373f703 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1152 @@ -0,0 +1,41 @@ +semantic: 0.913 +instruction: 0.878 +other: 0.854 +KVM: 0.763 +graphic: 0.740 +device: 0.624 +boot: 0.565 +vnc: 0.421 +socket: 0.384 +mistranslation: 0.380 +network: 0.323 +assembly: 0.150 + +Windows crashes on resuming from sleep if hv-tlbflush is enabled +Description of problem: +The above steps cause my Windows VM to BSOD immediately upon waking up (even before restarting the display driver in my case). +Steps to reproduce: +1. Boot Windows +2. Tell Windows to go to sleep (observe that qemu's state switches to suspended) +3. Cause windows to wake up (e.g. using the `system_wakeup` HMP command) +Additional information: +Looking at the crash dumps always shows the "ATTEMPTED WRITE TO READONLY MEMORY" error, and always with this stack trace: + +``` +nt!KeBugCheckEx +nt!MiRaisedIrqlFault+0x1413a6 +nt!MmAccessFault+0x4ef +nt!KiPageFault+0x35e +nt!MiIncreaseUsedPtesCount+0x12 +nt!MiBuildForkPte+0xc6 +nt!MiCloneVads+0x4ab +nt!MiCloneProcessAddressSpace+0x261 +nt!MmInitializeProcessAddressSpace+0x1cb631 +nt!PspAllocateProcess+0x1d13 +nt!PspCreateProcess+0x242 +nt!NtCreateProcessEx+0x85 +nt!KiSystemServiceCopyEnd+0x25 +ntdll!NtCreateProcessEx+0x14 +``` + +However, the process that is being created here is always `WerFault.exe`, i.e. the crash reporter. The crashing process is seemingly random. Removing `hv-tlbflush` from the command line resolves the problem. Hence, my hypothesis is that due to improper TLB flushing during wakeup, a random application on the core will crash, which spawns `WerFault.exe` which then immediately crashes again inside the kernel (also because of bad/stale TLB contents) and causes the BSOD. 
Perhaps one core wakes up first, requests a TLB flush, which is then *not* propagated to sleeping cores due to hv-tlbflush. Then one of those cores wakes up without the TLB flush? diff --git a/results/classifier/zero-shot/105/semantic/1156313 b/results/classifier/zero-shot/105/semantic/1156313 new file mode 100644 index 000000000..54d2d0f95 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1156313 @@ -0,0 +1,128 @@ +semantic: 0.869 +mistranslation: 0.841 +assembly: 0.789 +device: 0.777 +instruction: 0.707 +other: 0.705 +vnc: 0.702 +graphic: 0.657 +socket: 0.605 +boot: 0.576 +network: 0.533 +KVM: 0.518 + +X86-64 flags handling broken + +The current qemu sources cause improper handling of flags on x86-64. +This bug seems to have shown up a few weeks ago. + +A plain install of Debian GNU/Linux makes user processes catch +spurious signals. The kernel seems to run stably, though. + +The ADX feature works very poorly. It might be related; at least it +allows for reproducibly provoking invalid behaviour. + +Here is a test case: + +================================================================ +qemumain.c +#include <stdio.h> +long adx(); +int +main () +{ + printf ("%lx\n", adx (0xffbeef, 17)); + return 0; +} +================================================================ +qemuadx.s: + .globl adx +adx: xor %rax, %rax +1: dec %rdi + jnz 1b + .byte 0xf3, 0x48, 0x0f, 0x38, 0xf6, 0xc0 # adox %rax, %rax + .byte 0x66, 0x48, 0x0f, 0x38, 0xf6, 0xc0 # adcx %rax, %rax + ret +================================================================ + +Compile and execute: +$ gcc -m64 qemumain.c qemuadx.s +$ a.out +ffffff8000378cd8 + +Expected output is simply "0". The garbage value varies between qemu +compiles and guest systems. + +Note that one needs a recent GNU assembler in order to handle adox and +adcx. For convenience I have supplied them as byte sequences. + +Exaplanation and feeble analysis: + +The 0xffbeef argument is a loop count. 
It is necessary to loop for a +while in order to trigger this bug. If the loop count is decreased, +the bug will seen intermittently; the lower the count, the less +frequent the invalid behaviour. + +It seems like a reasonable assumption that this bug is related to +flags handling at context switch. Presumably, qemu keeps flags state +in some internal format, then recomputes then when needing to form the +eflags register, as needed for example for context switching. + +I haven't tried to reproduce this bug using qemu-x86_64 and SYSROOT, +but I strongly suspect that to be impossible. I use +qemu-system-x86_64 and the guest Debian GNU/Linux x86_64 (version +6.0.6) . + +The bug happens also with the guest FreeBSD x86_64 version 9.1. (The +iteration count for triggering the problem 50% of the runs is not the +same when using the kernel Linux and FreeBSD's kernel, presumably due +to different ticks.) + +The bug happens much more frequently for a loaded system; in fact, the +loop count can be radically decreased if two instances of the trigger +program are run in parallel. + +Richard Henderson <email address hidden> writes: + + Patch at http://patchwork.ozlabs.org/patch/229139/ + +Thanks. I can confirm that this fixes the bug triggered by my test case +(and yours). However, the instability of Debian GNU/Linux x86_64 has +not improved. + +The exact same Debian version (debian "testing") updated at the same +time runs well on hardware. + +My qemu Debian system now got messed up, since I attempted an upgrade in +the buggy qemu, which segfaulted several times during the upgrade. I +need to reinstall, and then rely on -snapshot. + +There is a problem with denorms which is reproducible, but whether that +is a qemu bug, and whether it can actually cause the observed +instability, is questionable. Here is a testcase for that problem: + + + + +It should terminate. The observed buggy behaviour is that it hangs. 
+ +The instability problem can be observed at gmplib.org/devel/tm-date.html. +hwl-deb.gmplib.org is Debian under qemu with -cpu Haswell,+adx. + +Not that the exact same qemu runs FreeBSD flawlessly (hwl.gmplib.org). +It is neither instable nor does it run the denorms testcase poorly. + +I fully realise this is a hopeless bug report, but I am sure you can +reproduce it, since it is far from GMP specific. After all apt-get +update; apt-get upgrade triggered it. Debugging it will be a nightmare. + +Qemu version: main git repo from less than a week ago + Richard ADX +patch. + +-- +Torbjörn + + +It looks from this bug that we fixed the initial ADOX bug in commit c53de1a2896cc (2013), and I've just tried the 'qemu-denorm-problem.s' test case from comment #1 and it works OK, so I think we've fixed that denormals bug too. Given that, and that this bug report is 4 years old, I'm going to close it. If you're still having problems with recent versions of QEMU, please open a new bug. + + diff --git a/results/classifier/zero-shot/105/semantic/1180924 b/results/classifier/zero-shot/105/semantic/1180924 new file mode 100644 index 000000000..192c2b114 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1180924 @@ -0,0 +1,65 @@ +semantic: 0.912 +other: 0.903 +instruction: 0.899 +assembly: 0.890 +mistranslation: 0.890 +graphic: 0.882 +device: 0.867 +KVM: 0.858 +vnc: 0.851 +boot: 0.833 +socket: 0.756 +network: 0.747 + +fails to handle a usb serial port with a specific vendorid + +If I run qemu-system-i386 with arguments +-usb -usbdevice serial:vendorid=1221:pty +(this is what the documentation says about how I shoud add a usb device which has a serial port interface and which has a specific vendor id, I used the documentation located here: +http://qemu.weilnetz.de/qemu-doc.html +), it says +char device redirected to /dev/pts/<something> (label usbserial0) +qemu-system-i386: -usbdevice serial:vendorid=1221:pty: Property '.vendorid' not found +Aborted +and exits. 
Moreover, if I try to add such a device to a running machine by typing usb_add serial:vendorid=1221:pty in the machine's control terminal (to reach it, I press ctrl-alt-2), qemu also writes +char device redirected to /dev/pts/<something> (label usbserial0) +Aborted +to the terminal where I run it from and exits. To the quest OS this looks like a power failure which causes all the programs inside the virtual machine to lose their unsaved data. +I have tested this with qemu-1.5.0-rc2, actually, the issue occured in a similar way since 1.0.1, but did not occur in 0.11.1. +The issue is reproducible always, even if I don't specify any hard disk in the command line, i. e. +$ qemu-system-i386 -usb -usbdevice serial:vendorid=1221:pty +, so I believe it is guest OS -independent. + + Hi, + +>> (this is what the documentation says about how I shoud add a usb device which has a serial port interface and which has a specific vendor id, I used the documentation located here: +>> http://qemu.weilnetz.de/qemu-doc.html +>> ), it says +>> char device redirected to /dev/pts/<something> (label usbserial0) +>> qemu-system-i386: -usbdevice serial:vendorid=1221:pty: Property '.vendorid' not found +>> Aborted +> [...] +> +> Regression; this definitely worked when I wrote docs/qdev-device-use.txt. + +> Not a release blocker, since it regressed a long time ago (v0.12). + +Guess the docs should be updated, unless someone can come up with a +reasonable use case for the vendorid + deviceid properties. + +cheers, + Gerd + + +I think the ability to specify a different vendorid + deviceid can be useful. Suppose there is a USB device such that the specifications are open and officially published, but the driver is proprietary. (As far as I know, this is similar to the situation with ATI video cards, but they are not USB devices.) And I suspect that the driver is buggy (i. e. it does not send the data according to the specifications). 
I want to figure out where exactly it works incorrectly to submit a bug report to the developer of the driver. Or suppose I have a physical device, but it works a bit incorrectly. I want to figure out where exactly the problem is, in the driver or in the device. Since I am not sure that the device is OK, I don't want to write my own driver and interact with the device, maybe I will damage it even more. In both cases, I can emulate the device according to the specifications, install the driver in a guest system, and then see whether the driver sends correct data or where and when exactly the data are incorrect. + +Anyway, I think it is more or less ok if qemu crashes right after it starts due to bad command line parameters (nevertherless, the functionality lost this way could be useful as I explained). But I think IT IS NOT OK WHEN A WORKING VM WITH PROGRAMS INSIDE CRASHES after user enters a bad command in the machine's control terminal, unless the user explicitly requests termination (e. g. enters the q command). + +Regressed in commit f29783f72ea77dfbd7ea0c993d62d253d4c4e023. + +I've just run into this in a similar circumstance: trying to reverse-engineer a driver for a phone to which I can only connect via Bluetooth. No problem, I can just have it pretend to be a USB device. Except that I can't, because the driver won't recognise it. + +The crash has now been fixed here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=aa612b364ecbe1dc +Please also note that the "-usbdevice serial" syntax is considered as deprecated nowadays - use "-device usb-serial" instead. 
+ diff --git a/results/classifier/zero-shot/105/semantic/1212 b/results/classifier/zero-shot/105/semantic/1212 new file mode 100644 index 000000000..47e920224 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1212 @@ -0,0 +1,22 @@ +semantic: 0.916 +device: 0.885 +instruction: 0.875 +graphic: 0.865 +network: 0.785 +vnc: 0.763 +socket: 0.711 +boot: 0.532 +assembly: 0.180 +mistranslation: 0.078 +other: 0.075 +KVM: 0.018 + +A NULL pointer dereference issue in elf2dmp +Description of problem: +SIGSEGV in get_pml4e for it didn't handle NULL result properly. +Steps to reproduce: +1.launch qemu and running "gab attach -p $QEMU_PID", run "gcore" inside gdb to generate coredump +2../elf2dmp ./core.111 ./out.dmp +3.get segemantation fault +Additional information: + diff --git a/results/classifier/zero-shot/105/semantic/1223467 b/results/classifier/zero-shot/105/semantic/1223467 new file mode 100644 index 000000000..7b0111e4c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1223467 @@ -0,0 +1,60 @@ +semantic: 0.445 +instruction: 0.361 +mistranslation: 0.350 +device: 0.266 +other: 0.228 +socket: 0.216 +graphic: 0.201 +boot: 0.145 +network: 0.138 +vnc: 0.106 +assembly: 0.092 +KVM: 0.065 + +Unable to use USB as hda in Windows + +I built qemu 1.6.0 from source in MinGW (and all dependents not available with mingw-get) +The command line: +qemu-system-i386.exe -m 1024 -hda \\.\PhysicalDrive1 -L pc-bios +or +qemu-system-x86_64.exe -m 1024 -hda \\.\PhysicalDrive1 -L pc-bios +(or the *w.exe equivalents) +reports in stderr.txt: +qemu-system-i386.exe: -hda \\.\PhysicalDrive1: Block protocol 'host_device' doesn't support the option 'filename' +qemu-system-i386.exe: -hda \\.\PhysicalDrive1: could not open disk image \\.\PhysicalDrive1: Invalid argument + +I have also found this bug in 1.5 but not in 1.4 + +Some Help: +The code in Qemu is a bit beyond me at 1am, but I was able to determine the root cause seems to be that block.c is becoming confused about referring 
to a file but not having a file name. I have been able to work around this by changing line 860 of block.c from: "if (qdict_size(options) != 0) {" to "if (qdict_size(options) != 0 && !is_windows_drive(filename)) {" + +But I don't think this is a good solution (it is assuming that nothing else could be wrong), and I can't be sure that I'm not masking some real issue. + +FWIW; Build is on XP, but execution is on Win7. + +Thanks. + +I can confirm the same bug. I am not building from source, but rather using the unofficial Windows binaries linked to by Qemu. + +http://wiki.qemu.org/Links + +I'm running as Administrator on Win8.1 x86_64 + +qemu-system-i386.exe -L . -hda \\.\PhysicalDrive3 + +qemu-system-i386.exe: -hda \\.\PhysicalDrive3: Block protocol 'host_device' doesn't support the option 'filename' +qemu-system-i386.exe: -hda \\.\PhysicalDrive3: could not open disk image \\.\PhysicalDrive3: Invalid argument + +I see this error in the 1.51, 1.53, and 1.60 builds from a couple different sources + +PhysicalDrive3 is a USB device that's visible in the Windows Disk Management Snap-In. I see the activity light on the drive blink when running this command. + +I've found some older Qemu binaries from various random sources https://code.google.com/p/kqemu-portable-win/ and they seem to be able to access physical devices without issue, though they also seem to have other problems of their own... + +I think this has been fixed by commit 68dc0364, which was included in +qemu 1.7.0 and also backported to 1.6.1. Can you please try upgrading and +confirm whether it fixes the problem? + + +I found some newer Windows binaries at http://qemu.weilnetz.de/ and can confirm I do not see the issue any more. 
+ diff --git a/results/classifier/zero-shot/105/semantic/12360755 b/results/classifier/zero-shot/105/semantic/12360755 new file mode 100644 index 000000000..3e04cfd8d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/12360755 @@ -0,0 +1,304 @@ +semantic: 0.911 +device: 0.902 +graphic: 0.899 +instruction: 0.894 +assembly: 0.893 +other: 0.886 +mistranslation: 0.844 +boot: 0.818 +vnc: 0.810 +socket: 0.805 +KVM: 0.770 +network: 0.738 + +[Qemu-devel] [BUG] virtio-net linux driver fails to probe on MIPS Malta since 'hw/virtio-pci: fix virtio behaviour' + +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? + +Cheers +James +signature.asc +Description: +Digital signature + +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". 
+(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +Cheers +James + +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +> +On 03/17/2017 11:57 PM, James Hogan wrote: +> +> Hi, +> +> +> +> I've bisected the following failure of the virtio_net linux v4.10 driver +> +> to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: +> +> +> +> virtio_net virtio0: virtio: device uses modern interface but does not have +> +> VIRTIO_F_VERSION_1 +> +> virtio_net: probe of virtio0 failed with error -22 +> +> +> +> To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). +> +> +> +> It appears that adding ",disable-modern=on,disable-legacy=off" to the +> +> virtio-net -device makes it work again. +> +> +> +> I presume this should really just work out of the box. Any ideas why it +> +> isn't? +> +> +> +> +Hi, +> +> +> +This is strange. This commit changes virtio devices from legacy to virtio +> +"transitional". +> +(your command line changes it to legacy) +> +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +> +side +> +there is nothing new. +> +> +Michael, do you have any idea? +> +> +Thanks, +> +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. + +> +> Cheers +> +> James +> +> + +On 03/20/2017 05:43 PM, Michael S. 
Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Sure, + +Thanks, +Marcel +Cheers +James + +On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. 
+ +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Hi James, + +Can you please check if the below patch fixes the problem? +Please note it is not a solution. + +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +index f9b7244..5b4d429 100644 +--- a/hw/virtio/virtio-pci.c ++++ b/hw/virtio/virtio-pci.c +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +Error **errp) + } + + pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +- PCI_BASE_ADDRESS_SPACE_MEMORY | +- PCI_BASE_ADDRESS_MEM_PREFETCH | +- PCI_BASE_ADDRESS_MEM_TYPE_64, ++ PCI_BASE_ADDRESS_SPACE_MEMORY, + &proxy->modern_bar); + + proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); + + +Thanks, +Marcel + +Hi Marcel, + +On Tue, Mar 21, 2017 at 04:16:58PM +0200, Marcel Apfelbaum wrote: +> +Can you please check if the below patch fixes the problem? +> +Please note it is not a solution. 
+> +> +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +> +index f9b7244..5b4d429 100644 +> +--- a/hw/virtio/virtio-pci.c +> ++++ b/hw/virtio/virtio-pci.c +> +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +> +Error **errp) +> +} +> +> +pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +> +- PCI_BASE_ADDRESS_SPACE_MEMORY | +> +- PCI_BASE_ADDRESS_MEM_PREFETCH | +> +- PCI_BASE_ADDRESS_MEM_TYPE_64, +> ++ PCI_BASE_ADDRESS_SPACE_MEMORY, +> +&proxy->modern_bar); +> +> +proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); +Sorry for the delay trying this, I was away last week. + +No, it doesn't seem to make any difference. + +Thanks +James +signature.asc +Description: +Digital signature + diff --git a/results/classifier/zero-shot/105/semantic/124 b/results/classifier/zero-shot/105/semantic/124 new file mode 100644 index 000000000..f265deea2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/124 @@ -0,0 +1,14 @@ +semantic: 0.952 +device: 0.844 +instruction: 0.817 +network: 0.595 +graphic: 0.461 +boot: 0.240 +assembly: 0.173 +mistranslation: 0.172 +vnc: 0.123 +other: 0.085 +socket: 0.059 +KVM: 0.053 + +SIGSEGV when reading ARM GIC registers through GDB stub diff --git a/results/classifier/zero-shot/105/semantic/1242765 b/results/classifier/zero-shot/105/semantic/1242765 new file mode 100644 index 000000000..ebe8e700a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1242765 @@ -0,0 +1,89 @@ +semantic: 0.378 +graphic: 0.294 +other: 0.273 +mistranslation: 0.233 +instruction: 0.208 +assembly: 0.174 +device: 0.169 +KVM: 0.169 +network: 0.161 +vnc: 0.143 +socket: 0.100 +boot: 0.083 + +USB passthrough to Windows 7 guest fails with error -110, hangs + +Description of problem: + +Using a Sandisk Cruzer Fit 16GB USB thumb drive. +Using virt-manager on Fedora 19 host, and Windows 7 32 bit guest. + +I set up a USB2 controller on Windows 7 guest in virt-manager. 
Windows sees the USB drive and can open the file manager and correctly show the files. I can copy a file from the thumb drive to the Fedora desktop, and then play the file on the desktop. However, any attempt to open a file directly on the thumb drive (example, play an MP3 using Windows Media Player) results in guest hang and host kernel messages: + + +Oct 19 21:15:35 localhost kernel: [187592.977839] usb 1-1.3: reset high-speed USB device number 13 using ehci-pci +Oct 19 21:15:40 localhost kernel: [187598.065274] usb 1-1.3: device descriptor read/all, error -110 +Oct 19 21:15:40 localhost kernel: [187598.138167] usb 1-1.3: reset high-speed USB device number 13 using ehci-pci +Oct 19 21:15:56 localhost kernel: [187613.218119] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:16:11 localhost kernel: [187628.399275] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:16:11 localhost kernel: [187628.573355] usb 1-1.3: reset high-speed USB device number 13 using ehci-pci +Oct 19 21:16:16 localhost kernel: [187633.587778] usb 1-1.3: device descriptor read/8, error -110 +Oct 19 21:16:21 localhost kernel: [187638.702244] usb 1-1.3: device descriptor read/8, error -110 +Oct 19 21:16:21 localhost kernel: [187638.876201] usb 1-1.3: reset high-speed USB device number 13 using ehci-pci +Oct 19 21:16:26 localhost kernel: [187643.890642] usb 1-1.3: device descriptor read/8, error -110 +Oct 19 21:16:31 localhost kernel: [187649.005071] usb 1-1.3: device descriptor read/8, error -110 +Oct 19 21:16:31 localhost kernel: [187649.106188] usb 1-1.3: USB disconnect, device number 13 +Oct 19 21:16:31 localhost kernel: [187649.178969] usb 1-1.3: new high-speed USB device number 14 using ehci-pci +Oct 19 21:16:47 localhost kernel: [187664.258945] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:17:02 localhost kernel: [187679.440092] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:17:02 localhost kernel: [187679.614194] usb 1-1.3: new high-speed USB 
device number 15 using ehci-pci +Oct 19 21:17:17 localhost kernel: [187694.694148] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:17:32 localhost kernel: [187709.875297] usb 1-1.3: device descriptor read/64, error -110 +Oct 19 21:17:32 localhost kernel: [187710.049386] usb 1-1.3: new high-speed USB device number 16 using ehci-pci +Oct 19 21:17:37 localhost kernel: [187715.063803] usb 1-1.3: device descriptor read/8, error -110 +Oct 19 21:17:41 localhost kernel: [187719.005453] usb 1-1.3: device descriptor read/8, error -71 + +After that -71 error, the thumb drive completely disappears from the host, as if it is powered down. + +I read that -110 is supposedly a power issue. I can play media files directly from the thumb drive on the host, so the power seems fine on the host. + + +How reproducible: +always + + +Steps to reproduce: +1. use virt-manager, create a Windows 7 32 bit guest +2. in virt-manager, set Controller USB to USB2 +3. on host, insert Sandisk Cruser Fit thumb drive FAT32 format, with an MP3 file on it +4. in virt-manager, add a USB passthrough device and assign it to thumb drive +5. boot Windows 7 guest +6. verify that Windows 7 can see the thumb drive +7. use Windows Media Player to play MP3 + +Actual results: +guest hangs, then host powers off thumb drive + +Expected results: +The MP3 file should play :) + + +Additional info: + +Fedora 19 + +Installed Packages +qemu-common.x86_64 2:1.4.2-11.fc19 @updates +qemu-guest-agent.x86_64 2:1.4.2-11.fc19 @updates +qemu-img.x86_64 2:1.4.2-11.fc19 @updates +qemu-kvm.x86_64 2:1.4.2-11.fc19 @updates +qemu-system-x86.x86_64 2:1.4.2-11.fc19 @updates +virt-manager.noarch 0.10.0-3.fc19 @updates +kernel.x86_64 3.11.1-200.fc19 @updates + +Can you still reproduce this problem with the latest version of QEMU, or could we close this ticket nowadays? + +You may close. It's since worked fine for me. 
+ +Also the ticket is years old :D + diff --git a/results/classifier/zero-shot/105/semantic/1285505 b/results/classifier/zero-shot/105/semantic/1285505 new file mode 100644 index 000000000..2f5400943 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1285505 @@ -0,0 +1,115 @@ +semantic: 0.816 +instruction: 0.812 +device: 0.755 +assembly: 0.754 +graphic: 0.736 +other: 0.718 +mistranslation: 0.713 +socket: 0.701 +vnc: 0.691 +KVM: 0.624 +network: 0.577 +boot: 0.510 + +[ppa 2.0~git-20140225] SIGABRT with -virtfs + +As requested on u-devel@, I tested QEMU 2.0~git-20140225.aa0d1f4-0ubuntu2 from ppa:ubuntu-virt/candidate. This has a regression with virtfs. + +I created a simple cloud-image based VM with autopkgtest: + + $ sudo apt-get install autopkgtest + $ adt-buildvm-ubuntu-cloud + $ mkdir -p /tmp/shared + $ qemu-system-x86_64 -enable-kvm -m 1024 -nographic -virtfs local,id=autopkgtest,path=/tmp/shared,security_model=none,mount_tag=myshare -snapshot adt-trusty-amd64-cloud.img +** +ERROR:/build/buildd/qemu-2.0~git-20140225.aa0d1f4/qom/object.c:331:object_initialize_with_type: assertion failed: (type != NULL) +Abgebrochen (Speicherabzug geschrieben) + +It should be easy enough to reproduce and the assertion message already should be clear, but this is the intersting part of the (unretraced) stack trace: + + #2 0x00007fe3329a9195 in g_assertion_message (domain=domain@entry=0x0, file=file@entry=0x7fe334c5a8e8 "/build/buildd/qemu-2.0~git-20140225.aa0d1f4/qom/object.c", line=line@entry=331, func=func@entry=0x7fe334c5ac40 "object_initialize_with_type", message=message@entry=0x7fe336e06e40 "assertion failed: (type != NULL)") at /build/buildd/glib2.0-2.39.90/./glib/gtestutils.c:2291 + lstr = "331\000\377\177\000\000\320ޠF\377\177\000\000\220\026\321\066\343\177\000\000R\252\305\064\343\177\000" + s = 0x7fe336e06e70 "pp\340\066\343\177" + #3 0x00007fe3329a922a in g_assertion_message_expr (domain=0x0, file=0x7fe334c5a8e8 
"/build/buildd/qemu-2.0~git-20140225.aa0d1f4/qom/object.c", line=331, func=0x7fe334c5ac40 "object_initialize_with_type", expr=<optimized out>) at /build/buildd/glib2.0-2.39.90/./glib/gtestutils.c:2306 + s = 0x7fe336e06e40 "assertion failed: (type != NULL)" + +Thanks, trivially reproduced. Will bisect. + +Actually, the interesting bit of the stack trace starts just below where you cut it off, because object_initialize_with_type() is just asserting that it wasn't called with a NULL pointer, so what we really want to know is what the caller was... + +Hm, sadly bisect gives me: + +ubuntu@c-trusty-0:~/qemu$ git bisect good +ba1183da9a10b94611cad88c44a5c6df005f9b55 is the first bad commit +commit ba1183da9a10b94611cad88c44a5c6df005f9b55 +Author: Fam Zheng <email address hidden> +Date: Mon Feb 10 14:48:52 2014 +0800 + + rules.mak: fix $(obj) to a real relative path + + Makefile.target includes rule.mak and unnested common-obj-y, then prefix + them with '../', this will ignore object specific QEMU_CFLAGS in subdir + Makefile.objs: + + $(obj)/curl.o: QEMU_CFLAGS += $(CURL_CFLAGS) + + Because $(obj) here is './block', instead of '../block'. This doesn't + hurt compiling because we basically build all .o from top Makefile, + before entering Makefile.target, but it will affact arriving per-object + libs support. + + The starting point of $(obj) is passed in as argument of unnest-vars, as + well as nested variables, so that different Makefiles can pass in a + right value. 
+ + Signed-off-by: Fam Zheng <email address hidden> + Signed-off-by: Paolo Bonzini <email address hidden> + +:100644 100644 807054b3a1797c911c81c58ae04c15cc599551f6 52b1958b4b43e9e90c68a6339b818f7893ba2551 M Makefile +:100644 100644 ac1d0e1c285073e1dc881061a3395189074289d0 1914080134d2559c63e76fe0650c9b86e57d3cd8 M Makefile.objs +:100644 100644 af6ac7eaa19f922cf5d006ee7eebd8ef2dfde3d4 9a6e7dd1b85e75baed2580d51456e614a0ba096f M Makefile.target +:100755 100755 4648117957465e554bd0005dc51d7e1750b97526 66b1d30c99a83a8856ecfa72bceb55a17928c70c M configure +:100644 100644 391d6eb8e612e5f7361249fe68040f50eb1c7bcc a95fb76626d5e30b5ac1b4ef5528ba5383f3ccd0 M rules.mak + + +gdb stack dump: + +Program received signal SIGABRT, Aborted. +0x00007ffff2849f79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 +56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. +(gdb) where +#0 0x00007ffff2849f79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 +#1 0x00007ffff284d388 in __GI_abort () at abort.c:89 +#2 0x00007ffff793c195 in g_assertion_message () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#3 0x00007ffff793c22a in g_assertion_message_expr () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#4 0x0000555555783d09 in object_initialize_with_type (data=data@entry=0x7fffe27fd910, + size=size@entry=6300032, type=0x0) at qom/object.c:331 +#5 0x0000555555783d78 in object_initialize (data=data@entry=0x7fffe27fd910, size=size@entry=6300032, + typename=typename@entry=0x555555914f28 "virtio-9p-device") at qom/object.c:350 +#6 0x0000555555735153 in virtio_9p_pci_instance_init (obj=0x7fffe27fd010) at hw/virtio/virtio-pci.c:943 +#7 0x0000555555783c23 in object_initialize_with_type (data=data@entry=0x7fffe27fd010, + size=<optimized out>, type=type@entry=0x5555561c0580) at qom/object.c:342 +#8 0x0000555555783dc0 in object_new_with_type (type=0x5555561c0580) at qom/object.c:441 +#9 0x0000555555783e55 in object_new 
(typename=typename@entry=0x5555561c79b0 "virtio-9p-pci") + at qom/object.c:451 +#10 0x0000555555768d2d in qdev_device_add (opts=0x5555561c7900) at qdev-monitor.c:526 +#11 0x00005555557b8289 in device_init_func (opts=<optimized out>, opaque=<optimized out>) at vl.c:2259 +#12 0x00005555558da5fb in qemu_opts_foreach (list=<optimized out>, + func=0x5555557b8270 <device_init_func>, opaque=0x0, abort_on_failure=<optimized out>) + at util/qemu-option.c:1149 +#13 0x00005555555e9743 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) + at vl.c:4249 + + +type passed to type_get_by_name() was "virtio-9p-device". It returned NULL. + +Looking at the compile output, in fact ./hw/9pfs/virtio-9p-device.o did not get compiled after commit ba1183da9a10b94611cad88c44a5c6df005f9b55. + +@pitti, + +the version now in ppa should fix your virtfs problem. + +Confirmed, this works now. Thanks! Closing, as this only affected the PPA. + +If I get the last two comments right, the problem was only about the Ubuntu PPA package, so I'm closing this for upstream QEMU, too. If you still have problems with upstream QEMU here, please feel free to open the ticket again. + diff --git a/results/classifier/zero-shot/105/semantic/1288385 b/results/classifier/zero-shot/105/semantic/1288385 new file mode 100644 index 000000000..48e3f4ec7 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1288385 @@ -0,0 +1,103 @@ +semantic: 0.923 +other: 0.920 +graphic: 0.920 +vnc: 0.911 +KVM: 0.900 +assembly: 0.889 +instruction: 0.884 +mistranslation: 0.857 +boot: 0.849 +device: 0.849 +network: 0.787 +socket: 0.734 + +VFIO passthrough causes assertation failure + +Since commit 5e95494380ec I am no longer able to passthrough my Nvidia GTX 770 using VFIO. Qemu terminates with: + +qemu-system-x86_64: hw/pci/pcie.c:240: pcie_cap_slot_hotplug_common: Assertion `((pci_dev->devfn) & 0x07) == 0' failed. + +Above output was generated using commit f55ea6297cc0. 
+ + +Lspci of the vga card: + +01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1) + Subsystem: Gigabyte Technology Co., Ltd Device 360c + Kernel driver in use: vfio-pci +01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1) + Subsystem: Gigabyte Technology Co., Ltd Device 360c + Kernel driver in use: vfio-pci + + +Commandline used to start qemu: + +qemu-system-x86_64 -machine accel=kvm \ + -nodefaults \ + -name VFIO-Test \ + -machine q35 \ + -cpu host \ + -smp 1 \ + -enable-kvm \ + -m 1024 \ + -k de \ + -vga none \ + -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \ + -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \ + -device vfio-pci,host=01:00.1,bus=root.1,addr=00.1 \ + -rtc base=utc \ + -boot order=d \ + -device ide-cd,drive=drive-cd-disk1,id=cd-disk1,unit=0 \ + -drive file=/home/bluebird/Downloads/systemrescuecd-x86-4.0.0.iso,if=none,id=drive-cd-disk1,media=cdrom \ + -nographic + + +Full output of git bisect: + +5e95494380ecf83c97d28f72134ab45e0cace8f9 is the first bad commit +commit 5e95494380ecf83c97d28f72134ab45e0cace8f9 +Author: Igor Mammedov <email address hidden> +Date: Wed Feb 5 16:36:52 2014 +0100 + + hw/pci: switch to a generic hotplug handling for PCIDevice + + make qdev_unplug()/device_set_realized() to call hotplug handler's + plug/unplug methods if available and remove not needed anymore + hot(un)plug handling from PCIDevice. + + In case if hotplug handler is not available, revert to the legacy + hotplug method for compatibility with not yet converted buses. + + Signed-off-by: Igor Mammedov <email address hidden> + Reviewed-by: Michael S. Tsirkin <email address hidden> + Signed-off-by: Michael S. 
Tsirkin <email address hidden> + +:040000 040000 9bdab0d75fbc9be4fe2e4274e58e0cdcd347ac7e d6d6294ea9c06e80a0fc8fcabd6345dfae5137ad M hw +:040000 040000 d064d75ca8b8f169c41eee2683082e8f9104e968 f2abbf9bee754ada0f49135968455fd1a69b2186 M include +:040000 040000 c515daff6c77f9bd2cc32873be4c5c3a1c20cbb9 c506f5587afe8f7ee129a7ca6e3ae2e5118254f9 M tests + +commit 6e1f0a55a14bad1d0c8b9d29626ef4e4b2617c74 +Author: Igor Mammedov <email address hidden> +Date: Mon Feb 17 15:00:06 2014 +0100 + + PCIE: fix regression with coldplugged multifunction device + + PCIE is causing asserts each time a multifunction device is added + on command line (coldplug). + + This is caused by + commit a66e657e18cd9b70e9f57ae5512c07faf2bc508f + pci/pcie: convert PCIE hotplug to use hotplug-handler API + QEMU abort is caused by misplaced assertion, which should + be checked only when device is hotplugged. + + Reference to regression report: + http://<email address hidden>/msg216226.html + + Fixes: a66e657e18cd9b70e9f57ae5512c07faf2bc508f + + Reported-By: Nigel Kukard <email address hidden> + Signed-off-by: Igor Mammedov <email address hidden> + Reviewed-by: Michael S. Tsirkin <email address hidden> + Signed-off-by: Michael S. 
Tsirkin <email address hidden> + diff --git a/results/classifier/zero-shot/105/semantic/1288620 b/results/classifier/zero-shot/105/semantic/1288620 new file mode 100644 index 000000000..21892ca0b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1288620 @@ -0,0 +1,179 @@ +semantic: 0.642 +other: 0.641 +mistranslation: 0.536 +instruction: 0.534 +graphic: 0.530 +network: 0.527 +vnc: 0.520 +assembly: 0.466 +device: 0.446 +KVM: 0.435 +socket: 0.353 +boot: 0.332 + +memory leak with config file + +I have a Windows 7 SP1 Professional 64-bit installation on a QCOW2 image with compat=1.1, which I launch via + +qemu-system-x86_64 -drive file=windows_base_HDD.img,index=0,media=disk -enable-kvm -m 512M -vga std -net nic,vlan=0 -net user,vlan=0 + +As soon as I start using the network in any application — for example, visiting www.google.com in Internet Explorer — QEMU starts gobbling memory until the (host) kernel kills it because of an OOM condition. If I run the QEMU with the same options, but with model=e1000 option set for the NIC (i.e. -net -nic,vlan=0,model=e1000), I can use the network from the guest OS without any noticeable effect on QEMU's memory consumption. + +I do not have this problem when running QEMU with the exact same options (as above, without model=e1000) but with a Debian wheezy installation (on a QCOW image of the same format). 
My host system in Ubuntu 13.10 x86_64, kernel image 3.11.0-17-generic, but with the QEMU packages from trusty (the codename for the next release): +Output of `dpkg -l \*qemu\* | grep '^ii'`: +ii ipxe-qemu 1.0.0+git-20130710.936134e-0ubuntu1 all Virtual package to support use of kvm-ipxe with qemu +ii qemu-keymaps 1.7.0+dfsg-3ubuntu2 all QEMU keyboard maps +ii qemu-system-common 1.7.0+dfsg-3ubuntu2 amd64 QEMU full system emulation binaries (common files) +ii qemu-system-x86 1.7.0+dfsg-3ubuntu2 amd64 QEMU full system emulation binaries (x86) +ii qemu-utils 1.7.0+dfsg-3ubuntu2 amd64 QEMU utilities + +(If necessary, I can try to reproduce this with QEMU built from the upstream source or the latest source from version control.) + +Please try to reproduce this with a debug built (configure --enable-debug) from latest QEMU. If it shows the same memory leak, you can try to find the cause of this leak with valgrind: + +valgrind --leak-check=full qemu-system-x86_64 -drive file=windows_base_HDD.img,index=0,media=disk -enable-kvm -m 512M -vga std -net nic,vlan=0 -net user,vlan=0 2>&1 | tee valgrind.log + +As soon as you terminate the running qemu process, valgrind will write a protocol of all allocated memory. Leaks can usually be found quite easily in this protocol: after a long run, the leak will dominate any other memory allocation. If your executable still contains the full debug information, you will also see the exact line of source code which allocates repeatedly memory. + +Please note that valgrind slows down your process by a factor of 10 to 20, so it takes some time to run Windows. + +If valgrind does not work (which sometimes happens), you can attach a debugger (gdb) to the running qemu process and try to detect the buggy memory allocation by setting breakpoints on the memory allocator functions (malloc or g_malloc, g_malloc0, g_new, g_new0). + + +I can not reproduce this even with the Ubuntu package after rebooting after a kernel update. 
I will try again with the previous kernel image to confirm this is the relevant variable. Is it possible that the behaviour I described in the initial report is/was caused by code in the KVM module? + +Even after rebooting with the kernel I was using when I had the problem behaviour in QEMU, I can not reproduce the issue. It certainly was not a one-off, because QEMU was gobbling memory consistently on my system, in consecutive sessions. + +I have been able to consistently reproduce the bug again, and have run QEMU with Valgrind until OOM. It is unrelated to networking; it is caused by loading a config file. + +I ran QEMU from Git commit 7f6613cedc59fa849105668ae971dc31004bca1c under valgrind via... + +valgrind qemu-system-x86_64 -readconfig windows8_throwaway_VM.conf -m 1G -vga std 2>&1 | tee valgrind.log + +...where the contents of windows8_throwaway_VM.conf is... + +[drive] + file = "windows8_throwaway_HDD.img" + index = "0" + media = "disk" + if = "virtio" + +[net] + type = "nic" + vlan = "0" + model = "virtio" + +[net] + type = "user" + vlan = "0" + +[rtc] + base = "localtime" + +[machine] + accel = "kvm" + +(I will attach the file in a separate comment, because launchpad appears to only allow at most one attachment per comment.) + +It does not seem to matter whether VirtIO is used, as I have had this problem when not using any VirtIO devices, but the Windows guest I had on-hand was already using it. + +If I invoke QEMU with the equivalent settings all via the command line, it does not gobble memory (again, regardless of VirtIO). + +qemu-system-x86_64 -drive file=windows8_throwaway_HDD.img,index=0,media=disk,if=virtio -enable-kvm -m 1G -vga std -net nic,vlan=0,model=virtio -net user,vlan=0 -rtc localtime + +Attaching config file mentioned in previous comment. + +On Sat, Mar 29, 2014 at 03:02:23AM -0000, Aidan Gauland wrote: +> I have been able to consistently reproduce the bug again, and have run +> QEMU with Valgrind until OOM. 
It is unrelated to networking; it is +> caused by loading a config file. +> +> I ran QEMU from Git commit 7f6613cedc59fa849105668ae971dc31004bca1c +> under valgrind via... +> +> valgrind qemu-system-x86_64 -readconfig windows8_throwaway_VM.conf -m 1G +> -vga std 2>&1 | tee valgrind.log +> +> ...where the contents of windows8_throwaway_VM.conf is... +> +> [drive] +> file = "windows8_throwaway_HDD.img" +> index = "0" +> media = "disk" +> if = "virtio" +> +> [net] +> type = "nic" +> vlan = "0" +> model = "virtio" +> +> [net] +> type = "user" +> vlan = "0" +> +> [rtc] +> base = "localtime" +> +> [machine] +> accel = "kvm" +> +> (I will attach the file in a separate comment, because launchpad appears +> to only allow at most one attachment per comment.) +> +> It does not seem to matter whether VirtIO is used, as I have had this +> problem when not using any VirtIO devices, but the Windows guest I had +> on-hand was already using it. +> +> If I invoke QEMU with the equivalent settings all via the command line, +> it does not gobble memory (again, regardless of VirtIO). +> +> qemu-system-x86_64 -drive +> file=windows8_throwaway_HDD.img,index=0,media=disk,if=virtio -enable-kvm +> -m 1G -vga std -net nic,vlan=0,model=virtio -net user,vlan=0 -rtc +> localtime + +So this is a problem that only happens under Valgrind? Perhaps this is +a valgrind bug. + +Stefan + + +On Wed, 23 Apr 2014 13:10:39 -0000, Stefan Hajnoczi wrote: +> So this is a problem that only happens under Valgrind? Perhaps this +> is +> a valgrind bug. + +No, it happens outside of Valgrind as well. It only happens when QEMU +is told to read a config file (with -readconfig). + + + +On Wed, Apr 23, 2014 at 08:18:21PM -0000, Aidan Gauland wrote: +> On Wed, 23 Apr 2014 13:10:39 -0000, Stefan Hajnoczi wrote: +> > So this is a problem that only happens under Valgrind? Perhaps this +> > is +> > a valgrind bug. +> +> No, it happens outside of Valgrind as well. 
It only happens when QEMU +> is told to read a config file (with -readconfig). + +Weird, I tried yesterday and couldn't reproduce it against +qemu.git/master (2d03b49c3f225994c4b0b46146437d8c887d6774) with your +config file. + +I wonder if your guest is repeatedly doing something that causes QEMU to +leak memory. My guest was Red Hat Enterprise Linux 6.4. + +Does it happen if you provide a non-bootable disk image so the guest is +stuck at the BIOS screen? Use dd if=/dev/zero of=test.img bs=1M +count=1024 to create an empty 1 GB raw file. + +Stefan + + +It does seem to be related to the guest, because with a dummy (non-bootable, garbage data) disk image, the rapid memory leak does not occur. + +Can you still reproduce this issue with the latest version of QEMU (currently version 2.9)? + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1299190 b/results/classifier/zero-shot/105/semantic/1299190 new file mode 100644 index 000000000..b79560b1a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1299190 @@ -0,0 +1,76 @@ +semantic: 0.700 +other: 0.582 +mistranslation: 0.574 +instruction: 0.522 +device: 0.396 +graphic: 0.335 +network: 0.333 +boot: 0.296 +vnc: 0.276 +socket: 0.269 +assembly: 0.224 +KVM: 0.173 + +Access to /proc/self/exe in linux-user mode + +This is based on a recent bug in GCC Bugzilla: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60681 + +It looks like libbacktrace (GCC runtime library used for obtaining stack traces) uses /proc/self/exe for error reporting. Currently this is mapped to qemu-arm which effectively disables libbacktrace on linux-user. + +It seems that QEMU already supports /proc/self/{maps,stat,auxv} so addition of /proc/self/exe may be trivial. + +This tiny patch seems to work. + +I think the problem is not in libbacktrace per se but rather libsanitizer initializing libbacktrace with contents of /proc/self/exe. Patch is still relevant though. 
+ +Looks good, I'll get this to linux-user que once QEMU 2.0 is released. + +That patch will copy the whole of the target executable into a temporary file without changing any of it -- the fake_open mechanism is really intended for cases where we need to return modified results. Wouldn't it be easier to just have something in do_open() that said: + if (is_proc_myself(pathname, "exe")) { + return get_errno(open(exec_path), flags, mode); + } + +That will then give the right behaviour for read-only executables and other error-related corner cases. + +(See also the logic in the readlink/readlinkat handling which already specialcases /proc/self/exe using exec_path.) + + +(I got the bracket placement wrong there so as you can tell the code is untested :-)) + + +Yes, it works. Here is updated patch. + +Some nits: + The "(CPUArchState *)" cast isn't necessary + We should use exec_path, not ts->bprm->argv[0] (the guest argv[0] isn't necessarily the executable path) + We don't want to call path() here -- exec_path is a host path, and only guest filename paths need to go through path(). + +Looking a little more closely at the logic in main.c I wonder if we actually want: + + if (is_proc_myself(pathname, "exe")) { + execfd = qemu_getauxval(AT_EXECFD); + if (execfd) { + return execfd; + } + return get_errno(open(exec_path, flags, mode)); + } + +Also if you'd like us to apply your patches we'll need at least a "Signed-off-by: " line from you. + + +Ok, fixed. + +Thanks. That version +Reviewed-by: Peter Maydell <email address hidden> + + +Hi, + +Is this patch deployed in new version of QEMU? + +Thanks, +Maxim + +This bug was fixed by commit aa07f5ecf9828 in 2014 and has been released in QEMU. 
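The special-casing discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not QEMU's C implementation: the function names mirror the `is_proc_myself` and `exec_path` identifiers mentioned above, but `host_path_for_open` and the `pid` parameter are hypothetical helpers introduced for this sketch, and the `AT_EXECFD` shortcut is left out.

```python
import os


def is_proc_myself(pathname, entry, pid=None):
    """Return True if pathname names /proc/self/<entry> or /proc/<own pid>/<entry>.

    Hypothetical re-creation of the check discussed in the thread; the pid
    argument exists only so the behaviour can be exercised deterministically.
    """
    pid = os.getpid() if pid is None else pid
    candidates = ("/proc/self/" + entry, "/proc/%d/%s" % (pid, entry))
    return pathname in candidates


def host_path_for_open(pathname, exec_path, pid=None):
    """Pick the path an emulator would actually open on the host.

    For the guest's own "exe" entry, redirect to the guest executable
    (exec_path) instead of the emulator binary; leave everything else alone.
    """
    if is_proc_myself(pathname, "exe", pid):
        return exec_path
    return pathname
```

With this sketch, a guest opening `/proc/self/exe` is redirected to the guest binary, while other `/proc/self` entries and other processes' entries pass through unchanged.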
+ + diff --git a/results/classifier/zero-shot/105/semantic/1299858 b/results/classifier/zero-shot/105/semantic/1299858 new file mode 100644 index 000000000..40345388f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1299858 @@ -0,0 +1,43 @@ +semantic: 0.749 +graphic: 0.746 +instruction: 0.670 +other: 0.607 +device: 0.502 +socket: 0.405 +network: 0.234 +mistranslation: 0.232 +boot: 0.215 +vnc: 0.191 +assembly: 0.142 +KVM: 0.103 + +qemu all apps crash on OS X 10.6.8 + +qemu-2.0.0-rc0 (and 1.7.1) crashes with SIGABRT in all apps when configured with --with-coroutine=sigaltstack (which is what configure selects by default) but all run fine if configured with --with-coroutine=gthread. + +Crash is at line 253 (last line of Coroutine *qemu_coroutine_new(void)) in coroutine-sigaltstack.c in 2.0.0-rc0 tarball. + +Platform is OS X 10.6.8 (Darwin Kernel Version 10.8.0), compiler gcc 4.2.1 + +Sorry for the sparse report but I'm short on time today. + +My test system is OS X 10.8.5 built with clang "Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)", and QEMU works fine there, which suggests a problem either with that version of GCC or that version of MacOSX. + +You might try building with clang rather than gcc; otherwise since I don't have a system to reproduce on (or indeed much interest in tracking down bugs in old versions of MacOS, to be honest) I'm afraid you'll have to investigate this bug yourself if you want a fix for it. + + +I'm not personally worried about a fix for this; I reported it primarily for the benefit of others/the quality of the codebase as a whole. As I said, I got it working with gthreads as the coroutine provider so it's working for my needs. + +Although this seems on the surface to be a problem with the specific platform versions involved, it's always possible that this sheds light on something that is either an undiscovered problem on more recent platform versions or will become a problem.
+ +It's notable that the version of xcode (and hence gcc) involved is the last from Apple with PPC support. It's precisely why I'm using it and it's precisely why someone who's targeting multiple platforms might be using it and qemu in concert. + +It's possible that a fix might be to get configure to select gthreads support for OS X platforms below a certain compiler or OS version, or it may be a deeper issue. + +Unfortunately the gthreads backend is pretty strongly disrecommended -- it is really mostly there as a debug convenience when working with the block code, as there are some bad interactions between signal masking and coroutine switches that mean it's likely to cause problems when using QEMU proper. + + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1307225 b/results/classifier/zero-shot/105/semantic/1307225 new file mode 100644 index 000000000..a51e95668 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1307225 @@ -0,0 +1,485 @@ +semantic: 0.873 +other: 0.844 +mistranslation: 0.796 +assembly: 0.788 +graphic: 0.784 +device: 0.772 +instruction: 0.756 +KVM: 0.750 +vnc: 0.707 +boot: 0.671 +network: 0.643 +socket: 0.527 + +Running a virtual machine on a Haswell system produces machine check events + +I'm running a virtual Windows SBS 2003 installation on a Xeon E3 Haswell system running Gentoo Linux. First, I used Qemu 1.5.3 (the latest stable version on Gentoo). I got a lot of machine check events ("mce: [Hardware Error]: Machine check events logged") in dmesg that always looked like (using mcelog): + +Hardware event. This is not a software error. 
+MCE 7 +CPU 2 BANK 0 +TIME 1390267908 Tue Jan 21 02:31:48 2014 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 6 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 + +I found this discussion on the vmware community: https://communities.vmware.com/thread/452344 + +It seems that this is (at least partly) caused by the Qemu machine. I switched to Qemu 1.7.0, the first version to use "pc-i440fx-1.7". With this version, the errors almost disappeared, but from time to time, I still get machine check events. Anyway, they do not seem to affect either the VM or the host. + +I created the virtual machine on an older Core 2 Duo machine and ran it for several weeks without a single error message, so I think this is actually some problem with the Haswell architecture. The errors didn't show up until I copied the virtual machine to my new machine. + +Still happens with qemu 2.0.0 and the same environment (Windows SBS 2003 32 bit guest on a Gentoo Linux amd64 Haswell host). + +Running the VM with "-cpu Haswell" set still causes those "Internal Parity Errors", but not so many … + +Used QEMU this morning, noticed mce error in log, searched, found this. + +* model name: Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz (it's a Haswell) +* kernel 3.14.4-gentoo +* app-emulation/qemu-1.6.1 +* qemu-system-i386 -enable-kvm andsoon +* [73468.545378] mce: [Hardware Error]: Machine check events logged + +# mcelog +Hardware event. This is not a software error. +MCE 0 +CPU 0 BANK 0 +TIME 1400824994 Fri May 23 08:03:14 2014 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c07 APICID 0 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 + +I don't have anything to contribute other than that Tobias is not the only one who gets this hardware error message when using QEMU on a Haswell. + +I can confirm this.
+ +Using qemu-kvm for three virtual machines on Ubuntu 14.04 LTS using an Intel i7-4770 Haswell based server. + +dmesg: +[63429.847437] mce: [Hardware Error]: Machine check events logged +[65996.795630] mce: [Hardware Error]: Machine check events logged + +mcelog: +Hardware event. This is not a software error. +MCE 0 +CPU 2 BANK 0 +TIME 1406265172 Fri Jul 25 07:12:52 2014 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 + +It's the same error every time; only the APICID and CPU numbers differ. +The MCE errors did not happen until I migrated the virtual machines from another system. The Haswell server was up for three days without any incidents; now, while running qemu-kvm, there is an MCE error every 6-12 hours. +After the first errors, I called the support of my server provider. They first exchanged the RAM and upgraded the BIOS. +Then they replaced the whole server, only swapping my hard disks into the new one. But even that didn't help; I still got MCE errors. The hard disks were replaced too, one at a time (to resync the RAID). Now all of the hardware has been swapped, but the MCE errors are still popping up. + +system information attached + +attachment +logfiles, dmidecode, system information + +I got a new Haswell based system a few days ago. It has been running fine without warnings but today I started a VirtualBox VM and got a MCE soon afterwards. "MCA: Internal parity error" like everyone else. From reading this bug and the vmware link in the first post it seems like this problem occurs on all virtualization solutions using hardware acceleration on Haswell based systems. It happens on Qemu, Virtualbox and Vmware and it happens on both Linux and Windows. + +Does anyone have connections within Intel who can pull some strings to have them look at this?
It looks like the MCE is always non fatal but perhaps there are other unknown side effects. A microcode update might solve it. + +Try adding this to the Linux commandline, in your bootloader: + +mce=nobootlog + +From Documentation/x86/x86_64/boot-options.txt: + + mce=bootlog + Enable logging of machine checks left over from booting. + Disabled by default on AMD because some BIOS leave bogus ones. + If your BIOS doesn't do that it's a good idea to enable though + to make sure you log even machine check events that result + in a reboot. On Intel systems it is enabled by default. + mce=nobootlog + Disable boot machine check logging. + +How will this help to solve the problem? + +I think this is related to the Haswell erratum 131 of the 'Intel® Xeon® Processor E3-1200 v3 Product Family Specification Update' at: +http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e3-1200v3-spec-update.pdf + + HSW131. Spurious Corrected Errors May be Reported + Problem: Due this erratum, spurious corrected errors may be logged in the IA32_MC0_STATUS + register with the valid field (bit 63) set, the uncorrected error field (bit 61) not set, a + Model Specific Error Code (bits [31:16]) of 0x000F, and an MCA Error Code (bits + [15:0]) of 0x0005. If CMCI is enabled, these spurious corrected errors also signal interrupts. + Implication: When this erratum occurs, software may see corrected errors that are benign. These + corrected errors may be safely ignored. + Workaround: None identified. + Status: For the steppings affected, see the Summary Table of Changes + + +I propose to work around this by mce=ignore_ce, as this is a spurious 'corrected error': +From Documentation/x86/x86_64/boot-options.txt: + mce=ignore_ce + Disable features for corrected errors, e.g. polling timer + and CMCI. All events reported as corrected are not cleared + by OS and remained in its error banks. 
+ Usually this disablement is not recommended, however if + there is an agent checking/clearing corrected errors + (e.g. BIOS or hardware monitoring applications), conflicting + with OS's error handling, and you cannot deactivate the agent, + then this option will be a help. + +But I have not tried this yet. + + +So, at least, this does not seem to be something to worry about. But anyways, why does it only happen if a virtual machine is executed? + +Just my 2 cents. I have two Haswell boxes with Ubuntu Server 14.04 each running bunch of VMs. The first one is Intel Core i7-4770K and it runs only Linux VMs. There is no single MCE here for at least one year. The second box is Intel Core i7-4790K and it runs mix of Linux and Windows 2003 VMs. MCEs regularly appear in logs here. + +mce=ignore_ce indeed "fixes" the messages. However, it will mask real (important) errors as well. + +Since Intel can't or won't correct the bug with a microcode update, how about filtering it in the kernel? + +http://svnweb.freebsd.org/base/head/sys/x86/x86/mca.c?r1=269052&r2=269051&pathrev=269052 + +I'm seeing these MCE messages too. + +My hardware is i7 4790K on a Gigabyte Z97X Gaming GT motherboard. + +I run a mixture of Linux and Windows (client and server editions) guests. Hipervisor is kvm. I'm seeing these MCE messages since I virtualized a Windows Server 2008 R2 SP1. Neither Windows XP nor Windows 8.1 guests showed any messages. + +For a few minutes I was worried my hardware was faulty, but this bug reports somewhat gives me hope the hardware is OK. + +Pasted below is my /var/log/mcelog + + + +mcelog: failed to prefill DIMM database from DMI data +Hardware event. This is not a software error. +MCE 0 +CPU 0 BANK 0 +TIME 1440943174 Sun Aug 30 10:59:34 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 0 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. 
This is not a software error. +MCE 1 +CPU 0 BANK 0 +TIME 1441015741 Mon Aug 31 07:09:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 0 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 0 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 1 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 2 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 3 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 4 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. 
+MCE 5 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 6 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 7 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 8 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 9 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 10 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. 
+MCE 11 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 12 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 13 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 0 +CPU 2 BANK 0 +TIME 1441064341 Mon Aug 31 20:39:01 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 90000040000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 +Hardware event. This is not a software error. +MCE 0 +CPU 2 BANK 0 +TIME 1441064371 Mon Aug 31 20:39:31 2015 +MCG status: +MCi status: +Error overflow +Corrected error +Error enabled +MCA: Internal parity error +STATUS d0000200000f0005 MCGSTATUS 0 +MCGCAP c09 APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 60 + + +Minor Update: Bug occurs under Intel Skylake, too. 
+ +System-information: Intel Core i7-6700 with 4x8 GB Samsung M378A1G43DB0-CPB DDR4-2133 RAM, Motherboard: Fujitsu D3401-H1 + + +Dec 15 06:53:30 srv01 kernel: [224214.850599] mce: [Hardware Error]: Machine check events logged +Dec 15 06:55:08 srv01 kernel: [224312.001142] mce: [Hardware Error]: Machine check events logged +Dec 15 06:57:12 srv01 kernel: [224435.836130] mce: [Hardware Error]: Machine check events logged +Dec 15 07:03:35 srv01 kernel: [224818.079136] mce: [Hardware Error]: Machine check events logged +Dec 15 07:07:55 srv01 kernel: [225077.697589] mce_notify_irq: 1 callbacks suppressed +Dec 15 07:07:55 srv01 kernel: [225077.697592] mce: [Hardware Error]: Machine check events logged +Dec 15 07:08:51 srv01 kernel: [225134.136571] mce: [Hardware Error]: Machine check events logged +Dec 15 07:12:25 srv01 kernel: [225347.598995] mce_notify_irq: 1 callbacks suppressed +Dec 15 07:12:25 srv01 kernel: [225347.598998] mce: [Hardware Error]: Machine check events logged +Dec 15 07:15:03 srv01 kernel: [225504.880462] mce: [Hardware Error]: Machine check events logged +Dec 15 07:17:49 srv01 kernel: [225670.907609] mce: [Hardware Error]: Machine check events logged +Dec 15 07:21:49 srv01 kernel: [225911.163547] mce: [Hardware Error]: Machine check events logged +Dec 15 07:22:57 srv01 kernel: [225978.227807] mce: [Hardware Error]: Machine check events logged +Dec 15 07:24:32 srv01 kernel: [226073.681985] mce: [Hardware Error]: Machine check events logged +Dec 15 07:28:31 srv01 kernel: [226312.111733] mce: [Hardware Error]: Machine check events logged +Dec 15 07:34:04 srv01 kernel: [226644.639095] mce: [Hardware Error]: Machine check events logged +Dec 15 07:35:58 srv01 kernel: [226757.904937] mce_notify_irq: 2 callbacks suppressed +Dec 15 07:35:58 srv01 kernel: [226757.904940] mce: [Hardware Error]: Machine check events logged +Dec 15 07:36:10 srv01 kernel: [226770.139237] mce: [Hardware Error]: Machine check events logged +Dec 15 07:41:14 srv01 kernel: 
[227073.719040] mce: [Hardware Error]: Machine check events logged +Dec 15 07:41:16 srv01 kernel: [227075.399257] mce: [Hardware Error]: Machine check events logged +Dec 15 07:44:14 srv01 kernel: [227253.699541] mce: [Hardware Error]: Machine check events logged +Dec 15 07:44:57 srv01 kernel: [227296.490305] mce: [Hardware Error]: Machine check events logged +Dec 15 07:52:44 srv01 kernel: [227762.621344] mce: [Hardware Error]: Machine check events logged +Dec 15 07:52:49 srv01 kernel: [227767.372259] mce: [Hardware Error]: Machine check events logged +Dec 15 07:54:39 srv01 kernel: [227877.219677] mce_notify_irq: 1 callbacks suppressed +Dec 15 07:54:39 srv01 kernel: [227877.219680] mce: [Hardware Error]: Machine check events logged +... + +mcelog: Family 6 Model 5e CPU: only decoding architectural errors +Hardware event. This is not a software error. +MCE 29 +CPU 0 BANK 0 +TIME 1450162369 Tue Dec 15 07:52:49 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 9000004000010005 MCGSTATUS 0 +MCGCAP c0a APICID 0 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 94 +mcelog: Family 6 Model 5e CPU: only decoding architectural errors +Hardware event. This is not a software error. +MCE 30 +CPU 2 BANK 0 +TIME 1450162422 Tue Dec 15 07:53:42 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 9000004000010005 MCGSTATUS 0 +MCGCAP c0a APICID 4 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 94 +mcelog: Family 6 Model 5e CPU: only decoding architectural errors +Hardware event. This is not a software error. +MCE 31 +CPU 1 BANK 0 +TIME 1450162479 Tue Dec 15 07:54:39 2015 +MCG status: +MCi status: +Corrected error +Error enabled +MCA: Internal parity error +STATUS 9000004000010005 MCGSTATUS 0 +MCGCAP c0a APICID 2 SOCKETID 0 +CPUID Vendor Intel Family 6 Model 94 + + + + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? 
Or could we close this ticket nowadays? + +I'm not sure if this can still be reproduced but I found a workaround quite a while ago. The problem disappeared once I migrated the virtual machines using 32 bit OS images to 64 bit. The mix of 32 and 64 bit VMs was causing these problems, at least on my server. + +Last time I saw this error in my mcelog was in August. Probably, some update fixed it. I'll check over the next days/weeks if I still see it. That is quite a long time; at the time of my original bug report, I got the errors multiple times a day and later multiple times a week. + +About the workaround moving to 64 bit OS images: Well, if you're (like in my case) stuck with a dinosaur OS (Windows SBS 2003), there's no way to simply move to a 64 bit image ;-) + +But as said: I think it simply disappeared by some update. I'm using 2.10.0 at the moment. + +The errors still keep appearing. The mcelog still shows the exact errors posted in the very first comment. + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/101 + + diff --git a/results/classifier/zero-shot/105/semantic/1310324 b/results/classifier/zero-shot/105/semantic/1310324 new file mode 100644 index 000000000..45d96f9ac --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1310324 @@ -0,0 +1,93 @@ +semantic: 0.820 +boot: 0.816 +instruction: 0.808 +other: 0.785 +device: 0.783 +socket: 0.745 +network: 0.734 +graphic: 0.726 +assembly: 0.702 +vnc: 0.699 +mistranslation: 0.673 +KVM: 0.615 + +Commit 0f842f8a introduces regression when using tcg-interpreter + +Hi. + +Commit 0f842f8a246f2b5b51a11c13f933bf7a90ae8e96 apparently introduces a regression when using --enable-tcg-interpreter. The regression is manifested as follows: + + 1. Check out any qemu commit equal to or later than the one named above.
Besides that one, I tested v1.7.1, v2.0.0 and a few other commits suggested to me by git bisect. + 2. Possibly cherry-pick commit a32b12741bf45bf3f46bffe5a79cb2548a060cd8, which fixes a compilation bug with --enable-tcg-interpreter. + 3. Compile with: ./configure --target-list=i386-softmmu --enable-tcg-interpreter && make -j8 + 4. Create an empty virtual disk and try to install Windows XP on it, booting from the Windows CD-ROM. After the loading program, the installer immediately crashes with a blue screen (it should instead show the installation confirmation dialog and then the EULA acceptance dialog, if it worked correctly). + +I'm mentioning Windows XP because it is the problem I found. Probably other operating systems would fail as well. I can test others, if you think it would be helpful. I can also give you access to the exact CD-ROM image I'm using. + +The exact command line I'm using is: +build_location/i386-softmmu/qemu-system-i386 -m 512 -drive file=winxp_test.img -cdrom wipxp_cdrom.iso + +Attached is the blue screen that I see (unfortunately it is in Italian, but that's a standard error message and I hope this is not a problem). + +I'm not able to understand the nature of the commit well enough to identify what could be the problem. My nose tells me that it may be some stupid mistake, for example in some offset constant, that nobody ever saw because tcg-interpreter is not much used. + +Thanks, Giovanni. + + + +I forgot: winxp_test.img is just an empty 15 GB (sparse) file. + +On 04/21/2014 06:14 AM, Stefan Weil wrote: +> That commit changed the use of the GETPC macro. I just tried to debug +> the tci.c code and noticed that cputlb.c no longer works as expected: + +Ouch, yes, I see that. + +> This is not specific for the TCG interpreter, but I don't know how the +> normal TCG is affected. + +I believe that normal TCG is not affected, because the value returned for the +return address is outside the code_buffer, so tb_find_pc returns NULL, so +cpu_restore_state does nothing.
Whereas the interpreter continues to produce +the address of the last opcode executed. + +To solve this, I believe you need to clear tci_tb_ptr on all exits from the +interpreter loop. That is, both on normal exit (return from tcg_qemu_tb_exec) +as well as exceptional exit (longjmp landing in cpu_exec; see the Reload env +after longjmp section). + +Only setting tci_tb_ptr at the places it's needed, calls and qemu_ld/st calls, +is a good optimization of memory traffic, but is unrelated to this bug. + +> I also noticed that other code like target-i386/seg_helper.c which +> includes exec/softmmu_template.h also results in undefined usage of the +> GETRA macro. + +Huh? That's the normal backend expansion of its load/store helpers. + + + +r~ + + +I can reproduce a similar problem when running the latest ReactOS live CD from http://downloads.sourceforge.net/reactos/ReactOS-0.3.16-REL-live.zip and see the regression caused by the same commit. + +It can be fixed by a small modification in cputlb.c: replace GETPC by GETRA in these two lines + +cputlb.c:#undef GETPC +cputlb.c:#define GETPC() ((uintptr_t)0) + +Giovanni, could you please try your XP image with this modification, so we can confirm that it fixes the regression? +Richard suggested a modification which would be even more safe, but we did not need it before commit +0f842f8a246f2b5b51a11c13f933bf7a90ae8e96, so for a first fix, replacing GETPC by GETRA might be sufficient. + +Regards +Stefan + + +I can confirm that your change fixes my problem as well. Thank you very much! + +The fix mentioned in comment #4 has been included here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=7e4e88656c1e6192e9e47 +==> Setting status to "Fix released". 
+ diff --git a/results/classifier/zero-shot/105/semantic/1338957 b/results/classifier/zero-shot/105/semantic/1338957 new file mode 100644 index 000000000..2d86d0c71 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1338957 @@ -0,0 +1,31 @@ +semantic: 0.689 +graphic: 0.680 +instruction: 0.663 +device: 0.640 +socket: 0.515 +network: 0.493 +mistranslation: 0.477 +vnc: 0.475 +boot: 0.459 +other: 0.320 +KVM: 0.241 +assembly: 0.231 + +RFE: add an event to report block devices watermark + +Add an event to report if a block device usage exceeds a threshold. The threshold should be configurable with a monitor command. The event should report the affected block device. Additional useful information could be the offset of the highest sector, like in the query-blockstats output. + +Rationale for the RFE +Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. +In order to let the guest run flawlessly and not be unnecessarily paused, oVirt sets a watermark and automatically resizes the image once the watermark is reached or exceeded. + +In order to detect the mark crossing, the managing application has no choice but to aggressively poll the QEMU monitor +using the query-blockstats command. This leads to unnecessary system load, and is made even worse at scale: scenarios +with hundreds of VMs are no longer unusual.
+ +patch posted on qemu-devel, reviewd, acked and merged into maintainer's branch: +https://github.com/stefanha/qemu/commit/f050ea639522e9dd7e501ef285a2a12709b8726a + +Upstream here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=e2462113b2003085ad16f15e1 + diff --git a/results/classifier/zero-shot/105/semantic/1347555 b/results/classifier/zero-shot/105/semantic/1347555 new file mode 100644 index 000000000..722655b8b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1347555 @@ -0,0 +1,126 @@ +semantic: 0.931 +mistranslation: 0.917 +other: 0.915 +device: 0.912 +assembly: 0.884 +graphic: 0.881 +instruction: 0.875 +socket: 0.864 +network: 0.849 +KVM: 0.840 +vnc: 0.833 +boot: 0.817 + +qemu build failure, hxtool is a bash script, not a /bin/sh script + +hxtool (part of the early build process) is a bash script. Running it with /bin/sh yields a syntax error on line 10: + + 10 STEXI*|ETEXI*|SQMP*|EQMP*) flag=$(($flag^1)) + +$(( expr )) is a bash extension, not part of /bin/sh. + +Note that replacing the sh in the first line in hxtool with /bin/bash does not help, because the script is run manually from the Makefile with sh: + +154 $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@") + +The fix is to change those lines to + +154 $(call quiet-command,bash $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@") + +(there are five or so). + +On 07/23/2014 04:21 AM, Felix von Leitner wrote: +> Public bug reported: +> +> hxtool (part of the early build process) is a bash script. Running it +> with /bin/sh yields a syntax error on line 10: +> +> 10 STEXI*|ETEXI*|SQMP*|EQMP*) flag=$(($flag^1)) +> +> $(( expr )) is a bash extension, not part of /bin/sh. + +Wrong. $(( expr )) is mandated by POSIX. What system are you on where +/bin/sh is not POSIX? (Solaris is the only platform where /bin/sh does +not try to be POSIX-compliant, but who uses that for qemu?) + +What is the actual syntax error you are seeing? Is this a bug in dash +on your distribution? 
I can't get dash to fail for me on Fedora: + +$ dash -c 'f=1; f=$(($f^1)); echo $f' +0 +$ dash -n scripts/hxtool; echo $? +0 + +-- +Eric Blake eblake redhat com +1-919-301-3266 +Libvirt virtualization library http://libvirt.org + + + +I actually have bash installed as /bin/sh and /bin/bash. +But I also have heirloom sh installed, which installs itself as /sbin/sh, and that happened to be first in my $PATH. + +Since the makefiles use "sh script" to run the scripts, that called the heirloom sh. + +http://heirloom.sourceforge.net/sh.html + +It is, it turns out, derived from OpenSolaris. So there you go :-) + +When I delete /sbin/sh, qemu builds. + +On 07/23/2014 10:13 AM, Felix von Leitner wrote: +> I actually have bash installed as /bin/sh and /bin/bash. +> But I also have heirloom sh installed, which installs itself as /sbin/sh, and that happened to be first in my $PATH. +> +> Since the makefiles use "sh script" to run the scripts, that called the +> heirloom sh. +> +> http://heirloom.sourceforge.net/sh.html +> +> It is, it turns out, derived from OpenSolaris. So there you go :-) +> +> When I delete /sbin/sh, qemu builds. + +Then the bug is not in qemu, but in your environment. Installing +known-broken heirloom where it can be found first on a PATH search for +sh is just asking for problems, not just with qemu, but with all SORTS +of programs that expect POSIX semantics from a Linux /bin/sh. + +Rather than change the Makefile to invoke the script with bash, we could +instead bend over backwards to rewrite the script in a way that works +with non-POSIX shells (as in, flag=`expr $flag ^ 1`), but that feels +backwards to me. Until someone is actively worried about porting qemu +to a true Solaris environment, rather than just an heirloom-as-/bin/sh +Linux environment, I don't think it's worth the effort. 
+ +-- +Eric Blake eblake redhat com +1-919-301-3266 +Libvirt virtualization library http://libvirt.org + + + +On 23 July 2014 17:31, Eric Blake <email address hidden> wrote: +> Rather than change the Makefile to invoke the script with bash, we could +> instead bend over backwards to rewrite the script in a way that works +> with non-POSIX shells (as in, flag=`expr $flag ^ 1`), but that feels +> backwards to me. Until someone is actively worried about porting qemu +> to a true Solaris environment, rather than just an heirloom-as-/bin/sh +> Linux environment, I don't think it's worth the effort. + +My view on this has always been "we shouldn't assume bash, +but we can assume POSIX shell semantics". (And also that +we should assume /bin/sh is a POSIX shell, because it's the +21st century, and Solaris should just get with it :-)) + +thanks +-- PMM + + +It turns out that expr does not support ^ (at least according to the man page). :-) + +Still, you could do expr -$flag + 1 to do the same thing. + +Is the ruckus just about this one place where $(( )) is used or are there other non-Bourne-shell constructs? + +Closing this ticket, as it was rather a problem with the non-posix-compliant shell and not the QEMU build system. + diff --git a/results/classifier/zero-shot/105/semantic/1349722 b/results/classifier/zero-shot/105/semantic/1349722 new file mode 100644 index 000000000..137584c59 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1349722 @@ -0,0 +1,60 @@ +semantic: 0.552 +graphic: 0.481 +device: 0.463 +instruction: 0.404 +mistranslation: 0.358 +other: 0.315 +socket: 0.299 +network: 0.261 +vnc: 0.229 +boot: 0.185 +assembly: 0.124 +KVM: 0.062 + +qemu-io: Exit code is always zero + +The qemu-io always returns zero on exit independently on errors occurred during the command execution. + +Example, + +$ qemu-io -c 'write 128 234' /tmp/run1/test-1/test.img + +offset 128 is not sector aligned + +$ echo $? 
+0 + + +qemu.git HEAD: 41a1a9c42c4e + +On Tue, Jul 29, 2014 at 08:07:44AM -0000, Maria Kustova wrote: +> The qemu-io always returns zero on exit independently on errors occurred +> during the command execution. +> +> Example, +> +> $ qemu-io -c 'write 128 234' /tmp/run1/test-1/test.img +> +> offset 128 is not sector aligned +> +> $ echo $? +> 0 +> +> +> qemu.git HEAD: 41a1a9c42c4e + +For single commands it makes sense to return the command success as the +exit code. + +When qemu-io is used interactively or with multiple -c options we need a +error handling policy. + +For this reason it is not totally trivial to implement. + +Stefan + + +Looking through old bug tickets... is this still an issue with the latest version of QEMU? Or could we close this ticket nowadays? + +Should be fixed as of 6b3aa8485ca8e. + diff --git a/results/classifier/zero-shot/105/semantic/1366836 b/results/classifier/zero-shot/105/semantic/1366836 new file mode 100644 index 000000000..43f52454c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1366836 @@ -0,0 +1,88 @@ +semantic: 0.861 +mistranslation: 0.849 +other: 0.848 +instruction: 0.819 +KVM: 0.818 +boot: 0.817 +device: 0.812 +assembly: 0.801 +graphic: 0.794 +vnc: 0.770 +socket: 0.718 +network: 0.604 + +Core2Duo and KVM may not boot Win8 properly on 3.x kernels + +When I start up QEMU w/ KVM 1.7.0 on a Core2Duo machine running a vanilla kernel +3.4.67 or 3.10.12 to run a Windows 8.0 guest, the guest freezes at Windows 8 boot without any error. +When I dump the CPU registers via "info registers", nothing changes, that means +the system really stalled. Same happens with QEMU 2.0.0 and QEMU 2.1.0. + +But - when I run the very same guest using Kernel 2.6.32.12 and QEMU 1.7.0 or 2.0.0 on +the host side it works on the Core2Duo. Also the system above but just with an +i3 or i5 CPU it works fine. + +I already disabled networking and USB for the guest and changed the graphics +card - no effect. 
I assume that some mean bits and bytes have to be set up +properly to get the thing running. + +Seems to be related to a kvm/progressor incompatibility. + +Here the register dump of the stalled Win8 +QEMU 2.1.0 monitor - type 'help' for more information +(qemu) info registers +EAX=3e2009e3 EBX=3e2009e3 ECX=80000000 EDX=80000000 +ESI=3e2009e3 EDI=8220c108 EBP=81f9b33c ESP=81f9b2f0 +EIP=80c98d83 EFL=00010282 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0 +ES =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA] +CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA] +SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] +DS =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA] +FS =0030 80e65000 00004280 00409300 DPL=0 DS [-WA] +GS =0000 00000000 ffffffff 00000000 +LDT=0000 00000000 ffffffff 00000000 +TR =0028 80353000 000020ab 00008b00 DPL=0 TSS32-busy +GDT= 80a37000 000003ff +IDT= 80a37400 000007ff +CR0=8001003b CR2=8b206090 CR3=00185000 CR4=000406e9 +DR0=0000000000000000 DR1=0000000000000000 DR2=0000000500000000 DR3=0000000000000000 +DR6=00000000ffff0ff0 DR7=0000000000000400 +EFER=0000000000000800 +FCW=027f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80 +FPR0=0000000000000000 0000 FPR1=0000000000000000 0000 +FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 +FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 +FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 +XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000 +XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000 +XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000 +XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000 + + +I found a new trace - using the ipipe patch that I have, there seems to be an issue in the 3.4 kernels, but as it looks also in the 3.10 kernels. +http://www.xenomai.org/pipermail/xenomai/2013-March/027865.html + +Is there an update on that already existing? 
It was not completely clear if this issue is related either to KVM or to the ipipe patch. + +Thanks. + +attached the trace.dat (tar-gzipped) as recommended. Hope this helps finding the issue. The file should capture the following: +- windows 8 with screen that shows that the last boot attempts failed +- issued system_reset on qemu commandline +- startup of windows 8 that stalls + + +sorry for the corrupt file, this one should be fine now. + +Confirmed - the current kvm.git without any ipipe patch also causes the issue. Trace File attached. + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +Please close it, it's solved with this patch commit to kvm / kernel: +Was found and fixed with great support of Paolo Bonzini + +From: Paolo Bonzini +Date: Thu, 12 Feb 2015 17:04:47 +0100 +Subject: KVM: emulate: fix CMPXCHG8B on 32-bit hosts + + diff --git a/results/classifier/zero-shot/105/semantic/1368815 b/results/classifier/zero-shot/105/semantic/1368815 new file mode 100644 index 000000000..9552ecb7a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1368815 @@ -0,0 +1,288 @@ +semantic: 0.710 +graphic: 0.688 +assembly: 0.639 +device: 0.637 +mistranslation: 0.635 +other: 0.620 +instruction: 0.601 +vnc: 0.444 +network: 0.336 +boot: 0.309 +socket: 0.275 +KVM: 0.245 + +qemu-img convert intermittently corrupts output images + +-- Found in releases qemu-2.0.0, qemu-2.0.2, qemu-2.1.0. Tested on Ubuntu 14.04 using Ext4 filesystems. + +The command + + qemu-img convert -O raw inputimage.qcow2 outputimage.raw + +intermittently creates corrupted output images, when the input image is not yet fully synchronized to disk. 
While the issue has actually been discovered in the operation of OpenStack nova, it can be reproduced "easily" on the command line using + + cat $SRC_PATH > $TMP_PATH && $QEMU_IMG_PATH convert -O raw $TMP_PATH $DST_PATH && cksum $DST_PATH + +on filesystems exposing this behavior. (The difficult part of this exercise is to prepare a filesystem to reliably trigger this race. On my test machine some filesystems are affected while others aren't, and unfortunately I haven't found the relevant difference between them, yet. Possibly it's timing issues completely out of userspace control ...) + +The root cause, however, is the same as in + + http://lists.gnu.org/archive/html/coreutils/2011-04/msg00069.html + +and it can be solved the same way as suggested in + + http://lists.gnu.org/archive/html/coreutils/2011-04/msg00102.html + +In qemu, file block/raw-posix.c should use FIEMAP_FLAG_SYNC, i.e. change + + f.fm.fm_flags = 0; + +to + + f.fm.fm_flags = FIEMAP_FLAG_SYNC; + +As discussed in the thread mentioned above, retrieving a page cache coherent map of file extents is possible only after fsync on that file. + +See also + + https://bugs.launchpad.net/nova/+bug/1350766 + +In that bug report filed against nova, fsync had been suggested to be performed by the framework invoking qemu-img. However, as the choice of fiemap -- implying this otherwise unneeded fsync of a temporary file -- is not made by the caller but by qemu-img, I agree with the nova bug reviewer's objection to putting it into nova. The fsync should instead be triggered by qemu-img utilizing the FIEMAP_FLAG_SYNC, specifically intended for that purpose. + +Is there a minimum version of qemu that would be required to use the FIEMAP_FLAG_SYNC flag? + +The affected code was introduced with version 1.2.0. However, due to https://bugs.launchpad.net/qemu/+bug/1193628 I can't build these old releases to verify whether they actually expose the same behaviour.
+ +It seems the dust is settling a bit: I found the relevant difference between my various filesystems, and how to reproduce the failure: Susceptible filesystems don't have the extent feature of ext4 enabled. + +You can create such a filesystem using + + mke2fs -t ext4 -O ^extent /dev/... + mount /dev/... /mnt + +Adapting the command line example provided above you can see + + rm -f /mnt/tmp.qcow2 + cat $SRC_PATH > /mnt/tmp.qcow2 && qemu-img convert -O raw /mnt/tmp.qcow2 /mnt/tmp.qcow + cksum /mnt/tmp.qcow + +creating corrupt (usually nullified) result images. By inserting a sleep of at least 33 seconds between the cat command and the qemu-img invocation I'm getting proper output. + +It's now unclear to me where the actual defect is located. Creating ext4 filesystems with certain features disabled (such as the extent tree) is apparently supported and ok. Is the fiemap ioctl supposed to handle this gracefully, for example by assuming FIEMAP_FLAG_SYNC in absence of an extent tree? Or are clients such as qemu-img supposed to always set FIEMAP_FLAG_SYNC to be safe? + +I see seek hole is supported in the latest qemu-img so I would reorder so that's tried first like: + + if lseek(SEEK_HOLE) != ENOTSUP + use_that + else if fiemap(FIEMAP_FLAG_SYNC) + use_that + +The fallback cascade Pádraig mentions is already implemented in qemu-2.1.0, in function raw_co_get_block_status. Just swap + + ret = try_fiemap( ... ) + +and + + ret = try_seek_hole( ... ) + +to reverse the order. I can confirm that it works just fine on a 3.13 kernel (all versions since 3.1, according to lseek(2)), while older versions will fall back to fiemap, which needs to be protected with FIEMAP_FLAG_SYNC in try_fiemap, to be safe. + +This should work under all conditions, and avoid redundant syncs where possible, right? + + + +Marking as High since duplicate bug 1350766 was marked High.
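The preference for lseek(SEEK_HOLE) over FIEMAP expressed above can be tried outside QEMU. What follows is a minimal Python sketch, not QEMU's implementation: the helper name `data_extents` is hypothetical, and it assumes a platform where Python exposes `os.SEEK_DATA` and `os.SEEK_HOLE` (Linux 3.1 or later, per the lseek(2) remark above). It walks a file and lists the regions the kernel reports as data.

```python
import errno
import os


def data_extents(path):
    """Return [(offset, length), ...] for the regions reported as data.

    Hypothetical helper for illustration only. It walks the file with
    lseek(SEEK_DATA) / lseek(SEEK_HOLE); it is not taken from QEMU.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                # Jump to the next byte at or after pos that holds data.
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole up to end of file
                raise  # e.g. the platform rejects SEEK_DATA
            # Jump to the next hole; end of file counts as a hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            pos = end
    finally:
        os.close(fd)
    return extents
```

Running it on a freshly written file on the ^extent ext4 filesystem from the reproducer, before and after the writeback delay, would be one way to see what the kernel reports there. Whether it avoids the race seen with FIEMAP without FIEMAP_FLAG_SYNC on a given kernel is not something this sketch establishes.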
+ +openstack review at: + https://review.openstack.org/#/c/123957/ + +Qemu patches at: + http://patchwork.ozlabs.org/patch/393494/ ; and + http://patchwork.ozlabs.org/patch/393495/ + +FWIW the following 2 commits in qemu master resolve the issue for qemu-img. + + http://git.qemu.org/?p=qemu.git;a=commit;h=38c4d0aea3e1264c86e282d99560330adf2b6e25 + http://git.qemu.org/?p=qemu.git;a=commit;h=7c15903789953ead14a417882657d52dc0c19a24 + +If possible they should be back ported to trusty and utopic. + +You'll also need something like: + + http://git.qemu.org/?p=qemu.git;a=commit;h=4f11aa8a40351b28c0e67c7276e0003b38cc46ac + +before my 2 patches. + +Thanks for the information. Looks like we can apply these in debian too. + +Status changed to 'Confirmed' because the bug affects multiple users. + +This bug was fixed in the package qemu - 2.1+dfsg-4ubuntu7 + +--------------- +qemu (2.1+dfsg-4ubuntu7) vivid; urgency=medium + + * Apply two patches to fix intermittent qemu-img corruption + (LP: #1368815) + - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap + - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap + -- Serge Hallyn <email address hidden> Wed, 29 Oct 2014 22:31:43 -0500 + +Hi Serge, + + +Is there any chance these fixes will go into trusty? + +Hi Tony, + +yes, I've uploaded a proposed fix for trusty-proposed earlier today. It should be available for testing as soon as it is accepted. + +Awesome. + +Thanks! + +Hello Michael, or anyone else affected, + +Accepted qemu into utopic-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu/2.1+dfsg-4ubuntu6.2 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. 
+ +If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Hello Michael, or anyone else affected, + +Accepted qemu into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu/2.0.0+dfsg-2ubuntu1.8 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. + +If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! + +Tested qemu-utils 2.0.0+dfsg-2ubuntu1.8. Successful. + +The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. 
In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions. + +This bug was fixed in the package qemu - 2.0.0+dfsg-2ubuntu1.8 + +--------------- +qemu (2.0.0+dfsg-2ubuntu1.8) trusty-proposed; urgency=medium + + * debian/qemu-system-x86.qemu-kvm.upstart: create /dev/kvm in a + container. (LP: #1370199) + * Cherrypick upstream patch to fix intermittent qemu-img corruption + (LP: #1368815) + - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap + - (note - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap (which was + also needed in utopic) appears to be unneeded here as the code being + changed has not yet been switched to using try_fiemap) + -- Serge Hallyn <email address hidden> Thu, 20 Nov 2014 11:24:51 -0600 + +@Michael, + +by any chance would you be albe to test on utopic? + +I couldn't reproduce the bug on the old qemu myself, however Michael has verified the (same) fix on trusty, and the full qa-regression-test passed for me on utopic-proposed. So I would request that we call this verification-done. + +Looking at the fixes, I also see the following commits remove the above changes, which could mean we might encounter this again: +c4875e5 raw-posix: SEEK_HOLE suffices, get rid of FIEMAP +d1f06fe raw-posix: The SEEK_HOLE code is flawed, rewrite it + +Note there is also a related issue: +bug 1292234 +So far testing with the proposed qemu version or upstream I still encounter issues on ext4 w/ ^extent and ext3 filesystems. + +Filed a separate issue for MOS https://bugs.launchpad.net/mos/+bug/1401261 + +Hi Chris, +Markus' rework will not reintroduce this bug as it completely removes all fiemap code. + +bug 129224 is a different issue, I'll comment on that bug. + +You say: you encounter issues with upstream with ^extent and ext3 filesystems. Just to be clear: Are you saying that *this* bug is still a problem for you? 
+ +if it's a different bug then I write it up and I'll take a look. + +Tony, + +Yea, its a different bug. I tested with the above patched package and upstream qemu from git, and I can still hit bug 129224. I was hoping this also fixed my issue, but unfortunately it seems to be a different issue that occurs when using the same types of filesystems. I have a solid reproducer on my desk so let me know which experiments / areas of code / etc I should look at. + +Just to clarify it's bug 1292234 in the previous comment. + +Chris, +I've read through 1292234 and I'll have a play with your reproducer locally and see if I can gain any insight. + +I'm sorry my fix didn't help 1292234, but glad you can't hit 1368815 with upstream, I was kinda having kittens here ;P + +This bug was fixed in the package qemu - 2.1+dfsg-4ubuntu6.2 + +--------------- +qemu (2.1+dfsg-4ubuntu6.2) utopic-proposed; urgency=medium + + * Apply two patches to fix intermittent qemu-img corruption + (LP: #1368815) + - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap + - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap + -- Serge Hallyn <email address hidden> Thu, 20 Nov 2014 16:33:09 -0600 + +I'm happy to tackle to also fix cinder with a version of the nova fix (for consistency). I propose waiting until the nova fix lands + +I'd elevate this to high so it matches nova and ubuntu but I don't have permissions to do so. + +Fix proposed to branch: master +Review: https://review.openstack.org/141259 + +> - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap +> - (note - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap (which was +> also needed in utopic) appears to be unneeded here as the code being +> changed has not yet been switched to using try_fiemap) + +Actually such a enforces fsync and drastically reduces the performance of conversion. +I propose to use seek_hole instead of FIEMAP (which is basically what + 502-block-raw-posic-use-seek-hole-ahead-of-fiemap does). 
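For readers unfamiliar with the SEEK_HOLE approach proposed in the comment above, here is a minimal C sketch of probing a file's first data extent with lseek(2). It is not the QEMU patch: the helper name `first_extent` is hypothetical, and the sketch assumes Linux with glibc, where `_GNU_SOURCE` exposes `SEEK_DATA` and `SEEK_HOLE`.

```c
#define _GNU_SOURCE           /* makes glibc expose SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a QEMU function: report the first data extent of
 * the file at `path` as the half-open byte range [*data, *hole).
 * Returns 0 on success and -1 on error with errno set by open/lseek
 * (ENXIO from SEEK_DATA means the file contains no data at all).
 */
int first_extent(const char *path, off_t *data, off_t *hole)
{
    assert(path && data && hole);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = -1;
    off_t d = lseek(fd, 0, SEEK_DATA);        /* first byte that is data */
    if (d >= 0) {
        off_t h = lseek(fd, d, SEEK_HOLE);    /* next hole; end of file counts */
        if (h >= 0) {
            *data = d;
            *hole = h;
            rc = 0;
        }
    }

    int saved = errno;                        /* keep the lseek/open error */
    close(fd);
    errno = saved;
    return rc;
}
```

On filesystems without hole support, lseek(2) is documented to treat the whole file as data, so this kind of probe degrades to the "pretend that all blocks are allocated" behaviour discussed in the thread rather than reporting stale extents.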
+ + +The second part of the fix (which does not reduce the performance) for qemu 2.0 (apparently uploading two patches at once is not so easy) + +Patch 0500-block-raw-posix-Try-both-FIEMAP-and-SEEK_HOLE.patch appears to be part of a bigger re-write of the related code, and is ON TOP of the patches already applied in this bug. + + +No doubt the rewritten code is "better" but backporting it carries more risk than the 2 simple fixes I already nominated. + +> Patch 0500-block-raw-posix-Try-both-FIEMAP-and-SEEK_HOLE.patch appears to be part of a bigger re-write +> of the related code, and is ON TOP of the patches already applied in this bug. + +Yep, sorry for not mentioning this. As far as I understand, the qemu-2.1 package contains this partially rewritten +code too (without any recent changes like disabling FIEMAP completely and rewriting the code using SEEK_HOLE). + +> No doubt the rewritten code is "better" but backporting it carries more risk than the 2 simple fixes I already nominated. + +Can we completely disable the FIEMAP code and pretend that all blocks are allocated? I'm afraid fsync'ing 100+ GB +files might be even slower than ignoring the sparseness. + +Change abandoned by John Griffith (<email address hidden>) on branch: master +Review: https://review.openstack.org/141259 + +Fix proposed to branch: master +Review: https://review.openstack.org/143575 + +Change abandoned by Mike Perez (<email address hidden>) on branch: master +Review: https://review.openstack.org/143575 +Reason: 1 month, no update. + +Change abandoned by Tony Breeds (<email address hidden>) on branch: master +Review: https://review.openstack.org/123957 +Reason: The main distros we care about have landed or are in progress. + +Marking as Wont-Fix. + +Change abandoned by Mike Perez (<email address hidden>) on branch: master +Review: https://review.openstack.org/143575 +Reason: No activity for over a month. + +Closing based on the assumption that a working qemu-img is available now. 
+ +According to comment #8 the fixes have been included in the upstream QEMU repository, so setting the status to "Fix released" now. + diff --git a/results/classifier/zero-shot/105/semantic/1370 b/results/classifier/zero-shot/105/semantic/1370 new file mode 100644 index 000000000..c7f125abf --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1370 @@ -0,0 +1,26 @@ +semantic: 0.986 +instruction: 0.949 +graphic: 0.878 +assembly: 0.842 +device: 0.806 +socket: 0.681 +vnc: 0.658 +network: 0.606 +boot: 0.546 +mistranslation: 0.502 +other: 0.474 +KVM: 0.210 + +x86 BLSI and BLSR semantic bug +Description of problem: +The result of instruction BLSI and BLSR is different from the CPU. The value of CF is different. +Steps to reproduce: +1. Compile this code +``` +void main() { + asm("blsi rax, rbx"); +} +``` +2. Execute and compare the result with the CPU. The value of `CF` is exactly the opposite. This problem happens with BLSR, too. +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/105/semantic/1371 b/results/classifier/zero-shot/105/semantic/1371 new file mode 100644 index 000000000..16aca0c8f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1371 @@ -0,0 +1,32 @@ +semantic: 0.995 +instruction: 0.970 +assembly: 0.852 +graphic: 0.824 +device: 0.665 +mistranslation: 0.491 +boot: 0.468 +vnc: 0.465 +socket: 0.452 +network: 0.307 +other: 0.217 +KVM: 0.064 + +x86 BLSMSK semantic bug +Description of problem: +The result of instruction BLSMSK is different with from the CPU. The value of CF is different. +Steps to reproduce: +1. Compile this code +``` +void main() { + asm("mov rax, 0x65b2e276ad27c67"); + asm("mov rbx, 0x62f34955226b2b5d"); + asm("blsmsk eax, ebx"); +} +``` +2. Execute and compare the result with the CPU. + - CPU + - CF = 0 + - QEMU + - CF = 1 +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. 
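The CF results reported for BLSI, BLSR and BLSMSK above can be cross-checked against a small reference model. The following C sketch encodes the carry-flag rules as I read them in the Intel SDM (BLSI sets CF when the source is non-zero; BLSR and BLSMSK set CF when the source is zero). It is not QEMU code: the struct and function names are hypothetical, and only the 32-bit forms are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Destination value plus the carry flag produced by the instruction. */
struct bmi1_result {
    uint32_t value;
    bool cf;
};

/* BLSI: isolate the lowest set bit. CF = (src != 0). */
struct bmi1_result blsi32(uint32_t src)
{
    struct bmi1_result r = { src & (0u - src), src != 0 };
    return r;
}

/* BLSR: clear the lowest set bit. CF = (src == 0). */
struct bmi1_result blsr32(uint32_t src)
{
    struct bmi1_result r = { src & (src - 1u), src == 0 };
    return r;
}

/* BLSMSK: mask up to and including the lowest set bit. CF = (src == 0). */
struct bmi1_result blsmsk32(uint32_t src)
{
    struct bmi1_result r = { src ^ (src - 1u), src == 0 };
    return r;
}

/* The BLSMSK report uses ebx = 0x226b2b5d, for which real hardware gave CF = 0. */
void check_report_example(void)
{
    assert(!blsmsk32(0x226b2b5dU).cf);
}
```

Comparing an emulator's flags against a model like this for both a zero and a non-zero source is enough to expose an inverted CF, which is the symptom both reports describe.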
diff --git a/results/classifier/zero-shot/105/semantic/1372 b/results/classifier/zero-shot/105/semantic/1372 new file mode 100644 index 000000000..a5a86bb81 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1372 @@ -0,0 +1,33 @@ +semantic: 0.995 +instruction: 0.963 +graphic: 0.860 +device: 0.736 +assembly: 0.728 +mistranslation: 0.574 +vnc: 0.420 +boot: 0.411 +socket: 0.263 +network: 0.171 +other: 0.159 +KVM: 0.083 + +x86 BEXTR semantic bug +Description of problem: +The result of instruction BEXTR is different with from the CPU. The value of destination register is different. I think QEMU does not consider the operand size limit. +Steps to reproduce: +1. Compile this code +``` +void main() { + asm("mov rax, 0x17b3693f77fb6e9"); + asm("mov rbx, 0x8f635a775ad3b9b4"); + asm("mov rcx, 0xb717b75da9983018"); + asm("bextr eax, ebx, ecx"); +} +``` +2. Execute and compare the result with the CPU. + - CPU + - RAX = 0x5a + - QEMU + - RAX = 0x635a775a +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/105/semantic/1373 b/results/classifier/zero-shot/105/semantic/1373 new file mode 100644 index 000000000..d4cf2ba67 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1373 @@ -0,0 +1,33 @@ +semantic: 0.996 +assembly: 0.951 +instruction: 0.948 +graphic: 0.898 +device: 0.784 +vnc: 0.746 +mistranslation: 0.642 +boot: 0.505 +socket: 0.502 +network: 0.373 +other: 0.248 +KVM: 0.113 + +x86 ADOX and ADCX semantic bug +Description of problem: +The result of instruction ADOX and ADCX are different from the CPU. The value of one of EFLAGS is different. +Steps to reproduce: +1. Compile this code +``` +void main() { + asm("push 512; popfq;"); + asm("mov rax, 0xffffffff84fdbf24"); + asm("mov rbx, 0xb197d26043bec15d"); + asm("adox eax, ebx"); +} +``` +2. Execute and compare the result with the CPU. This problem happens with ADCX, too (with CF). 
+ - CPU + - OF = 0 + - QEMU + - OF = 1 +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/105/semantic/1374 b/results/classifier/zero-shot/105/semantic/1374 new file mode 100644 index 000000000..0db6a0198 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1374 @@ -0,0 +1,35 @@ +semantic: 0.993 +instruction: 0.952 +graphic: 0.807 +device: 0.700 +assembly: 0.661 +boot: 0.382 +vnc: 0.341 +socket: 0.314 +network: 0.272 +mistranslation: 0.272 +other: 0.093 +KVM: 0.044 + +x86 BZHI semantic bug +Description of problem: +The result of instruction BZHI is different from the CPU. The value of destination register and SF of EFLAGS are different. +Steps to reproduce: +1. Compile this code +``` +void main() { + asm("mov rax, 0xb1aa9da2fe33fe3"); + asm("mov rbx, 0x80000000ffffffff"); + asm("mov rcx, 0xf3fce8829b99a5c6"); + asm("bzhi rax, rbx, rcx"); +} +``` +2. Execute and compare the result with the CPU. + - CPU + - RAX = 0x0x80000000ffffffff + - SF = 1 + - QEMU + - RAX = 0xffffffff + - SF = 0 +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/105/semantic/1375 b/results/classifier/zero-shot/105/semantic/1375 new file mode 100644 index 000000000..3d7746382 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1375 @@ -0,0 +1,32 @@ +semantic: 0.988 +instruction: 0.945 +graphic: 0.792 +device: 0.739 +other: 0.687 +assembly: 0.675 +vnc: 0.520 +network: 0.458 +boot: 0.413 +socket: 0.407 +mistranslation: 0.269 +KVM: 0.247 + +x86 SSE/SSE2/SSE3 instruction semantic bugs with NaN +Description of problem: +The result of SSE/SSE2/SSE3 instructions with NaN is different from the CPU. From Intel manual Volume 1 Appendix D.4.2.2, they defined the behavior of such instructions with NaN. But I think QEMU did not implement this semantic exactly because the byte result is different. +Steps to reproduce: +1. 
Compile this code +``` +void main() { + asm("mov rax, 0x000000007fffffff; push rax; mov rax, 0x00000000ffffffff; push rax; movdqu XMM1, [rsp];"); + asm("mov rax, 0x2e711de7aa46af1a; push rax; mov rax, 0x7fffffff7fffffff; push rax; movdqu XMM2, [rsp];"); + asm("addsubps xmm1, xmm2"); +} +``` +2. Execute and compare the result with the CPU. This problem happens with other SSE/SSE2/SSE3 instructions specified in the manual, Volume 1 Appendix D.4.2.2. + - CPU + - xmm1[3] = 0xffffffff + - QEMU + - xmm1[3] = 0x7fffffff +Additional information: +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/105/semantic/1395217 b/results/classifier/zero-shot/105/semantic/1395217 new file mode 100644 index 000000000..1c7cc3f88 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1395217 @@ -0,0 +1,330 @@ +semantic: 0.905 +mistranslation: 0.859 +assembly: 0.806 +graphic: 0.801 +other: 0.800 +socket: 0.789 +instruction: 0.786 +network: 0.781 +device: 0.776 +vnc: 0.719 +KVM: 0.655 +boot: 0.588 + +Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11 + +The networking code in qemu in versions 2.0.0 and beyond is non-functional with Solaris/Illumos 5.11 images. + +Building 1.7.1, 2.0.0, 2.0.2, 2.1.2,and 2.2.0rc1with the following standard Slackware config: + +# From Slackware build tree . . . 
+./configure \ + --prefix=/usr \ + --libdir=/usr/lib64 \ + --sysconfdir=/etc \ + --localstatedir=/var \ + --enable-gtk \ + --enable-system \ + --enable-kvm \ + --disable-debug-info \ + --enable-virtfs \ + --enable-sdl \ + --audio-drv-list=alsa,oss,sdl,esd \ + --enable-libusb \ + --disable-vnc \ + --target-list=x86_64-linux-user,i386-linux-user,x86_64-softmmu,i386-softmmu \ + --enable-spice \ + --enable-usb-redir + + +And attempting to run the same VM image with the following command (or via virt-manager): + +macaddress="DE:AD:BE:EF:3F:A4" + +qemu-system-x86_64 nex4x -cdrom /dev/cdrom -name "Nex41" -cpu Westmere +-machine accel=kvm -smp 2 -m 4000 -net nic,macaddr=$macaddress -net bridge,br=b +r0 -net dump,file=/usr1/tmp/<FILENAME> -drive file=nex4x_d1 -drive file=nex4x_d2 + -enable-kvm + +Gives success on 1.7.1, and a deaf VM on all subsequent versions. + +Notable in validating my config, is that a Windows 7 image runs cleanly with networking on *all* builds, so my configuration appears to be good - qemu just hates Solaris at this point. + +Watching with wireshark (as well as pulling network traces from qemu as noted above) it appears that the notable difference in the two configs is that for some reason, Solaris gets stuck arping for it's own interface on startup, and never really comes on line on the network. If other hosts attempt to ping the Solaris instance, they can successfully arp the bad VM, but not the other way around. + + + + + +Note that the host system, network config, etc. are identical, qemu is built with an identical config, and started with the same command - the *ONLY* variable is the qemu version. This is utilizing the bridge-helper binary, but as noted earlier, using virt-manager whether allowing it to define it's on network, or using the existing bridge config on this box, the behaviour is the same, and only Solaris is failing. 
+ +I note also that the failure happens with both the e1000 and the rtl8139 interfaces - this does not appear to be an issue with the drivers, but more a case of how qemu passes traffic to and from the tap device. Looking at the tap device with wireshark, I can see the external traffic as well as traffic from qemu - it just appears that some does not make it into Solaris. + +I also noted discussions several years ago regarding a very similar issue, but do not have a bug number at this point (2010 vintage). Not certain that that is relevant, but it definitely is similar. + +Host platform is Slackware 14.1, x86_64 . . . cc 4.8.2, kernel 3.10.17 + + +Can you try bisecting between 1.7 and 2.0 with git? + +Paolo - I should have some time to do that this week, as well as bone up on git (it's been a bit . . .) + +And thanks for the quick reply! + +Bisected merrily away, and this is where it definitively begins to fail . . . To verify, I checked out both commits, and confirmed change in function at this point. I attempted a revoke of this commit on my clone to test, but too many merge errors to make that a simple task, so that was not done. + +commit ef02ef5f4536dba090b12360a6c862ef0e57e3bc +Author: Eduardo Habkost <email address hidden> +Date: Wed Feb 19 11:58:12 2014 -0300 + + target-i386: Enable x2apic by default on KVM + + When on KVM mode, enable x2apic by default on all CPU models. + + Normally we try to keep the CPU model definitions as close as the real + CPUs as possible, but x2apic can be emulated by KVM without host CPU + support for x2apic, and it improves performance by reducing APIC access + overhead. x2apic emulation is available on KVM since 2009 (Linux + 2.6.32-rc1), there's no reason for not enabling x2apic by default when + running KVM. + + Signed-off-by: Eduardo Habkost <email address hidden> + Acked-by: Michael S. 
Tsirkin <email address hidden> + Signed-off-by: Andreas Färber <email address hidden> + +:040000 040000 ebdc1ecd08cb507db62cc465696925a4cde6174f e83d9c32f821714600c48594 +15911910d4b37c0d M hw +:040000 040000 9064bc796128ba1380b67a86af9718dcc1022f0d 5cb337c72259b54780856806 +8f56f4abfa628579 M target-i386 + + +This does not appear to be run-time selectable (or I have not found the option yet . . . ) so not quire sure how to verify if backing this out will resolve the issue in later versions. + + +Additional test (I just don't know when to go to bed . . . *sigh* . . . ). + +In a checkout of the 2.1.2 code base, and based on the above failing commit as per bisect, I removed the change in the commit for target-i386/cpu.c of the line: + +[FEAT_1_ECX] = CPUID_EXT_X1APIC, + +as added by the errant commit, recompiled, and networking is now working with Illumos in 2.1.2, so this commit is definitely not as innocent as it may appear. + +It is runtime selectable using "-cpu ...,-x2apic" (as indicated by Markus on qemu-devel). + +First thing we need to find out is if it fails on the newest CPU model that can be run in enforce mode. + +So, assuming you are running on an Intel host CPU, it would be interesting to test those CPU models in this order, until you have one that actually boots: + + -cpu Broadwell,enforce + -cpu Haswell,enforce + -cpu SandyBridge,enforce + -cpu Westmere,enforce + -cpu Nehalem,enforce + -cpu Penryn,enforce + -cpu Conroe,enforce + +Testing of: + -cpu host +would be interesting, too. + +If the latest CPU model (or -cpu host) have working networking, that means Solaris (or QEMU NIC emulation code) doesn't like to see an old CPU with x2apic enabled. If it doesn't work even using the latest CPU model (and -cpu host), that means Solaris (or QEMU NIC emulation) doesn't like the x2apic implementation of KVM at all (and that could mean a Solaris bug, a QEMU bug, or a KVM x2apic emulation bug). 
+ + +Broadwell - Fails, Host won't support it: + +warning: host doesn't support requested feature: CPUID.01H:ECX.fma [bit 12] +warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22] +warning: host doesn't support requested feature: CPUID.07H:EBX.fsgsbase [bit 0] +warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3] +warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4] +warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5] +warning: host doesn't support requested feature: CPUID.07H:EBX.smep [bit 7] +warning: host doesn't support requested feature: CPUID.07H:EBX.bmi2 [bit 8] +warning: host doesn't support requested feature: CPUID.07H:EBX.erms [bit 9] +warning: host doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10] +warning: host doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11] +warning: host doesn't support requested feature: CPUID.07H:EBX.rdseed [bit 18] +warning: host doesn't support requested feature: CPUID.07H:EBX.adx [bit 19] +warning: host doesn't support requested feature: CPUID.07H:EBX.smap [bit 20] +warning: host doesn't support requested feature: CPUID.80000001H:ECX.3dnowprefetch [bit 8] +qemu-system-x86_64: Host doesn't support requested features + +Haswell fails, host won't support it: + +warning: host doesn't support requested feature: CPUID.01H:ECX.fma [bit 12] +warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22] +warning: host doesn't support requested feature: CPUID.07H:EBX.fsgsbase [bit 0] +warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3] +warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4] +warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5] +warning: host doesn't support requested feature: CPUID.07H:EBX.smep [bit 7] +warning: host doesn't support requested feature: CPUID.07H:EBX.bmi2 [bit 8] +warning: host doesn't support requested 
feature: CPUID.07H:EBX.erms [bit 9] +warning: host doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10] +warning: host doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11] +qemu-system-x86_64: Host doesn't support requested features + + +SandyBridge (this is the test box physical CPU) fails, no errors, networking dead, as per initial problem. + +Westmere fails, no networking. + +Nehalem fails, no networking + +Panryn fails, no networking + +Conroe fails, no networking + +host fails, no networking + +Just to ensure that all else was good, I tested SandyBridge, Westmere, Conroe, and host with "-x2apic" and every one works with x2apic disabled. + +This test box is a laptop, and I am only testing on it since I am away from my primary server (Dell 2950) for the holiday. Both Intel, but not even close to the same CPU . . . same problem observed on both, although workaround not tested yet on primary. + + +Test box (for this data) CPU into: + +processor : 0 +vendor_id : GenuineIntel +cpu family : 6 +model : 42 +model name : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz +stepping : 7 +microcode : 0x25 +cpu MHz : 1200.000 +cache size : 3072 KB +physical id : 0 +siblings : 4 +core id : 0 +cpu cores : 2 +apicid : 0 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov +pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm c +onstant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu + pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid s +se4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb + xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid +bogomips : 4984.29 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +(Repeats for 4 cores) + + + + +Primary system: + +processor : 0 +vendor_id : 
GenuineIntel +cpu family : 6 +model : 15 +model name : Intel(R) Xeon(R) CPU E5345 @ 2.33GHz +stepping : 7 +microcode : 0x6b +cpu MHz : 2000.000 +cache size : 4096 KB +physical id : 0 +siblings : 4 +core id : 0 +cpu cores : 4 +apicid : 0 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 10 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov +pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant +_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vm +x est tm2 ssse3 cx16 xtpr pdcm dca lahf_lm dtherm tpr_shadow +bogomips : 4655.23 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +(Repeats for 8 cores) + + +Note that this Illumos image is certified/runs cleanly on Intel hardware from the last 5 years when natively on it. I doubt that it is a kernel problem with Illumos with regard to the actual CPU architecture. Older releases that are OpenSolaris based also see the problem. + +Generally speaking, I don't think that an issue of this nature has ever been seen with this OS image on any Intel or AMD CPU ever tested . . . so unless there is something in Illumos that is only triggered by qemu, I find it hard to imagine it being an Illumos bug, but then again, it's not like oddities like this never happen . . . + +And thanks for all the quick attention! If nothing else, it got me to a point whereby I can work around the problem, and not be stuck on older builds that virt-manager hates . . . . + +(Wow . . . that last was incredibly redundant . . . staying up most of the night working on this has apparently left me a bit stupid this morning/afternoon . . . sorry!) + + +So, if it breaks even with -cpu SandyBridge and -cpu host, it is likely to be a KVM or QEMU bug. Thanks for the testing! + +Much appreciated! Please let me know if there is anything else I can do to help this bug progress . . . . 
+ +- Tim + +FWIW there's some other hits on this: + +Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1040500 +Openstack mailing list: http://lists.openstack.org/pipermail/openstack-dev/2014-December/053478.html + +Hello to all, I confirm this bug in qemu. + +12 different Linux versions/distributions and 1 Windows 7 VM are running fine without any networking issue. +Solaris 5.11 Version 11.2 can be installed (text version) and is running but network is broken. + +DHCPOFFER will not be received by Solaris 5.11 VM's (RX not working) for Automatic profile. +If DefaultFixed profile is online there is the same behavior. +Arp table on Solaris containes the own entry which is completed. +If I ping another host, the IP will be added but no MAC, which indicates that also no ARP package will be received. + +I could NOT get it working with disabled x2apic (tested with different CPU types). +Is there something additional which has to be changed? + +qemu version is 2.0.0+dfsg-2ubuntu1.10 @ ubuntu 14.04.2 LTS, Kernel 3.13.0-49-generic. + + + +See also bug #638955 + +See the following bug report for a working Solaris 10 KVM guest configuration: +https://bugzilla.redhat.com/show_bug.cgi?id=1262093 + +#17 +I have the same situtaion +when I use cpu line as "-cpu qemu64,-x2apic" the network still doesn't work. +maybe there is another way to remove x2apic,but I don't get it. +for the arp ,as you say ,there is not MAC. +Have you solve the problem ? + + +host: ubuntu 14.04 +qemu img:openindiana 5.11 + + +any one have a right way ? + +I fixed this by adding the configuration in the xml configuration file: + <cpu mode='custom' match='exact'> + <model fallback='allow'>SandyBridge</model> + <feature policy='disable' name='x2apic'/> + </cpu> + +See also attachement (https://bugzilla.redhat.com/attachment.cgi?id=1072357) of bug https://bugzilla.redhat.com/show_bug.cgi?id=1262093. 
+ +Note that I tested with Solaris 10, not openindiana 5.11 + +On Fedora, I had to use this command to edit the VM config file: +virsh edit <put_here_name_of_your_vm> + +The QEMU project is currently considering moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1405 b/results/classifier/zero-shot/105/semantic/1405 new file mode 100644 index 000000000..f221dd241 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1405 @@ -0,0 +1,134 @@ +semantic: 0.917 +device: 0.914 +assembly: 0.910 +other: 0.905 +graphic: 0.903 +instruction: 0.893 +socket: 0.878 +vnc: 0.868 +boot: 0.847 +network: 0.821 +KVM: 0.792 +mistranslation: 0.643 + +linux-user: calling SYS_set_thread_area and SYS_get_thread_area gives an incorrect result in a multithreaded environment +Description of problem: + +Steps to reproduce: +1. 
Compile test.out by Command and source code: +``` +gcc -m32 -g test.c -lpthread -o test.out +``` +``` +#include <sys/syscall.h> +#include <unistd.h> +#include <stdio.h> +#include <pthread.h> +#include <asm/ldt.h> + +static inline int set_thread_area( struct user_desc *ptr ) +{ + return syscall( SYS_set_thread_area, ptr ); +} + +static inline int get_thread_area( struct user_desc *ptr ) +{ + return syscall( SYS_get_thread_area, ptr ); +} + +static unsigned int entry_number; + +static void* start_routine(void* ptr) +{ + struct user_desc user_desc0 = { entry_number }; + struct user_desc user_desc1 = { entry_number }; + struct user_desc user_desc2 = { entry_number }; + get_thread_area(&user_desc0); + printf("child thread: %u\n", user_desc0.base_addr); + + user_desc1.base_addr = 2; + user_desc1.limit = 0xFFF; + user_desc1.seg_32bit = 1; + set_thread_area( &user_desc1 ); + + get_thread_area(&user_desc2); + printf("child thread: %u\n", user_desc2.base_addr); + return NULL; +} + +int main(void) { + struct user_desc user_desc0 = { -1 }, user_desc1 = { 0 }, user_desc2 = { 0 }; + user_desc0.seg_32bit = 1; + user_desc0.useable = 1; + set_thread_area( &user_desc0 ); + + entry_number = user_desc0.entry_number; + + user_desc1.entry_number = entry_number; + user_desc1.base_addr = 1; + user_desc1.limit = 0xFFF; + user_desc1.seg_32bit = 1; + set_thread_area( &user_desc1 ); + + pthread_t thread_id; + pthread_create(&thread_id, NULL, &start_routine, NULL); + pthread_join(thread_id, NULL); + + user_desc2.entry_number = entry_number; + get_thread_area(&user_desc2); + printf("main thread: %u\n", user_desc2.base_addr); // main thread: 1 + return 0; +} + ``` +2. 
Correct Result: +``` +child thread: 1 +child thread: 2 +main thread: 1 +``` +qemu-i386 Print Result: +``` +child thread: 1 +child thread: 2 +main thread: 2 +``` +Additional information: +Patch to fix the bug: + +https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg02203.html + +CPUX86State::gdt::base on different threads must have different values, but it points to the same memory. +The value of CPUX86State::gdt::base must be copied when a thread is cloned. + +https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/tls.c + +SYS_set_thread_area calls do_set_thread_area in the kernel, which stores user_desc in a different memory area for each thread. tls_array is in thread-local memory. + +``` +static void set_tls_desc(struct task_struct *p, int idx, + const struct user_desc *info, int n) +{ + struct thread_struct *t = &p->thread; + struct desc_struct *desc = &t->tls_array[idx - GDT_ENTRY_TLS_MIN]; + int cpu; + + /* + * We must not get preempted while modifying the TLS. + */ + cpu = get_cpu(); + + while (n-- > 0) { + if (LDT_empty(info) || LDT_zero(info)) + memset(desc, 0, sizeof(*desc)); + else + fill_ldt(desc, info); + ++info; + ++desc; + } + + if (t == &current->thread) + load_TLS(t, cpu); + + put_cpu(); +} +``` diff --git a/results/classifier/zero-shot/105/semantic/1407808 b/results/classifier/zero-shot/105/semantic/1407808 new file mode 100644 index 000000000..f9a95f247 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1407808 @@ -0,0 +1,32 @@ +semantic: 0.800 +graphic: 0.736 +instruction: 0.679 +device: 0.624 +other: 0.565 +network: 0.475 +mistranslation: 0.384 +socket: 0.363 +vnc: 0.341 +boot: 0.289 +assembly: 0.200 +KVM: 0.158 + +virtual console gives strange response to ANSI DSR + +With "-serial vc" (which is the default), qemu makes strange responses to the ANSI DSR escape sequence (\033[6n) which can confuse guests. 
+ +Terminal emulators supporting the ANSI escape sequences usually support the "Device Status Report" escape sequence, \033[6n, to which as a response the terminal injects as input the response \033[n;mR, containing the current cursor position. An application running in the guest can use this escape sequence to, for example, figure out the size of the terminal it is running under, which can be useful as the guest has no other standard way to figure out a "size" for the serial port. + +Unfortunately, it seems that qemu when run with "-serial vc" (which appears to be the default), when qemu gets the \033[6n escape sequence on the serial port, it just responds with a single \033, and that's it! This can confuse an application, could concievably assume that a terminal either supports this escape sequence and injects the correct response (\033[n;mR), or doesn't support it and injects absolutely nothing as input - but not something in between. + +This caused a problem on one shell implementation on OSv that tried to figure out the terminal's size, and had to work around this unexpected behavior (see https://github.com/cloudius-systems/osv/commit/b79223584be40459861d1c12e1cb67e3e49e2a12). + +Looking through old bug tickets... is this still an issue with the latest version of QEMU? Or could we close this ticket nowadays? + + +The bug still very much exists (I tested qemu 4.2.1): +If you don't use "-serial stdio" (or its newer variants), by default Qemu opens a new black "console" to run the application. It is not clear to me exactly which terminal this console is supposed to emulate, but it does seem to support most ANSI escape sequences I tried. However, it supports the ANSI "DSR" (Device Status Report) escape sequence, ESC [ 6 n (see https://en.wikipedia.org/wiki/ANSI_escape_code), incorrectly, just as I reported in the original issue. This is still true today. + +This should be fixed in head-of-git by commit 8eb13bbbac08a, which will be in QEMU 6.0. 
(The underlying bug is that when the GTK front-end tries to send sequences of more than one byte to a UART, it didn't account for UARTs which don't have a FIFO capable of holding the whole sequence at once.) + + diff --git a/results/classifier/zero-shot/105/semantic/1425597 b/results/classifier/zero-shot/105/semantic/1425597 new file mode 100644 index 000000000..a2a884997 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1425597 @@ -0,0 +1,44 @@ +semantic: 0.710 +instruction: 0.624 +other: 0.609 +graphic: 0.586 +device: 0.544 +mistranslation: 0.462 +boot: 0.420 +assembly: 0.411 +vnc: 0.259 +socket: 0.167 +network: 0.167 +KVM: 0.155 + +moving window + changing screen resolution = bug + +Steps to reproduce: +1. Run qemu (sdl) +2. Start moving the window +3. At that moment the virtualized OS should change its screen resolution (for example, when switching from initial qemu screen to grub) + +What I see: +Window size doesn't change, but internal screen resolution changes, so, image scale stops to be 1:1, now I see virtualized OS in wrong scale. + +What I expected to see: +Window size changes so, that it keeps synchronized with internal resolution (as usual) + +This bug preserves at lastest git version at the moment, i. e. 3d30395f7fb3315e4ecf0de4e48790e1326bbd47 + +Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU, using SDL2? Or could we close this ticket nowadays? In case the problem persists, please also specify which host system (Linux? Window manager? Or Windows?) you are using! + +The bug is not present in Qemu 2.12 (version reported by dpkg: qemu-system-x86 1:2.12+dfsg-1+b1). There is similar minor bug instead. The new bug doesn't harm me and I am not even sure is this a bug. + +So, this is info about new minor bug. + +Host is Debian Sid installed today (23 May MSK) with mentioned Qemu 2.12. 
I ran Qemu in X using: + +# qemu-system-x86_64 -m 1024 -daemonize -snapshot -drive file=/dev/sda,format=raw,cache=none -kernel /boot/vmlinuz* -initrd /boot/initrd* -append root=/dev/sda + +Then I started to move the Qemu window using the mouse. I kept the Qemu window grabbed with my mouse while the guest OS booted. At some moment the guest OS (Linux) switched from text mode to framebuffer. Internal screen resolution changed, but the Qemu window didn't change its size (I kept moving the Qemu window all this time), but likely window content didn't scale (this is as opposed to 3d30395f [3d30395f7fb3315e4ecf0de4e48790e1326bbd47] behavior, 3d30395f scaled window content). So text in the window remained concrete (not fluid, as opposed to 3d30395f). Then I tried to resize the Qemu window and the window size immediately became normal, i. e. it started to fit content. + +All this is acceptable for me. I. e. I see nothing wrong when Qemu cannot change its window size while I move the Qemu window + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1428352 b/results/classifier/zero-shot/105/semantic/1428352 new file mode 100644 index 000000000..b63b8cedc --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1428352 @@ -0,0 +1,63 @@ +semantic: 0.788 +other: 0.739 +graphic: 0.673 +mistranslation: 0.665 +instruction: 0.659 +assembly: 0.653 +device: 0.631 +vnc: 0.613 +KVM: 0.595 +network: 0.538 +boot: 0.531 +socket: 0.491 + +SYSRET instruction incorrectly implemented + +The Intel architecture manual states that when returning to user mode, the SYSRET instruction will re-load the stack selector (%ss) from the IA32_STAR model specific register using the following logic: + +SS.Selector <-- (IA32_STAR[63:48]+8) OR 3; (* RPL forced to 3 *) + +Another description of the instruction behavior which shows the same logic in a slightly different form can also be found here: + +http://tptp.cc/mirrors/siyobik.info/instruction/SYSRET.html + +[...] 
+ SS(SEL) = IA32_STAR[63:48] + 8; + SS(PL) = 0x3; +[...] + +In other words, the value of the %ss register is supposed to be loaded from bits 63:48 of the IA32_STAR model-specific register, incremented by 8, and then ORed with 3. ORing in the 3 sets the privilege level to 3 (user). This is done since SYSRET returns to user mode after a system call. + +However, helper_sysret() in target-i386/seg_helper.c does not do the "OR 3" step. The code looks like this: + + cpu_x86_load_seg_cache(env, R_SS, selector + 8, + 0, 0xffffffff, + DESC_G_MASK | DESC_B_MASK | DESC_P_MASK | + DESC_S_MASK | (3 << DESC_DPL_SHIFT) | + DESC_W_MASK | DESC_A_MASK); + +It should look like this: + + cpu_x86_load_seg_cache(env, R_SS, (selector + 8) | 3, + 0, 0xffffffff, + DESC_G_MASK | DESC_B_MASK | DESC_P_MASK | + DESC_S_MASK | (3 << DESC_DPL_SHIFT) | + DESC_W_MASK | DESC_A_MASK); + +The code does correctly set the privilege level bits for the code selector register (%cs) but not for the stack selector (%ss). + +The effect of this is that when SYSRET returns control to the user-mode caller, %ss will be have the privilege level bits cleared. In my case, it went from 0x2b to 0x28. This caused a crash later: when the user-mode code was preempted by an interrupt, and the interrupt handler would do an IRET, a general protection fault would occur because the %ss value being loaded from the exception frame was not valid for user mode. (At least, I think that's what happened.) + +This behavior seems inconsistent with real hardware, and also appears to be wrong with respect to the Intel documentation, so I'm pretty confident in calling this a bug. :) + +Note that this issue seems to have been around for a long time. I discovered it while using QEMU 2.2.0, but I happened to have the sources for QEMU 0.10.5, and the problem is there too (in os_helper.c). I am using FreeBSD/amd64 9.1-RELEASE as my host system, without KVM. + +The fix is fairly simple. I'm attaching a patch which worked for me. 
Using this fix, the code that I'm testing now behaves the same on the QEMU virtual machine as on real hardware. + +- Bill (<email address hidden>) + + + +If I've got that right, this has been fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=ac57622985220de0 + diff --git a/results/classifier/zero-shot/105/semantic/1469 b/results/classifier/zero-shot/105/semantic/1469 new file mode 100644 index 000000000..5163be6b7 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1469 @@ -0,0 +1,61 @@ +semantic: 0.776 +vnc: 0.770 +device: 0.733 +instruction: 0.683 +socket: 0.621 +graphic: 0.601 +KVM: 0.590 +assembly: 0.583 +network: 0.579 +mistranslation: 0.405 +boot: 0.375 +other: 0.149 + +QEMU 7.2.0 - make install fail +Description of problem: +`[10055/10057] Generating docs/QEMU manual with a custom command +[10056/10057] Generating docs/QEMU man pages with a custom command +[10056/10057] Installing files. +Traceback (most recent call last): + File "/home/clive/.local/bin/meson", line 5, in <module> + from mesonbuild.mesonmain import main +ModuleNotFoundError: No module named 'mesonbuild' +FAILED: meson-internal__install +/home/clive/.local/bin/meson install --no-rebuild +ninja: build stopped: subcommand failed. +make: *** [Makefile:165: run-ninja] Error 1 +[clive@localhost build]$ +` +Steps to reproduce: +1. as user in shell +2. `wget https://download.qemu.org/qemu-7.2.0.tar.xz` +2. `tar xvJf qemu-7.2.0.tar.xz` +3. `cd qemu-7.2.0` +4. `./configure` +5. `make install` +Additional information: +installed meson via `pip3 --user` + +`pip3 --list` **Output** `meson version 1.0.0` + +**Using** - python version 3.11.1 + +`ninja-build` installed via package manager `dnf` + +**Using** - ninja-build version 1.8.2 + +Used `dnf builddep` on `ninja-build`, `meson`, and `qemu-kvm` before and after installation confirming I have dependencies. 
+ + File "/home/clive/.local/bin/meson" contains +``` +#!/usr/local/bin/python3.11 +# -*- coding: utf-8 -*- +import re +import sys +from mesonbuild.mesonmain import main +if __name__ == '__main__': + sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) + sys.exit(main()) + + +``` diff --git a/results/classifier/zero-shot/105/semantic/1477683 b/results/classifier/zero-shot/105/semantic/1477683 new file mode 100644 index 000000000..20f21dafb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1477683 @@ -0,0 +1,33 @@ +semantic: 0.831 +graphic: 0.790 +other: 0.700 +device: 0.671 +instruction: 0.668 +mistranslation: 0.644 +network: 0.635 +socket: 0.539 +vnc: 0.501 +boot: 0.485 +KVM: 0.410 +assembly: 0.392 + +FPU in qemu-system-i386 works incorrectly + +FPU bug in qemu-system-i386 makes software which use floating point numbers work incorrectly. For instance, the one included in attachment prints out 0 instead of 2147483648. The same code works ok in qemu-system-x86_64. + +I have this issue in QEMU 2.3.0 on two different x86 guests (Parabola GNU/Linux-libre and libreCMC). + + + +I think, that I have the same issue. After some git bisect, I found out that commit ea32aaf1a72af102b855317b47a22e75ac2965a9 has introduced the problem. Attached is a patch that fixes the issue for me. Maybe you can try this out, too. + +Thanks! That patch solves the issue for me. May I ask maintainer to commit the fix? + +Someone has posted a similar fix a few weeks ago, and it has just been merged. + +Great, thanks for the information. I was just about to send the patch to the mailing list, but this seems to unnecessary now. + +If I've got that right, the fix had been included here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=178846bdd93994c1acaf +... so closing this ticket now. 
+ diff --git a/results/classifier/zero-shot/105/semantic/1478376 b/results/classifier/zero-shot/105/semantic/1478376 new file mode 100644 index 000000000..bbe591c68 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1478376 @@ -0,0 +1,62 @@ +semantic: 0.907 +device: 0.743 +instruction: 0.681 +mistranslation: 0.625 +socket: 0.577 +other: 0.530 +graphic: 0.511 +vnc: 0.458 +network: 0.436 +assembly: 0.371 +boot: 0.283 +KVM: 0.225 + +PL050 KMIDATA register does not reset + +static uint32_t pl050_read(void *opaque, target_phys_addr_t offset){ + ... + case 2: /* KMIDATA */ + if (s->pending) + s->last = ps2_read_data(s->dev); + return s->last; +} + +When the receive queue is empty (s->pending is false), is the KMIDATA register supposed to be reset to 0x00? In the current implementation, the KMIDATA does not reverse its value after interrupt is lowered. + +On 26 July 2015 at 19:16, T-T Yu <email address hidden> wrote: +> Public bug reported: +> +> static uint32_t pl050_read(void *opaque, target_phys_addr_t offset){ +> ... +> case 2: /* KMIDATA */ +> if (s->pending) +> s->last = ps2_read_data(s->dev); +> return s->last; +> } +> +> When the receive queue is empty (s->pending is false), is the KMIDATA +> register supposed to be reset to 0x00? In the current implementation, +> the KMIDATA does not reverse its value after interrupt is lowered. + +The documentation for the PL050: +http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0143c/i1014040.html + +just says that when KMIDATA is read you get the value in the +receive data register. The implication is that if you read +it multiple times you'll just continue to read the same +value it holds until the keyboard sends further data to be +clocked into the register. + +Are you saying that real hardware behaves differently? + +thanks +-- PMM + + +Yes. In our pl050 keyboard controller, the KMIDATA is reset once the receive queue is empty. + +Looking through old bug tickets... 
is this still an issue with the latest version of QEMU? Or could we close this ticket nowadays? + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1497204 b/results/classifier/zero-shot/105/semantic/1497204 new file mode 100644 index 000000000..3baad25ab --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1497204 @@ -0,0 +1,73 @@ +semantic: 0.841 +other: 0.834 +graphic: 0.770 +instruction: 0.709 +device: 0.707 +assembly: 0.683 +network: 0.625 +boot: 0.621 +socket: 0.584 +mistranslation: 0.577 +KVM: 0.534 +vnc: 0.530 + +qemu-system-s390x: no SMP support without KVM + +It seems SMP support is not implemented for s390x target, at least when not running under KVM. There is also no error message when starting qemu, it just fails when the kernel tries to bring up the CPUs: + +$ qemu-system-s390x -nographic -smp 8 -kernel s390x/kernel.debian +[ 0.003309] Initializing cgroup subsys cpuset +[ 0.004183] Initializing cgroup subsys cpu +[ 0.004263] Initializing cgroup subsys cpuacct +[ 0.004493] Linux version 3.16.0-4-s390x (<email address hidden>) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.7-ckt9-2 (2015-04-13) +[ 0.005816] setup: Linux is running under KVM in 64-bit mode +[ 0.007231] setup: Max memory size: 128MB +[ 0.032383] Zone ranges: +[ 0.034115] DMA [mem 0x00000000-0x7fffffff] +[ 0.034652] Normal empty +[ 0.034686] Movable zone start for each node +[ 0.034737] Early memory node ranges +[ 0.034847] node 0: [mem 0x00000000-0x07ffffff] +[ 0.047489] PERCPU: Embedded 12 pages/cpu @0000000007f29000 s17920 r8192 d23040 u49152 +[ 0.049613] Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 32320 +[ 0.049802] Kernel command line: +[ 0.053715] PID hash table entries: 512 (order: 0, 4096 bytes) +[ 0.053993] Dentry cache hash table entries: 16384 (order: 5, 131072 bytes) +[ 0.054330] Inode-cache hash table entries: 8192 (order: 4, 65536 bytes) +[ 0.061216] Memory: 115912K/131072K available (5701K kernel code, 847K rwdata, 2512K rodata, 452K init, 776K bss, 15160K reserved) +[ 0.062432] Write protected kernel read-only data: 0x100000 - 0x905fff +[ 0.068906] Hierarchical RCU implementation. +[ 0.068934] CONFIG_RCU_FANOUT set to non-default value of 32 +[ 0.068953] RCU dyntick-idle grace-period acceleration is enabled. +[ 0.068989] RCU restricting CPUs from NR_CPUS=32 to nr_cpu_ids=9. +[ 0.069045] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=9 +[ 0.070043] NR_IRQS:260 +[ 0.094273] console [ttyS1] enabled +[ 0.095630] pid_max: default: 32768 minimum: 301 +[ 0.097792] Security Framework initialized +[ 0.100624] AppArmor: AppArmor disabled by boot time parameter +[ 0.100677] Yama: disabled by default; enable with sysctl kernel.yama.* +[ 0.102466] Mount-cache hash table entries: 512 (order: 0, 4096 bytes) +[ 0.102556] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes) +[ 0.116828] Initializing cgroup subsys memory +[ 0.117460] Initializing cgroup subsys devices +[ 0.117678] Initializing cgroup subsys freezer +[ 0.118080] Initializing cgroup subsys net_cls +[ 0.118267] Initializing cgroup subsys blkio +[ 0.118393] Initializing cgroup subsys perf_event +[ 0.118477] Initializing cgroup subsys net_prio +[ 0.119176] ftrace: allocating 17140 entries in 67 pages +XXX unknown sigp: 0xb +XXX unknown sigp: 0xb +XXX unknown sigp: 0xb +[...] +XXX unknown sigp: 0xb +[ 0.211835] cpu: 8 configured CPUs, 0 standby CPUs +XXX unknown sigp: 0xb +XXX unknown sigp: 0xb +[endless stream of messages continues until qemu is killed] + +The XXX message is printed by qemu FWIW. 
+ +QEMU v2.11 has now experimental SMP support (see https://wiki.qemu.org/ChangeLog/2.11#s390 ... commit has been made here: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=11b0079cec6b1f46ba76c ). + diff --git a/results/classifier/zero-shot/105/semantic/1497711 b/results/classifier/zero-shot/105/semantic/1497711 new file mode 100644 index 000000000..e725d50ae --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1497711 @@ -0,0 +1,50 @@ +semantic: 0.958 +graphic: 0.803 +device: 0.783 +socket: 0.754 +vnc: 0.750 +mistranslation: 0.747 +network: 0.718 +other: 0.664 +instruction: 0.613 +boot: 0.466 +assembly: 0.412 +KVM: 0.334 + +tests/libqos/ahci.c:745: redundant condition ? + +[qemu/tests/libqos/ahci.c:745]: (style) Redundant condition: props.ncq. '!props.ncq || (props.ncq && props.lba48)' is equivalent to '!props.ncq || props.lba48' + + g_assert(!props->ncq || (props->ncq && props->lba48)); + +On Sun, Sep 20, 2015 at 10:08:49AM -0000, dcb wrote: +> Public bug reported: +> +> [qemu/tests/libqos/ahci.c:745]: (style) Redundant condition: props.ncq. +> '!props.ncq || (props.ncq && props.lba48)' is equivalent to '!props.ncq +> || props.lba48' +> +> g_assert(!props->ncq || (props->ncq && props->lba48)); + +CCing John Snow, AHCI maintainer + + +Fixed in: + +commit 3d937150dce20cb95cbaae99b6fd48dca4261f32 +Author: John Snow <email address hidden> +Date: Mon Oct 5 12:00:55 2015 -0400 + + qtest/ahci: fix redundant assertion + + Fixes https://bugs.launchpad.net/qemu/+bug/1497711 + + (!ncq || (ncq && lba48)) is the same as + (!ncq || lba48). + + The intention is simply: "If a command is NCQ, + it must also be LBA48." 
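The equivalence stated in the commit message can be checked exhaustively, since there are only four input combinations. The following is a small sketch for illustration; the function names are hypothetical and the code is not taken from the QEMU test suite.

```python
from itertools import product


def original_condition(ncq: bool, lba48: bool) -> bool:
    # Form flagged as redundant: !ncq || (ncq && lba48)
    return (not ncq) or (ncq and lba48)


def simplified_condition(ncq: bool, lba48: bool) -> bool:
    # Simplified form: !ncq || lba48
    return (not ncq) or lba48


# Compare both forms over the full truth table.
forms_agree = all(
    original_condition(ncq, lba48) == simplified_condition(ncq, lba48)
    for ncq, lba48 in product([False, True], repeat=2)
)
```

The only combination either form rejects is an NCQ command without LBA48, which matches the stated intention.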
+ + Signed-off-by: John Snow <email address hidden> + Message-id: <email address hidden> + diff --git a/results/classifier/zero-shot/105/semantic/1524546 b/results/classifier/zero-shot/105/semantic/1524546 new file mode 100644 index 000000000..72ec35dc2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1524546 @@ -0,0 +1,63 @@ +semantic: 0.938 +mistranslation: 0.921 +graphic: 0.903 +assembly: 0.895 +socket: 0.887 +boot: 0.885 +instruction: 0.881 +other: 0.877 +device: 0.861 +KVM: 0.824 +vnc: 0.796 +network: 0.772 + +qemu-img rebase error message mentions wrong file name + +While doing 'qemu-img rebase' for linking to a different backing file, if the old backing file does not exist, the command fails. During this failure, the error message shown is misleading. +e.g. qemu-img rebase -b <backing_file> <filename> throws error saying "Could not open <filename>" +Ideally it should "Could not open <old_backing_file>" + +snippet - +[root@oxy-dev ~]# qemu-img info /opt/stack/data/nova/instances/94864a64-ebf8-45e6-a777-615921216a0a/disk +image: /opt/stack/data/nova/instances/94864a64-ebf8-45e6-a777-615921216a0a/disk +file format: qcow2 +virtual size: 20G (21474836480 bytes) +disk size: 174M +cluster_size: 65536 +backing file: /tmp/3559241a79b79ae663ec6e3d9b75d469967b383b +Format specific information: + compat: 1.1 + lazy refcounts: false +[root@oxy-dev ~]# mv /tmp/3559241a79b79ae663ec6e3d9b75d469967b383b /tmp/3559241a79b79ae663ec6e3d9b75d469967b383a +[root@oxy-dev ~]# file !$ +file /tmp/3559241a79b79ae663ec6e3d9b75d469967b383a +/tmp/3559241a79b79ae663ec6e3d9b75d469967b383a: x86 boot sector; partition 1: ID=0x83, active, starthead 32, startsector 2048, 409600 sectors; partition 2: ID=0x8e, starthead 159, startsector 411648, 16365568 sectors, code offset 0xc0 +[root@oxy-dev ~]# file /tmp/3559241a79b79ae663ec6e3d9b75d469967b383b +/tmp/3559241a79b79ae663ec6e3d9b75d469967b383b: cannot open (No such file or directory) +[root@oxy-dev ~]# qemu-img rebase -b 
/tmp/3559241a79b79ae663ec6e3d9b75d469967b383a /opt/stack/data/nova/instances/94864a64-ebf8-45e6-a777-615921216a0a/disk +qemu-img: Could not open '/opt/stack/data/nova/instances/94864a64-ebf8-45e6-a777-615921216a0a/disk': Could not open file: No such file or directory +[root@oxy-dev ~]# +qemu-img version 1.5.3 +OS: RHEL7 - 3.10.0-229 +libvirtd (libvirt) 1.2.8 + +I'm pretty sure this is working as intended. +If you observe the code, qemu-img first opens filename. When qemu opens a file, it needs full access to it's chain of backing files - Hence, if the old backing file does not exist, opening filename fails. + +(Not an active qemu dev, just passing through) + +The full story is that in 2015, qemu probably did not note that it failed to open the overlay (<filename>) because it failed to open the backing file. It does that, now, though: + +$ qemu-img rebase -b new.qcow2 top.qcow2 +qemu-img: Could not open 'top.qcow2': Could not open backing file: Could not open 'base.qcow2': No such file or directory + +So that should be a fix. + +Max + +What to do if the old file has been moved? + +Say the backing file was /path/to/os.qcow2 and it was moved to /new/path/to/os.qcow2. + +How can we tell snapshot.qcow2 to update the backing file from /path/to/os.qcow2 to /new/path/to/os.qcow2? + diff --git a/results/classifier/zero-shot/105/semantic/1528239 b/results/classifier/zero-shot/105/semantic/1528239 new file mode 100644 index 000000000..1bffe90b2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1528239 @@ -0,0 +1,66 @@ +semantic: 0.668 +device: 0.490 +instruction: 0.488 +mistranslation: 0.478 +socket: 0.473 +graphic: 0.465 +network: 0.443 +vnc: 0.425 +assembly: 0.420 +KVM: 0.375 +boot: 0.300 +other: 0.226 + +Unable to debug PIE binaries with QEMU gdb stub. 
+ +The issue occurs on current trunk: + +max@max:~/build/qemu$ cat test.c +#include <stdio.h> + +int main() { + printf("Hello, world!\n"); + return 0; +} + +max@max:~/build/qemu$ gcc test.c -fPIC -pie -o bad.x +max@max:~/build/qemu$ ./x86_64-linux-user/qemu-x86_64 -g 1234 bad.x +............................. + + +max@max:~/build/qemu$ gdb +GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 +........................................................................................ +(gdb) file bad.x +Reading symbols from bad.x...(no debugging symbols found)...done. +(gdb) b main +Breakpoint 1 at 0x779 +(gdb) target remote localhost:1234 +Remote debugging using localhost:1234 +Reading symbols from /lib64/ld-linux-x86-64.so.2...warning: the debug information found in "/lib64/ld-2.19.so" does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch). + +Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done. +done. +Loaded symbols for /lib64/ld-linux-x86-64.so.2 +Error in re-setting breakpoint 1: Cannot access memory at address 0x775 +Error in re-setting breakpoint 1: Cannot access memory at address 0x775 +0x0000004000a042d0 in _start () from /lib64/ld-linux-x86-64.so.2 +(gdb) c +Continuing. +[Inferior 1 (Remote target) exited normally] +(gdb) + + +max@max:~/build/qemu$ cat config.log +# Configured with: '/home/max/src/qemu/configure' '--prefix=/home/max/install/qemu' '--target-list=arm-linux-user,aarch64-linux-user,x86_64-linux-user' '--static' + + +W/O QEMU or -pie flag breakpoint on main works fine. + +GDB server itself actually supports PIE binaries. + +This patch https://<email address hidden>/ fixes the issue. 
+ +Patch has been included in QEMU v4.2: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc12567a53c88d7a91b9 + diff --git a/results/classifier/zero-shot/105/semantic/1528718 b/results/classifier/zero-shot/105/semantic/1528718 new file mode 100644 index 000000000..d271c1b2a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1528718 @@ -0,0 +1,54 @@ +semantic: 0.342 +device: 0.216 +graphic: 0.205 +instruction: 0.179 +socket: 0.141 +other: 0.130 +mistranslation: 0.086 +boot: 0.081 +vnc: 0.067 +network: 0.058 +assembly: 0.048 +KVM: 0.009 + +Initial monitor does not output anything on Windows (MSYS2 binary) + +When running on Windows error messages before the UI is started are not showing up. + +For example when I run: + +qemu-system-i386.exe -L /mingw32/etc/qemu/ -m 20G + +It should display "ram size too large", according to gdb: + +Breakpoint 1, error_report (fmt=fmt@entry=0x71bdf6 <dma_aiocb_info+2426> "ram size too large") at C:/build/mingw/mingw-w64-qemu/src/qemu-2.4.0/util/qemu-error.c:233 + +However the console does never receive that. + +As far as I could find out vfprintf is called, but it doesn't output anything. + +This affects both binary editions (for instance "qemu-system-i386.exe" AND "qemu-system-i386w.exe") + +dumpbin /all "C:\Program Files\qemu\qemu-system-i386.exe"|more +Dump of file C:\Program Files\qemu\qemu-system-i386.exe +PE signature found +File Type: EXECUTABLE IMAGE +FILE HEADER VALUES +8664 machine (x64) + +This affects both binary editions (for instance "qemu-system-i386.exe" AND "qemu-system-i386w.exe") + +dumpbin /all "C:\Program Files\qemu\qemu-system-i386.exe"|more +Dump of file C:\Program Files\qemu\qemu-system-i386.exe +PE signature found +File Type: EXECUTABLE IMAGE +4.00 operating system version +5.02 subsystem version: Win32 + + + +Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? 
+ + +I just retested, it is fixed in the latest version on MSYS2. + diff --git a/results/classifier/zero-shot/105/semantic/1529449 b/results/classifier/zero-shot/105/semantic/1529449 new file mode 100644 index 000000000..3cf0e59e5 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1529449 @@ -0,0 +1,208 @@ +semantic: 0.896 +device: 0.895 +assembly: 0.890 +instruction: 0.887 +other: 0.887 +mistranslation: 0.883 +network: 0.862 +graphic: 0.849 +vnc: 0.838 +boot: 0.819 +socket: 0.812 +KVM: 0.788 + +serial is required for -device nvme + +I am not exactly sure if this is a bug, but I don't see why the option "serial" is required for -device nvme like drive. Truth is it seem to accept random string as its value anyway, if that's the case, couldn't qemu just generate one for it when it's not specified? + +On Mon, Jan 11, 2016 at 05:35:50PM +0100, Markus Armbruster wrote: +> Tom Yan <email address hidden> writes: +> > Public bug reported: +> > +> > I am not exactly sure if this is a bug, but I don't see why the option +> > "serial" should be required for -device nvme like the option "drive". +> > Truth is it seem to accept random string as its value anyway, if that's +> > the case, couldn't qemu just generate one for it when it's not +> > specified? +> +> You should've included a reproducer. Here are mine: +> +> 1. Bad error reporting on missing drive: +> +> $ upstream-qemu -nodefaults -device nvme +> upstream-qemu: -device nvme: Device initialization failed +> +> Expected: error reported like for other devices, e.g. +> +> $ upstream-qemu -nodefaults -device virtio-blk +> upstream-qemu: -device virtio-blk: drive property not set +> +> 2. Bad error reporting on empty drive: +> +> $ upstream-qemu -nodefaults -drive if=none,id=foo -device nvme,drive=foo +> upstream-qemu: -device nvme,drive=foo: Device initialization failed +> +> Expected: error is reported like for other devices, e.g. 
+> +> $ upstream-qemu -nodefaults -drive if=none,id=foo -device virtio-blk,drive=foo +> upstream-qemu: -device virtio-blk,drive=foo: Device needs media, but drive is empty +> +> 3. Bad handling of missing serial: +> +> $ upstream-qemu -nodefaults -drive if=none,id=foo,file=tmp.qcow2 -device nvme,drive=foo +> upstream-qemu: -device nvme,drive=foo: Device initialization failed +> +> Expected: either default the serial number, like some other devices +> do, or a decent error message. +> +> I recommend to convert the device to realize(), and add the missing +> error_setg(). Keith? + +Requiring a serial was a concious choice to push that responsibility +on the user, but I don't see a problem having the code provide default +serial string if the user does not over ride it. + +If you've multiple nvme devices in your guest, creating the same serial +could cause problems with multipathing if they're basing end device +uniqueness on the serial (some do). If we have the code provide the +serial, perhaps it would be best to make each unique. That's easy enough +to append an incrementing number to the end of the serial. + + +Instead of requiring a serial of arbitrary length/format, I think a WWN/EUI-64 is more useful/important, not to mention that the WWN/EUI-64 can pretty much always be used as the serial at the same time. + +Unlike Linux, Windows consider the WWN/EUI-64 as the "serial": + +"C:\Windows\system32>sg_vpd -i PD1 +Device Identification VPD page: + Addressed logical unit: + designator type: SCSI name string, code set: UTF-8 + SCSI name string: + 8086QEMU NVMe Ctrl 00012BDAC262CF831698 + +C:\Windows\system32>sg_vpd -p sn PD1 +Unit serial number VPD page: + Unit serial number: 0000_0000_0000_0000." 
+ +(https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650553/+files/02.PNG) + +UEFI also makes use of the WWN/EUI-64 to generate boot entries for NVMe devices: +https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650554/+files/03.png +https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650555/+files/04.png + + +On 04/28/16 20:07, Tom Yan wrote: +> Instead of requiring a serial of arbitrary length/format, I think a +> WWN/EUI-64 is more useful/important, + +WWN/EUI-64 is not "more important". Section "7.9 Unique Identifier" in +the NVMe spec (Revision 1.2a, October 23, 2015) says that the serial +number is mandatory, while implementing an EUI-64 is optional. Let me +quote it all (emphases mine): + +> 7.9 Unique Identifier +> +> Information is returned in the Identify Controller data structure that +> may be used to construct a unique identifier. Specifically, the PCI +> Vendor ID, *Serial Number*, and Model Number fields when combined +> shall form a globally unique value that identifies the NVM subsystem. +> The mechanism used by the vendor to assign Serial Number and Model +> Number values to ensure uniqueness is *outside the scope* of this +> specification. +> +> An NVM subsystem may contain multiple controllers. All of the +> controllers that make up an NVM subsystem share the same NVM subsystem +> identifier (i.e., PCI Vendor ID, Serial Number, and Model Number). The +> Controller ID (CNTLID) value returned in the Identify Controller data +> structure may be used to uniquely identify a controller within an NVM +> subsystem. The Controller ID value when combined with the NVM +> subsystem identifier forms a globally unique value that identifies the +> controller. The mechanism used by the vendor to assign Controller ID +> values is outside the scope of this specification. +> +> The Identify Namespace data structure contains the IEEE Extended +> Unique Identifier (EUI64) and the Namespace Globally Unique Identifier +> (NGUID) fields. 
EUI64 is an 8-byte EUI-64 identifier and NGUID is a +> 16-byte identifier based on EUI-64. When creating a namespace, the +> controller specifies a globally unique value in the EUI64 or NGUID +> field (the controller may optionally specify a globally unique value +> in both fields). In cases where the 64-bit EUI64 field is unable to +> ensure a globally unique namespace identifier, the EUI64 field shall +> be cleared to 0h. *When not implemented*, these fields contain a value +> of 0h. + +The QEMU device model conforms to this: + +- The serial number is mandatory, and its generation is unspecified. +(First paragraph quoted.) Accordingly, QEMU forces the user to generate +and provide a serial number. + +- The EUI64 is optional (third paragraph); it shall be zero-filled when +not implemented. QEMU conforms. + +> not to mention that the WWN/EUI-64 +> can pretty much always be used as the serial at the same time. +> +> Unlike Linux, Windows consider the WWN/EUI-64 as the "serial": + +That's Windows's problem. Not the first (and not the last) occasion +where Microsoft interpret a specification creatively. + +> "C:\Windows\system32>sg_vpd -i PD1 +> Device Identification VPD page: +> Addressed logical unit: +> designator type: SCSI name string, code set: UTF-8 +> SCSI name string: +> 8086QEMU NVMe Ctrl 00012BDAC262CF831698 +> +> C:\Windows\system32>sg_vpd -p sn PD1 +> Unit serial number VPD page: +> Unit serial number: 0000_0000_0000_0000." 
+> +> (https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650553/+files/02.PNG) +> +> UEFI also makes use of the WWN/EUI-64 to generate boot entries for NVMe devices: +> https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650554/+files/03.png +> https://bugs.launchpad.net/qemu/+bug/1576347/+attachment/4650555/+files/04.png + +The UEFI specification (version 2.6, January 2016) says in "9.3.5.22 NVM +Express namespace messaging device path node": + + Mnemonic: IEEE Extended Unique Identifier + Byte Offset: 8 + Byte Length: 8 + Description: This field contains the IEEE Extended Unique + Identifier (EUI-64). Devices without an EUI-64 value + must initialize this field with a value of 0. + +QEMU conforms. + +The device paths visible on your OVMF screenshots are distinguishable +from each other by their Pci() device path nodes. There is no collision. + +I recommend reviewing the following commits: + +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=a907ec52cc1a +https://github.com/tianocore/edk2/commit/d7c0dfaef26c + +The point being: if QEMU grows a capability to store a nonzero EUI64, +and that EUI64 is reflected in the OpenFirmware device path that is +placed into the "bootorder" fw_cfg file, then OVMF will parse it just +fine. However, QEMU is not required to grow such a capability, according +to the NVMe and UEFI specifications. In practice, multiple NVMe devices +can be distinguished from each other by their different PCI B/D/F locations. + +Thanks, +Laszlo + + +Since both "drive=" and "serial=" expects an arbitrary string (while the value for "drive=" must be unique since it's the "id=" of a "-drive"), why not use the same string from "drive=" as the value of "serial=" when it's not specified explicitly? + +Apparently "-device scsi-hd" has already been doing that (although it does not create the "sn" VPD when a serial is not explicitly specified). 
+ + + + + +No new developments for 4+ years, closing as invalid (I'd prefer "wontfix due to lack of resources", but I'm unable to pick that resolution). + diff --git a/results/classifier/zero-shot/105/semantic/1534 b/results/classifier/zero-shot/105/semantic/1534 new file mode 100644 index 000000000..ae085f23d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1534 @@ -0,0 +1,14 @@ +semantic: 0.305 +device: 0.262 +graphic: 0.248 +boot: 0.218 +instruction: 0.202 +KVM: 0.183 +vnc: 0.161 +network: 0.097 +mistranslation: 0.078 +other: 0.064 +socket: 0.025 +assembly: 0.005 + +usermode emulation warns about features that are system-only (x2apic, tsc-deadline, pcid, invpcid) diff --git a/results/classifier/zero-shot/105/semantic/1555076 b/results/classifier/zero-shot/105/semantic/1555076 new file mode 100644 index 000000000..881556c71 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1555076 @@ -0,0 +1,62 @@ +semantic: 0.719 +graphic: 0.700 +device: 0.692 +mistranslation: 0.686 +other: 0.608 +instruction: 0.578 +vnc: 0.564 +network: 0.528 +socket: 0.525 +assembly: 0.322 +boot: 0.316 +KVM: 0.305 + +Qemu 2.5 dont start with sdl,gl=on or gtk,gl=on + +With this command line + qemu-system-i386 -m 2047 -hda /dev/sda3 -display sdl,gl=on -sdl -vga virtio -cdrom xenial-desktop-i386.iso + + +I get this error: + +ERROR:ui/console-gl.c:95:surface_gl_create_texture: code should not be reached + +The same happens if I use this: + +qemu-system-i386 -m 2047 -hda /dev/sda3 -display gtk,gl=on -sdl -vga virtio -cdrom xenial-desktop-i386.iso +ERROR:ui/console-gl.c:95:surface_gl_create_texture: code should not be reached + + +My OS is Debian Jessie on a P5020 PPC64 with 4GB RAM and a RadeonHD GPU. +Configure reported GL OK, SDL OK, Virtio and Virgl OK. + +My Mesa is 11.3-dev ... the same issue was found on older, stable releases of Mesa. 
+ +OpenGL vendor string: X.Org +OpenGL renderer string: Gallium 0.4 on AMD TURKS (DRM 2.43.0) +OpenGL version string: 2.1 Mesa 11.3.0-devel (git-3146014) +OpenGL shading language version string: 1.30 + + +OpenGL ES profile version string: OpenGL ES 2.0 Mesa 11.3.0-devel (git-3146014) +OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16 + + +Thanks +Luigi + +Is this the same issue as the bug reported here: +https://bugs.launchpad.net/qemu/+bug/1581796 +? + +Sorry T, +I forgot it had already been reported and filed a duplicate bug report. +You can merge or close this one. + +I will check your suggestion and report back. + +Fix has been pulled into the QEMU git repository: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=2c2311c5451f4555e850772 + +Released with version 2.7 + diff --git a/results/classifier/zero-shot/105/semantic/1561 b/results/classifier/zero-shot/105/semantic/1561 new file mode 100644 index 000000000..6b1a6edf4 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1561 @@ -0,0 +1,40 @@ +semantic: 0.751 +graphic: 0.714 +instruction: 0.638 +device: 0.542 +socket: 0.495 +network: 0.471 +mistranslation: 0.378 +vnc: 0.327 +boot: 0.242 +KVM: 0.191 +assembly: 0.166 +other: 0.120 + +Compile QEMU 6.2.0 fail for file not found +Description of problem: +Compiling QEMU failed with this error message: +``` +In file included from ../subprojects/libvhost-user/libvhost-user.c:45: +../subprojects/libvhost-user/libvhost-user.h:23:10: Fatal error:standard-headers/linux/virtio_ring.h:no such file or directory + 23 | #include "standard-headers/linux/virtio_ring.h" + | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +``` +Steps to reproduce: +1. Download the qemu-6.2.0 tarball at https://download.qemu.org/qemu-6.2.0.tar.xz +2. Unpack the tarball to dir ```qemu-6.2.0``` +3. 
cd ```qemu-6.2.0```, and then ```./configure && make -j2``` +Additional information: +In ```qemu-6.2.0/subprojects/libvhost-user/libvhost-user.c:45```, the included files are: + +``` +#include <stdint.h> +#include <stdbool.h> +#include <stddef.h> +#include <poll.h> +#include <linux/vhost.h> +#include <pthread.h> +#include "standard-headers/linux/virtio_ring.h" +``` + +```standard-headers``` is in ```qemu-6.2.0/include/standard-headers/```, but the #include above assumes it is in the same directory as ```libvhost-user.c```. diff --git a/results/classifier/zero-shot/105/semantic/1569 b/results/classifier/zero-shot/105/semantic/1569 new file mode 100644 index 000000000..944cb513e --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1569 @@ -0,0 +1,40 @@ +semantic: 0.856 +vnc: 0.850 +boot: 0.830 +graphic: 0.828 +device: 0.759 +network: 0.617 +socket: 0.599 +assembly: 0.583 +instruction: 0.515 +other: 0.405 +KVM: 0.213 +mistranslation: 0.200 + +NVMe FS operations hang after suspending and resuming both guest and host +Description of problem: +Hello and thank you for your work on QEMU! + +Using the NVMe driver with my Seagate FireCuda 530 2TB M.2 works fine until I encounter this problem, which is reliably reproducible for me. + +When I suspend the guest and then suspend (s2idle) my host, all is well until I resume the guest (manually with `virsh dompmwakeup $VMNAME`, after the host has resumed). Although the guest resumes and is interactive, it seems that anything involving filesystem operations hangs forever and does not return. + +Suspending and resuming the Linux guest seems to work perfectly if I don't suspend/resume the host. + +Ultimately what I'm wanting to do is share the drive between VMs with qemu-storage-daemon. I can reproduce the problem in that scenario in much the same way. Using PCI passthrough with the same VM and device works fine and doesn't exhibit this problem. + +Hopefully that's clear enough - let me know if there's anything else I can provide. 
+Steps to reproduce: +1. Create a VM with a dedicated NVMe disk. +2. Boot an ISO and install to the disk. +3. Verify that suspend and resume works when not suspending the host. +4. Suspend the guest. +5. Suspend the host. +6. Wake the host. +7. Wake the guest. +8. Try just about anything that isn't likely already cached somewhere: `du -s /etc`. +Additional information: +I've attached the libvirt domain XML[1] and libvirtd debug logs for QEMU[2] ("1:qemu") that covers suspending the guest & host, resuming host & guest and doing something to cause a hang. I tried to leave enough time afterwards for any timeout to occur. + +1. [nvme-voidlinux.xml](/uploads/1dea47af096ce58175f7aa526eca455e/nvme-voidlinux.xml) +2. [nvme-qemu-debug.log](/uploads/42d3bed456a795069023a61d38fa5ccd/nvme-qemu-debug.log) diff --git a/results/classifier/zero-shot/105/semantic/1588328 b/results/classifier/zero-shot/105/semantic/1588328 new file mode 100644 index 000000000..d267664af --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1588328 @@ -0,0 +1,870 @@ +semantic: 0.875 +graphic: 0.844 +other: 0.837 +network: 0.818 +boot: 0.803 +mistranslation: 0.782 +device: 0.776 +instruction: 0.769 +socket: 0.742 +assembly: 0.735 +vnc: 0.708 +KVM: 0.586 + +Qemu 2.6 Solaris 9 Sparc Segmentation Fault + +Hi, +I tried the following command to boot Solaris 9 sparc: +qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server + +It seems there are a few Segmentation Faults, one from the starting of the boot. Another at the beginning of the commandline installation. + +Trying 127.0.0.1... +Connected to localhost. +Escape character is '^]'. 
+Configuration device id QEMU version 1 machine id 32 +Probing SBus slot 0 offset 0 +Probing SBus slot 1 offset 0 +Probing SBus slot 2 offset 0 +Probing SBus slot 3 offset 0 +Probing SBus slot 4 offset 0 +Probing SBus slot 5 offset 0 +Invalid FCode start byte +CPUs: 1 x FMI,MB86904 +UUID: 00000000-0000-0000-0000-000000000000 +Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 + Type 'help' for detailed information +Trying cdrom:d... +Not a bootable ELF image +Loading a.out image... +Loaded 7680 bytes +entry point is 0x4000 +bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d + +Jumping to entry point 00004000 for type 00000005... +switching to new context: +SunOS Release 5.9 Version Generic_118558-34 32-bit +Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +Use is subject to license terms. +WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0): + Corrupt label; wrong magic number + +Segmentation Fault +Configuring /dev and /devices +NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88) +audio may not work correctly until it is stopped and restarted +Segmentation Fault +Using RPC Bootparams for network configuration information. +Skipping interface le0 +Searching for configuration file(s)... +Search complete. + +.... + +What type of terminal are you using? + 1) ANSI Standard CRT + 2) DEC VT52 + 3) DEC VT100 + 4) Heathkit 19 + 5) Lear Siegler ADM31 + 6) PC Console + 7) Sun Command Tool + 8) Sun Workstation + 9) Televideo 910 + 10) Televideo 925 + 11) Wyse Model 50 + 12) X Terminal Emulator (xterms) + 13) CDE Terminal Emulator (dtterm) + 14) Other +Type the number of your choice and press Return: 3 +syslog service starting. +savecore: no dump device configured +Running in command line mode +/sbin/disk0_install[109]: 143 Segmentation Fault +/sbin/run_install[130]: 155 Segmentation Fault + +That basically looks like it should work. 
The only time I've seen random segfaults similar to this is with a corrupted disk, so the first question to ask is whether you've verified the ISO image you are using? Unfortunately this isn't an image I currently have access to, but I can report my Solaris 8 32-bit ISO boots and installs fine with no issues here. Does it make any difference if you remove the -hda part of the command line? + +Hi Mark, + +I compared the cksum provided and the dvd cksum that i did, seems to match. I did try out Solaris 8 too. It seems to work fine with sun formatted hda disk. I did try removing the -hda for solaris 9, the problem still persist. I shall find if i can get another solaris 9 image source to try out. + +If you can verify that the media is correct and you still see problems, I'd be interested to take a look if you are able to provide me a copy of the media for debugging. + + +Hi Mark, + +I have uploaded a copy of it to mega.nz + +https://mega.nz/#!94ZVXBra + + +Hi Mark, + +Attached is the new link: https://mega.nz/#!94ZVXBra!8QMsQ2d9eKKkMuawg_0YelfyWTy47CyyD1f6tvSv1bQ + + +Thanks for the test case. It appears that this is a regression that occurred somewhere between 2.5 and 2.6 - bisecting now. + +Can you guys check if the problem persists when qemu is launched with +the -singlestep option? +I think it's in general a good idea always check TCG-related problems +with -singlestep , because it helps to find out whether a bug is in +the optimizer or generator module of TCG. + +Artyom + +On Tue, Jun 14, 2016 at 11:44 PM, Mark Cave-Ayland +<email address hidden> wrote: +> Thanks for the test case. It appears that this is a regression that +> occurred somewhere between 2.5 and 2.6 - bisecting now. +> +> -- +> You received this bug notification because you are a member of qemu- +> devel-ml, which is subscribed to QEMU. 
+> https://bugs.launchpad.net/bugs/1588328 +> +> Title: +> Qemu 2.6 Solaris 9 Sparc Segmentation Fault +> +> Status in QEMU: +> New +> +> Bug description: +> Hi, +> I tried the following command to boot Solaris 9 sparc: +> qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server +> +> It seems there are a few Segmentation Faults, one from the starting of +> the boot. Another at the beginning of the commandline installation. +> +> Trying 127.0.0.1... +> Connected to localhost. +> Escape character is '^]'. +> Configuration device id QEMU version 1 machine id 32 +> Probing SBus slot 0 offset 0 +> Probing SBus slot 1 offset 0 +> Probing SBus slot 2 offset 0 +> Probing SBus slot 3 offset 0 +> Probing SBus slot 4 offset 0 +> Probing SBus slot 5 offset 0 +> Invalid FCode start byte +> CPUs: 1 x FMI,MB86904 +> UUID: 00000000-0000-0000-0000-000000000000 +> Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 +> Type 'help' for detailed information +> Trying cdrom:d... +> Not a bootable ELF image +> Loading a.out image... +> Loaded 7680 bytes +> entry point is 0x4000 +> bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d +> +> Jumping to entry point 00004000 for type 00000005... +> switching to new context: +> SunOS Release 5.9 Version Generic_118558-34 32-bit +> Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +> Use is subject to license terms. +> WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0): +> Corrupt label; wrong magic number +> +> Segmentation Fault +> Configuring /dev and /devices +> NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88) +> audio may not work correctly until it is stopped and restarted +> Segmentation Fault +> Using RPC Bootparams for network configuration information. +> Skipping interface le0 +> Searching for configuration file(s)... 
+> Search complete. +> +> .... +> +> What type of terminal are you using? +> 1) ANSI Standard CRT +> 2) DEC VT52 +> 3) DEC VT100 +> 4) Heathkit 19 +> 5) Lear Siegler ADM31 +> 6) PC Console +> 7) Sun Command Tool +> 8) Sun Workstation +> 9) Televideo 910 +> 10) Televideo 925 +> 11) Wyse Model 50 +> 12) X Terminal Emulator (xterms) +> 13) CDE Terminal Emulator (dtterm) +> 14) Other +> Type the number of your choice and press Return: 3 +> syslog service starting. +> savecore: no dump device configured +> Running in command line mode +> /sbin/disk0_install[109]: 143 Segmentation Fault +> /sbin/run_install[130]: 155 Segmentation Fault +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1588328/+subscriptions +> + + + +-- +Regards, +Artyom Tarasenko + +SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu + + +On 17/06/16 12:42, Artyom Tarasenko wrote: + +> Can you guys check if the problem persists when qemu is launched with +> the -singlestep option? +> I think it's in general a good idea always check TCG-related problems +> with -singlestep , because it helps to find out whether a bug is in +> the optimizer or generator module of TCG. +> +> Artyom + +Hi Artyom, + +I did manage to bisect this down to a single commit in the end: see +http://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg04039.html +for the commit in question. + + +ATB, + +Mark. + + + +Artyom has located the regression and posted a patch here: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07226.html. + +Hi all, + +Thanks for the patch. I just tried, it seems to be not able to find the disk when it try to start the installation. :( + +... + +Please specify the media from which you will install the Solaris Operating +Environment. + +Media: + +1. CD/DVD +2. Network File System +3. HTTP (Flash archive only) +4. FTP (Flash archive only) +5. 
Local Tape (Flash archive only) + + Media [1]: 1 +Reading disc for Solaris Operating Environment... + +The system is being initialized, please wait... / +No Disks found. +Check to make sure disks are cabled and powered up. + + + +I ran all the way through the installer in order to test the patch, so it should be working for you. Is your Spark9.disk labelled? See http://virtuallyfun.superglobalmegacorp.com/2010/10/03/formatting-disks-for-solaris/ for more information on how to do this. + +Hmm.. strange. I did make a new disk went into the setup, then format the disk. After that, i rebooted and start that installation. But, it seems still there is no disk detected. + + Media [1]: 1 +Reading disc for Solaris Operating Environment... + +The system is being initialized, please wait... | +No Disks found. +Check to make sure disks are cabled and powered up. + + Press OK to Exit. + + <Press ENTER to continue/ + +-# format -e +Searching for disks...done + + +AVAILABLE DISK SELECTIONS: + 0. c0t0d0 <drive type unknown> + /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 +Specify disk (enter its number): 0 + + + +AVAILABLE DRIVE TYPES: + 0. Auto configure + 1. Quantum ProDrive 80S + 2. Quantum ProDrive 105S + 3. CDC Wren IV 94171-344 + 4. SUN0104 + 5. SUN0207 + 6. SUN0327 + 7. SUN0340 + 8. SUN0424 + 9. SUN0535 + 10. SUN0669 + 11. SUN1.0G + 12. SUN1.05 + 13. SUN1.3G + 14. SUN2.1G + 15. SUN2.9G + 16. Zip 100 + 17. Zip 250 + 18. 
other +Specify disk type (enter its number): 18 +Enter number of data cylinders: 24620 +Enter number of alternate cylinders[2]: +Enter number of physical cylinders[24622]: +Enter number of heads: 27 +Enter physical number of heads[default]: 107 +Enter number of data sectors/track: 107 +Enter number of physical sectors/track[default]: +Enter rpm of drive[3600]: +Enter format time[default]: +Enter cylinder skew[default]: +Enter track skew[default]: +Enter tracks per zone[default]: +Enter alternate tracks[default]: +Enter alternate sectors[default]: +Enter cache control[default]: +Enter prefetch threshold[default]: +Enter minimum prefetch[default]: +Enter maximum prefetch[default]: +Enter disk type name (remember quotes): Sparc9 +selecting c0t0d0 +[disk formatted] + + +FORMAT MENU: + disk - select a disk + type - select (define) a disk type + partition - select (define) a partition table + current - describe the current disk + format - format and analyze the disk + repair - repair a defective sector + label - write label to the disk + analyze - surface analysis + defect - defect list management + backup - search for backup labels + verify - read and display labels + save - save new disk/partition definitions + inquiry - show vendor, product and revision + scsi - independent SCSI mode selects + cache - enable, disable or query SCSI disk cache + volname - set 8-character volume name + !<cmd> - execute <cmd>, then return + quit +format> label +[0] SMI Label +[1] EFI Label +Specify Label type[0]: 1 +Ready to label disk, continue?y + +format> q + +#reboot +Jun 28 23:37:16 rpcbind: rpcbind terminating on signal. +syncing file systems... done +rebooting... 
+rebooting () +Configuration device id QEMU version 1 machine id 32 +Probing SBus slot 0 offset 0 +Probing SBus slot 1 offset 0 +Probing SBus slot 2 offset 0 +Probing SBus slot 3 offset 0 +Probing SBus slot 4 offset 0 +Probing SBus slot 5 offset 0 +Invalid FCode start byte +CPUs: 1 x FMI,MB86904 +UUID: 00000000-0000-0000-0000-000000000000 +Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 + Type 'help' for detailed information +Trying cdrom:d... +Not a bootable ELF image +Loading a.out image... +Loaded 7680 bytes +entry point is 0x4000 +bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d + +Jumping to entry point 00004000 for type 00000005... +switching to new context: +SunOS Release 5.9 Version Generic_118558-34 32-bit +Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +Use is subject to license terms. +Configuring /dev and /devices +NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88) +audio may not work correctly until it is stopped and restarted + +Please specify the media from which you will install the Solaris Operating +Environment. + +Media: + +1. CD/DVD +2. Network File System +3. HTTP (Flash archive only) +4. FTP (Flash archive only) +5. Local Tape (Flash archive only) + + Media [1]: 1 +Reading disc for Solaris Operating Environment... + +The system is being initialized, please wait... -^[[6|^R +^[[/ +No Disks found. +Check to make sure disks are cabled and powered up. + + Press OK to Exit. + + + +Okay. Can you confirm which version (or git revision) you've used to apply the patch so I can try and reproduce locally? + + +May 11 2016. qemu-2.6.0 from http://wiki.qemu.org/Download + +I've just tried v2.6.0 with the recent ldstub patch applied and it looks from the output above that you're using an incorrect format to put down the disk label. 
I see the following: + +$ ./qemu-system-sparc -cdrom sol-9-905hw-ga-sparc-dvd.iso -hda /home/build/src/qemu/image/sparc32/sol9.qcow2 -boot d -nographic -prom-env 'auto-boot?=false' +Configuration device id QEMU version 1 machine id 32 +Probing SBus slot 0 offset 0 +Probing SBus slot 1 offset 0 +Probing SBus slot 2 offset 0 +Probing SBus slot 3 offset 0 +Probing SBus slot 4 offset 0 +Probing SBus slot 5 offset 0 +Invalid FCode start byte +CPUs: 1 x FMI,MB86904 +UUID: 00000000-0000-0000-0000-000000000000 +Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 + Type 'help' for detailed information + +0 > boot cdrom:d -vs Not a bootable ELF image +Loading a.out image... +Loaded 7680 bytes +entry point is 0x4000 +bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d + +Jumping to entry point 00004000 for type 00000005... +switching to new context: +Size: 0x4624f+0xdaf5+0x1d6a3 Bytes +SunOS Release 5.9 Version Generic_118558-34 32-bit +Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +Use is subject to license terms. +... +... +INIT: SINGLE USER MODE +# format +Searching for disks...WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0): + Corrupt label; wrong magic number + +WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0): + Corrupt label; wrong magic number + +done + + +AVAILABLE DISK SELECTIONS: + 0. c0t0d0 <drive type unknown> + /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 +Specify disk (enter its number): 0 + + + +AVAILABLE DRIVE TYPES: + 0. Auto configure + 1. Quantum ProDrive 80S + 2. Quantum ProDrive 105S + 3. CDC Wren IV 94171-344 + 4. SUN0104 + 5. SUN0207 + 6. SUN0327 + 7. SUN0340 + 8. SUN0424 + 9. SUN0535 + 10. SUN0669 + 11. SUN1.0G + 12. SUN1.05 + 13. SUN1.3G + 14. SUN2.1G + 15. SUN2.9G + 16. Zip 100 + 17. Zip 250 + 18. 
other +Specify disk type (enter its number): 18 +Enter number of data cylinders: 24620 +Enter number of alternate cylinders[2]: +Enter number of physical cylinders[24622]: +Enter number of heads: 27 +Enter physical number of heads[default]: +Enter number of data sectors/track: 107 +Enter number of physical sectors/track[default]: 107 +Enter rpm of drive[3600]: +Enter format time[default]: +Enter cylinder skew[default]: +Enter track skew[default]: +Enter tracks per zone[default]: +Enter alternate tracks[default]: +Enter alternate sectors[default]: +Enter cache control[default]: +Enter prefetch threshold[default]: +Enter minimum prefetch[default]: +Enter maximum prefetch[default]: +Enter disk type name (remember quotes): Sparc9 +selecting c0t0d0 +[disk formatted] + + +FORMAT MENU: + disk - select a disk + type - select (define) a disk type + partition - select (define) a partition table + current - describe the current disk + format - format and analyze the disk + repair - repair a defective sector + label - write label to the disk + analyze - surface analysis + defect - defect list management + backup - search for backup labels + verify - read and display labels + save - save new disk/partition definitions + inquiry - show vendor, product and revision + volname - set 8-character volume name + !<cmd> - execute <cmd>, then return + quit +format> label +Ready to label disk, continue? y + +WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0): + Corrupt label; wrong magic number + +format> label +Ready to label disk, continue? 
y + +format> q + +And then after the reboot: + +$ ./qemu-system-sparc -cdrom sol-9-905hw-ga-sparc-dvd.iso -hda /home/build/src/qemu/image/sparc32/sol9.qcow2 -boot d -nographic +Configuration device id QEMU version 1 machine id 32 +Probing SBus slot 0 offset 0 +Probing SBus slot 1 offset 0 +Probing SBus slot 2 offset 0 +Probing SBus slot 3 offset 0 +Probing SBus slot 4 offset 0 +Probing SBus slot 5 offset 0 +Invalid FCode start byte +CPUs: 1 x FMI,MB86904 +UUID: 00000000-0000-0000-0000-000000000000 +Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 + Type 'help' for detailed information +Trying cdrom:d... +Not a bootable ELF image +Loading a.out image... +Loaded 7680 bytes +entry point is 0x4000 +bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d + +Jumping to entry point 00004000 for type 00000005... +switching to new context: +SunOS Release 5.9 Version Generic_118558-34 32-bit +Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +Use is subject to license terms. +Configuring /dev and /devices +NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88) +audio may not work correctly until it is stopped and restarted +Using RPC Bootparams for network configuration information. +Skipping interface le0 +Searching for configuration file(s)... +Search complete. + +Select a Language + + 0. English + 1. French + 2. German + 3. Italian + 4. Japanese + 5. Korean + 6. Simplified Chinese + 7. Spanish + 8. Swedish + 9. Traditional Chinese + +Please make a choice (0 - 9), or press h or ? for help: 0 + +Select a Locale + + 0. English (C - 7-bit ASCII) + 1. Albania (ISO8859-2) + 2. Australia (ISO8859-1) + 3. Belgium-Flemish (ISO8859-1) + 4. Belgium-Flemish (ISO8859-15 - Euro) + 5. Bosnia (ISO8859-2) + 6. Brazil (ISO8859-1) + 7. Brazil (UTF-8) + 8. Bulgaria (ISO8859-5) + 9. Canada-English (ISO8859-1) + 10. Catalan, Spain (ISO8859-1) + 11. Catalan, Spain (ISO8859-15 - Euro) + 12. 
Croatia (ISO8859-2) + 13. Czech Republic (ISO8859-2) + 14. Denmark (ISO8859-1) + 15. Denmark (ISO8859-15 - Euro) + 16. Egypt (ISO8859-6) + 17. Egypt (UTF-8) + 18. Estonia (ISO8859-15) + +Press Return to show more choices. +Please make a choice (0 - 59), or press h or ? for help: 0 + +What type of terminal are you using? + 1) ANSI Standard CRT + 2) DEC VT52 + 3) DEC VT100 + 4) Heathkit 19 + 5) Lear Siegler ADM31 + 6) PC Console + 7) Sun Command Tool + 8) Sun Workstation + 9) Televideo 910 + 10) Televideo 925 + 11) Wyse Model 50 + 12) X Terminal Emulator (xterms) + 13) CDE Terminal Emulator (dtterm) + 14) Other +Type the number of your choice and press Return: 3 +syslog service starting. +savecore: no dump device configured +Running in command line mode + +Please wait while the system information is loaded... | + +... +... + +Please wait while the system is configured with your settings... + +Scanning system disk information... + +Searching disks for upgradable Solaris root devices... +No Upgradable Solaris root devices were found. + + +Searching for locations to accommodate a temporary copy of the Solaris +installation software. Swap slices are usually erased at reboot, so it is +preferable to place the Solaris installation software on slice labeled swap. + +No swap slices that begin at the first usable cylinder have enough space +to accommodate a temporary copy of the Solaris installation software. + +Using a slice that begins at the first usable cylinder allows the most +flexibility during filesystem layout. If you are doing an initial install and +you are not preserving any filesystems, you can re-partition a disk with the +swap slice starting at the first usable cylinder. + +Would you like to re-partition a disk? [y,n,?,q] y + +The default root disk is /dev/dsk/c0t0d0. +The selected disk will be re-partitioned before the Solaris installation +software is copied to the disk. + +WARNING: ALL INFORMATION ON THE DISK WILL BE ERASED! 
+ + +Do you want to re-partition /dev/dsk/c0t0d0 [y,n,?,q] y + +NOTE: The swap size cannot be changed during file system layout. + + +Enter a swap slice size between 158MB and 34729MB, default = 512MB [?] + +Placing the swap slice at the beginning of the disk will allow the most flexible file system partitioning later in the installation. + +Can the swap slice start at the beginning of the disk [y,n,?,q] y +Confirm Information: + + Disk Slice : /dev/dsk/c0t0d0s1 + Size : 512 MB + Start Cyl. : 0 + +WARNING: ALL INFORMATION ON THE DISK WILL BE ERASED! + + +Is this OK [y,n,?,q] y + +etc. + +Please specify the media from which you will install the Solaris Operating +Environment. + +Media: + +1. CD/DVD +2. Network File System +3. HTTP (Flash archive only) +4. FTP (Flash archive only) +5. Local Tape (Flash archive only) + + Media [1]: +Reading disc for Solaris Operating Environment... + +The system is being initialized, please wait... / + +Sun Microsystems, Inc. +Binary Code License Agreement + +etc. + +Comparing your output with mine I can see two obvious differences: + +1) You are using a different version of Solaris to label the disk in a way that can't be understood by Solaris 9 + +2) You've mistyped the "Physical number of heads" as 107 rather than accepting the default + + +ATB, + +Mark. + + +Hi Mark, + +Thanks a lot. Got it working now. When formatting the label, there are 2 options, SMI and EFI. Once I format it with SMI, it seems to be able to find the disk. + +Great news! FWIW with newer versions of QEMU, including 2.6.0, the framebuffer emulation is good enough to install and run Solaris (including X) without the -nographic/-serial options if you need it. I've also CCd the relevant patch to qemu-stable so it should appear in 2.6.1 also. + +Many thanks for the report! + + +Hi Mark, + +Thanks for the update. It would definitely be nice to have something other than the black screen. Still got a problem though. 
I managed to install Solaris 9, but after I removed the CD-ROM it fails to boot. + +qemu-system-sparc -nographic -monitor null -serial mon:telnet:0.0.0.0:3000,server -hda ./Sparc9.disk -m 256 -net nic,macaddr=52:54:0:12:34:56 -net tap,ifname=tap0,script=no,downscript=no + +Following an article written by Artyom, I added + +# cat >> /a/etc/system + +set scsi_options=0x58 + +^d + +to Solaris 2.6. + +However, I don't think I did this with Solaris 8, and it works fine. + +For Solaris 9, you can set it not to reboot, but once you reach the end of the installation it will reboot when you press Enter. And the Sparc9.disk cannot be booted :( + + + +telnet 0.0.0.0 3000 +Trying 0.0.0.0... +Connected to 0.0.0.0. +Escape character is '^]'. +Configuration device id QEMU version 1 machine id 32 +Probing SBus slot 0 offset 0 +Probing SBus slot 1 offset 0 +Probing SBus slot 2 offset 0 +Probing SBus slot 3 offset 0 +Probing SBus slot 4 offset 0 +Probing SBus slot 5 offset 0 +Invalid FCode start byte +CPUs: 1 x FMI,MB86904 +UUID: 00000000-0000-0000-0000-000000000000 +Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19 + Type 'help' for detailed information +Trying disk... +Not a bootable ELF image +Not a bootable a.out image +No valid state has been set by load or init-program + +0 > + + + +If you use OpenBIOS then you don't explicitly have to set scsi-options since the value can be overridden via the device tree which is exactly what OpenBIOS does. + +Interestingly enough it seems that the default bootloader for Solaris 9 is installed in the slice rather than the root of the disk as per my Solaris 8 installation. Fortunately you can manually boot Solaris 9 from the slice by entering "boot disk:d" at the Forth prompt. + +Based upon this it probably makes sense to add "disk:d" to the bootpath used by OpenBIOS - I'll send a patch through to the OpenBIOS mailing list shortly. + + +Hi Mark, + +I have tried boot disk:d. 
After this + +Not a bootable ELF image +Not a bootable a.out image +No valid state has been set by load or init-program + +0 > boot disk:d No valid state has been set by load or init-program + ok +0 > + + +Somehow I am getting an invalid boot + +It works here as per my post above, so I think the problem is still with the disk label. With the above ISO image, I don't get asked for the type of label which makes me think you are using a newer version of Solaris for labelling than you are for installation. + +Can you re-label the disk using the exact same image used for the installation and see if that makes a difference? + +Hmmm I've just tried a second installation of Solaris 9 with a completely blank disk image and now it appears that "boot disk:a" is correct, i.e. boot from slice a. Not sure yet if this is the correct convention for HDs. + +Hi Mark, + +We are finally in :) + +By the way, how do you figure out which slice it's in? + +From the Solaris 8 DVD onwards, I seem to see 2 disk label options: SMI and EFI. Not sure why you didn't see those. + +0 > boot disk:a Not a bootable ELF image +Loading a.out image... +Loaded 7680 bytes +entry point is 0x4000 +bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0:a + +Jumping to entry point 00004000 for type 00000005... +switching to new context: +SunOS Release 5.9 Version Generic_118558-34 32-bit +Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. +Use is subject to license terms. +configuring IPv4 interfaces: le0. +starting DHCP on primary interface le0 +Hostname: unknown +The system is coming up. Please wait. +checking ufs filesystems +/dev/rdsk/c0t0d0s7: is logging. +starting rpc services: rpcbind done. +syslog service starting. 
+syslogd: line 24: WARNING: loghost could not be resolved +Jul 3 14:06:40 unknown sendmail[239]: My unqualified host name (unknown) unknown; sleeping for retry +Jul 3 14:06:40 unknown sendmail[240]: My unqualified host name (unknown) unknown; sleeping for retry +volume management starting. +Creating new rsa public/private host key pair +Creating new dsa public/private host key pair +Jul 3 14:06:54 unknown snmpXdmid: Error in Adding Row for Subscription Table Entry +Jul 3 14:06:55 unknown snmpXdmid: Failed to add filter to SP for Event delivery +The system is ready. + +unknown console login: root +Password: +Jul 3 14:07:09 unknown login: ROOT LOGIN /dev/console +Sun Microsystems Inc. SunOS 5.9 Generic May 2002 +# ls +bin etc lib opt tmp xfn +cdrom export lost+found platform usr +dev home mnt proc var +devices kernel net sbin vol +# Jul 3 14:07:41 unknown sendmail[240]: unable to qualify my own domain name (unknown) -- using short name +Jul 3 14:07:41 unknown sendmail[240]: [ID 702911 mail.alert] unable to qualify my own domain name (unknown) -- using short name +Jul 3 14:07:46 unknown sendmail[239]: unable to qualify my own domain name (unknown) -- using short name +Jul 3 14:07:46 unknown sendmail[239]: [ID 702911 mail.alert] unable to qualify my own domain name (unknown) -- using short name + + + +Great news! AFAICT it's just convention that the first disk slice is used, and I've also proposed a patch for OpenBIOS to include this in the boot search path in future, hopefully in time for 2.7. 
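On the earlier question of how to work out which slice to name at the Forth prompt: a minimal sketch, assuming OpenBIOS follows the usual OpenBoot convention in which the partition letters a through h stand for Solaris slices 0 through 7 (so "disk:a" is slice 0 and "disk:d" is slice 3). The helper name `boot_arg_for_slice` is made up for illustration and is not part of QEMU or OpenBIOS.

```python
import string


def boot_arg_for_slice(slice_number, device="disk"):
    """Return the boot argument for a Solaris slice, e.g. 0 -> 'disk:a'.

    Assumption: OpenBoot-style partition letters, slice 0 -> ':a' up to
    slice 7 -> ':h'. The 'disk' alias is the one used in the thread above.
    """
    if not 0 <= slice_number <= 7:
        raise ValueError("an SMI (VTOC) label on SPARC has slices 0-7")
    return f"{device}:{string.ascii_lowercase[slice_number]}"


if __name__ == "__main__":
    # The two arguments tried in this thread.
    print("boot", boot_arg_for_slice(0))
    print("boot", boot_arg_for_slice(3))
```

Which slice actually holds the boot block still depends on how the installer laid out the disk, so this only translates a known slice number into the argument to type.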
+ diff --git a/results/classifier/zero-shot/105/semantic/1589564 b/results/classifier/zero-shot/105/semantic/1589564 new file mode 100644 index 000000000..579f5a915 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1589564 @@ -0,0 +1,52 @@ +semantic: 0.724 +device: 0.695 +graphic: 0.693 +network: 0.636 +vnc: 0.624 +socket: 0.588 +other: 0.587 +mistranslation: 0.580 +instruction: 0.460 +KVM: 0.444 +boot: 0.380 +assembly: 0.331 + +qemu/hw/scsi/scsi-disk.c:2741: possible missing break ? + +qemu/hw/scsi/scsi-disk.c:2741] -> [qemu/hw/scsi/scsi-disk.c:2745]: (warning) Variable 'cdb1' is reassigned a value before the old one has been used. 'break;' missing? +qemu/hw/scsi/scsi-disk.c:2742] -> [qemu/hw/scsi/scsi-disk.c:2746]: (warning) Variable 'group_number' is reassigned a value before the old one has been used. 'break;' missing? + +Source code is + + case 1: + /* 10-byte CDB. */ + r->cdb1 = req->cmd.buf[1]; + r->group_number = req->cmd.buf[6]; + case 4: + /* 12-byte CDB. */ + +Also, + +[qemu/hw/scsi/scsi-disk.c:2063]: (warning) %lu in format string (no. 1) requires 'unsigned long' but the argument type is 'signed long'. +[qemu/hw/scsi/scsi-disk.c:2066]: (warning) %lu in format string (no. 1) requires 'unsigned long' but the argument type is 'signed long'. +[qemu/hw/scsi/scsi-disk.c:2069]: (warning) %lu in format string (no. 1) requires 'unsigned long' but the argument type is 'signed long'. +[qemu/hw/scsi/scsi-disk.c:2083]: (warning) %lu in format string (no. 2) requires 'unsigned long' but the argument type is 'signed long'. + +The issue with the missing break has been fixed here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ed45cae39117d41315541 + +I am currently not able to reproduce the problem with the format strings ... how did you get them? Which compiler (and version) did you use? + +>I am currently not able to reproduce the problem with the format strings ... +>how did you get them? Which compiler (and version) did you use? 
+ +I used a static analyser for C & C++ called cppcheck. It is available +from sourceforge. I find it very useful. + +I think gcc might be able to reproduce these problems with one of the +higher warning levels. -Wformat=2 springs to mind, but a check of the gcc +documentation around -Wformat will give more accurate guidance. + +The issue with the format strings should now be fixed, too: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=142c21455bb2416b37f71b + diff --git a/results/classifier/zero-shot/105/semantic/1596009 b/results/classifier/zero-shot/105/semantic/1596009 new file mode 100644 index 000000000..1506b9b11 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1596009 @@ -0,0 +1,53 @@ +semantic: 0.813 +other: 0.800 +graphic: 0.791 +instruction: 0.776 +device: 0.774 +network: 0.693 +socket: 0.672 +vnc: 0.667 +assembly: 0.654 +boot: 0.554 +mistranslation: 0.455 +KVM: 0.122 + +config/build problem due to libncursesw on Xenial + +it happened to me during a build of yocto/bitbake related cross tools. the auto-configuration part titled "SDL probe" for qemu-2.2.0 i found the configuration step failing for the compile_prog routine. actually those test compile went fine but only the test linking failed. + +this was due a reference of the sub-sub-...-included libcaca referenced an initially not installed (hint: check for and report such pre-requisites upfront - might be yocto related) and later on installed by me component of name libncursesw seemingly in its dev variant (i was installing libncursesw5-dev_6.0+20160213-1ubuntu1_amd64.deb). tests on the command line showed that adding the required paths and resources made the test application link nicely. + +a quick hack attempt for the config script resulted in those line: + sdl_libs="$sdl_libs -L/lib/x86_64-linux-gnu -lncursesw" +this allowed me to pass the configuration check nicely. +i am just seeing my full scale compile fail for the same reason multiple times for linking. 
that all should be fixable the same way. + +you might or might not have addressed this in newer versions of your package. but you probably know that setups for embedded targets will sometimes lack behind in their evolution until a sudden (well prepared) some big jump in versions does happen. so i leave the hint here for your reference - for the main reason of this very often spotted message - raised by several main reasons according to public web reports, but not this one until right here and now: + +| ERROR: User requested feature sdl +| configure was not able to find it. +| Install SDL devel + +By the way these lines already have to locations in the configure script +where the first indicates that pkg/sdl/sdl2-config application is not there (=no SDL devel there) +whilst the second indicates that *-config is there but the test compile failed (=devel is broken for some other reason). +This could/should see some improvement as well as this is the first hint on what went wrong - and in the second case you definitely can give the user the quite valueable hint for the log file with the results of the test compile. + +Could you please try to reproduce this problem with the latest release of QEMU (version 2.6)? Thanks! + +This is not an issue with Ubu but a Yocto build issue. In Xenial libsdl1.2-dev depends on libcaca-dev which depends on libcaca0 which depends on ncursesw5. Additionally libcaca.so is looking for symbols in libncursesw5 with variable decorations like `resize_term@NCURSESW_5.3.20021019', yet Yocto builds ncurses-native with the symbol decorations like 'resize_term@Base', thus the symbols aren't found when configure attempts to build a test application to check on SDL. So basically the above is describing a confusing web of who is providing what (host vs. Yocto -native) and what each library looks like internally. + +You can see these details better if you do something like: +bitbake qemu -c devshell +edit configure to add '-x' to the #! 
line +run '../temp/run.do_configure' +after the failure is displayed, cd ../build and run the gcc line from the failure msg + +To work around this you can force use of the host's ncurses library, so do something like this in your build's local.conf: + +ASSUME_PROVIDED += "libsdl-native ncurses-native" + +This is not a Ubu bug, so this issue should be closed. + +Closing according to comment #2. + diff --git a/results/classifier/zero-shot/105/semantic/1603693 b/results/classifier/zero-shot/105/semantic/1603693 new file mode 100644 index 000000000..da656bdf2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1603693 @@ -0,0 +1,141 @@ +semantic: 0.876 +assembly: 0.867 +other: 0.853 +instruction: 0.814 +device: 0.775 +graphic: 0.768 +network: 0.767 +boot: 0.754 +mistranslation: 0.663 +socket: 0.658 +vnc: 0.560 +KVM: 0.556 + +Disks in mptsas1068 scsi controller not seen by linux + +When using the mptsas1068 scsi controller, linux detects the controller itself but not the drives attached to it. Freebsd works. Using a different controller with linux works. VMware with linux works. 
+ +qemu 2.6.50 (v2.6.0-1925-g6b92bbf) +seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support) + +Test script, loosely based off what libvirt runs and the libvirt tests that Paolo Bonzini wrote [1] + +##################### +iso=archlinux-2016.07.01-dual.iso +#iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso +device=mptsas1068 +#device=lsi + +img=empty.img +qemu-img create -f qcow2 $img 1G + +/usr/bin/qemu-system-x86_64 \ +-enable-kvm \ +-m 1024 \ +-boot menu=on \ +-device $device,id=scsi0,bus=pci.0,addr=0x9 \ +-drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \ +-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \ +-drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \ +-device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1 +##################### + +The ISOs can be downloaded from [2] and [3]. + +After booting linux, do "lsblk". /dev/sda should exist. + +After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU HARDDISK" should be mentioned. + +With device=mptsas1068 this fails in linux. + +With device=lsi line it works in both. + +With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only loads modules for mptsas1068, this works. + +I also reproduced this with the debian 8.5 netinstall image, but it insists in making you pick a driver from a list of modules when it fails to mount it, instead of dropping to a shell. + +Arch linux dmesg output snippet (full output attached as arch-linux-dmesg.txt): + +##################### +root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0 +[ 0.000000] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016 +[ 0.000000] Normal empty +[ 0.000000] Preemptible hierarchical RCU implementation. 
+[ 1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249) +[ 1.951581] SCSI subsystem initialized +[ 1.957113] Fusion MPT base driver 3.04.20 +[ 1.957618] Fusion MPT SAS Host driver 3.04.20 +[ 2.281773] scsi host0: ata_piix +[ 2.285372] scsi host1: ata_piix +[ 2.305803] mptbase: ioc0: Initiating bringup +[ 2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator} +[ 2.444390] scsi 0:0:1:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5 +[ 2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=11 +[ 2.507024] sr 0:0:1:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray +[ 2.507274] sr 0:0:1:0: Attached scsi CD-ROM sr0 +##################### + +The controller itself is detected, the disk isn't. + +An early version of this patch [4] said that it was only tested with FreeBSD: + +>Tested with FreeBSD for now. The previous version (before the +>configuration page rewrite) worked with RHEL and Windows guests as well. +> +>TODO: write qtest for (at least) config pages, test Linux and Windows. + +[1]: https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=fc922eb2080a3fa7b24bc8a8b0aabfd394480143 +[2]: https://www.archlinux.org/download +[3]: https://www.freebsd.org/where.html +[4]: https://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06475.html + + + + + +Linux requires that you specify a WWN for the disk (through the wwn property of the scsi-disk device). + +Welp. Yeah now I see it, it was in the test case I linked. Thanks. + +Vmware doesn't seem to need this. 
Seems like it assigns a WWN of 0x5000c293944837df to my disk (not in the vm config files as far as i can see, seems to persist across reboots) + +[ 2.305111] ioc0: LSISAS1068 B0: Capabilities={Initiator} +[ 2.445800] scsi host2: ioc0: LSISAS1068 B0, FwRev=01032920h, Ports=1, MaxQ=128, IRQ=18 +[ 2.447672] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c293944837df +[ 2.448806] scsi 2:0:0:0: Direct-Access VMware, VMware Virtual S 1.0 PQ: 0 ANSI: 2 + +Qemu with the manually specified WWN, for reference: + +[ 3.656894] ioc0: LSISAS1068 A0: Capabilities={Initiator} +[ 3.790680] scsi host0: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=10 +[ 3.792232] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c50015ea71ac +[ 3.792476] scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5 + +Also vmware doesn't populate /dev/disk/by-id/wwn-*: + +# ls /dev/disk/by-id +ata-VMware_Virtual_IDE_CDROM_Drive_00000000000000000001@ dm-name-arch_airootfs@ + +Qemu: + +# ls /dev/disk/by-id +ata-QEMU_DVD-ROM_QM00002@ scsi-35000c50015ea71ac@ scsi-35000c50015ea71ac-part2@ wwn-0x5000c50015ea71ac@ wwn-0x5000c50015ea71ac-part2@ +dm-name-arch_airootfs@ scsi-35000c50015ea71ac-part1@ scsi-35000c50015ea71ac-part3@ wwn-0x5000c50015ea71ac-part1@ wwn-0x5000c50015ea71ac-part3@ + + +Not directly related: after getting the arch iso cd to boot, I found that the VM that I actually wanted to get working uses mptspi instead of mptsas. So I didn't even need this controller... + +The non-working vmware config says `scsi0.virtualDev = "lsilogic"` (that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`. + +Is it correct to say that the LSI53C1030 parts of [1] were never applied? 
+ +[1]: http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01608.html + +> The non-working vmware config says `scsi0.virtualDev = "lsilogic"` +> (that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas +> tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`. +> +> Is it correct to say that the LSI53C1030 parts of [1] were never applied? + +Yes, that's correct. The patch you linked was almost entirely rewritten. + diff --git a/results/classifier/zero-shot/105/semantic/1617114 b/results/classifier/zero-shot/105/semantic/1617114 new file mode 100644 index 000000000..c0cc431cc --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1617114 @@ -0,0 +1,61 @@ +semantic: 0.776 +graphic: 0.680 +mistranslation: 0.661 +instruction: 0.646 +other: 0.640 +network: 0.541 +device: 0.525 +boot: 0.453 +socket: 0.407 +KVM: 0.388 +vnc: 0.375 +assembly: 0.310 + +Qemu 2.6.0 freezes with windows guests + +When launching qemu with the same command line as before 2.6.0, with SDL display, with virtio, for a win-10 guest: + +qemu-system-x86_64 -enable-kvm -name win-10 -machine type=pc,accel=kvm -cpu host -smp cores=1,threads=2,sockets=1 -m 2.7G -balloon virtio -drive file=/home/<username>/.qemu/imgs/win10-coe.qcow2,index=0,media=disk,if=virtio -drive file=/usr/share/virtio/virtio-win.iso,index=1,media=cdrom -drive file=/usr/share/spice-guest-tools/spice-guest-tools.iso,index=2,media=cdrom -net nic,model=virtio -net tap,ifname=tap0,script=no,downscript=no,vhost=on -usbdevice tablet -usb -display sdl -vga qxl -soundhw ac97 -rtc base=localtime -usbdevice host:0b0e:0032 -usbdevice host:0b0e:0348 -usbdevice host:0b0e:0410 + +Qemu at some point just freezes with no error message at all with newer version 2.6.0-1. + +Reverting to prior version 2.5.1-1, things go back to normal. 
+ +A simple way to accelerate the freeze is to have qemu launch in a workspace/desktop, and then move to a different workspace/desktop, and then move back to the qemu workspace/desktop, and you'll find out it's frozen. + +BTW, there's no way to get into qemu monitor mode terminal at all once frozen. The monitor terminal shows up, but does nothing... + +Perhaps it's useful to notice that I have up to date win-10 virtio drivers for ethernet, scsi/storage, qxl-dod, balloon, and serial interface drivers. The ISO version used is 0.1.118.1 (virtio-win AUR package). + +Using the standard (std) qemu video driver, rather than the qxl-dod one makes no difference BTW. + +Just in case, running on up to date x86-64 Arch, plain qemu command line. + +Can you please try with 2.6.1 or 2.7.0-rc4? + +Tested 2.6.1, fails/freezes the same way, :-( + +Tested as well 2.7.0, and it now fails on windows start with: + +KMODE_EXCEPTION_NOT_HANDLED (viostor.sys) + +Notice 2.5.1 still works just fine. + +Qemu 2.8.0 is no better. Actually now win-10 can even boot, getting the light blue window with sad face saying: "Your PC ran into a problem and needs to restart...". Moreover, the qemu monitor mode (alt-2) pops up a frozen useless window, so no way to try reseting... + +More narrowed now, :-) + +With 2.8.0 qemu keeps freezing bad, when used with "-display sdl". However when used with "-display gtk" or "-display none", then it doesn't freeze. + +So it seems "-display sdl" is the one totally breaking windows guest on qemu. + +Notice that if I don't try other displays, then I wouldn't even notice it was just the SDL display. + +If there's no intention on fixing SDL, given other alternatives are available, in particular a GTK one for displaying the graphics output, then I'm OK with a "no fix" for this. + +As a bonus, it seems no display (-display none), with current qxl-dod windows driver from Fedora project, seems to be working fine with spice. That was not working before... 
So getting away from SDL display now. But no sure if this means SDL never again? :-) + +Which version of SDL were you using here? SDL 1.2 or SDL 2.0? If you were using SDL 1.2, could you please try with SDL 2.0 instead? Support for 1.2 has been removed now... + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1629483 b/results/classifier/zero-shot/105/semantic/1629483 new file mode 100644 index 000000000..77cd57e33 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1629483 @@ -0,0 +1,53 @@ +semantic: 0.713 +boot: 0.580 +device: 0.550 +instruction: 0.532 +network: 0.509 +other: 0.507 +graphic: 0.401 +socket: 0.300 +KVM: 0.286 +mistranslation: 0.278 +vnc: 0.233 +assembly: 0.149 + +Build fails on optionrom + +Git pseudo-bisected (focused on optionrom commits) it to this commit. + +commit cdbd727c20ad7aac7797dc8c95e485e1a4c6901b +Author: Richard Henderson <email address hidden> +Date: Thu Jul 7 21:49:36 2016 -0700 + + build: Use $(AS) for optionrom explicitly + + +Build output (non-verbose): + + AS optionrom/linuxboot.o +cpp: fatal error: '-c' is not a valid option to the preprocessor +compilation terminated. +cpp: fatal error: '-c' is not a valid option to the preprocessor +compilation terminated. + CC optionrom/linuxboot_dma.o + CC /home/bkamath/dev/workspace/block-2/mothra/output/sp0/targetqga/main.o + AS optionrom/kvmvapic.o +cpp: fatal error: '-c' is not a valid option to the preprocessor +compilation terminated. + +Steps to reproduce: +Using buildroot and overriding qemu version to 2.7.0 +Fedora 24, cpp (GCC) 6.2.1 20160916 (Red Hat 6.2.1-2) + +I tried first just building without the -c option but it hangs indefinitely. Reverting the above listed commit fixes the problem on my platform. I didn't dive much further into this, because this seems like a regression. + +I am seeing the same problem. Cross compiling QEMU 2.7 using buildroot get fatal error -c is not a valid option. 
As Benjamin states removing the -c flag from Makefile gets through the compile, but when booting a virtual image of Ubuntu 16.04 the network does not come up (console is live and you can login through the console, but the only network interface is loopback) I have not diagnosed further. + +I was not able to simply back out the optionrom commit Benjamin cites... caused problems elsewhere, perhaps because I was not doing it right. Reverting to QEMU 2.6.2 does work. + +David + +Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1670377 b/results/classifier/zero-shot/105/semantic/1670377 new file mode 100644 index 000000000..b9d41afaf --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1670377 @@ -0,0 +1,112 @@ +semantic: 0.751 +graphic: 0.735 +instruction: 0.719 +assembly: 0.707 +device: 0.683 +network: 0.681 +other: 0.665 +vnc: 0.624 +boot: 0.513 +KVM: 0.505 +socket: 0.486 +mistranslation: 0.437 + + VNC: short read for zlre data/RDR EndOfStream + +In openQA we have a custom VNC client (https://github.com/os-autoinst/os-autoinst/tree/master/consoles), which connects to QEMU guest and from there performs actions (sends keys, handles pointer, ...). We have several backends (https://github.com/os-autoinst/os-autoinst/tree/master/backend). With qemu backend we start QEMU guest *locally* on openQA worker which connects to it via VNC and sends commands. That works fine. + +However, with svirt backend we start QEMU on a KVM or Xen host and then connect to it remotely from openQA worker - the guest and worker are different systems. In this scenario fairly often happens that while system operates in Grub2, QEMU stops sending data via VNC: + +... 
+15:24:15.5341 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:50 called testapi::send_key +15:24:15.5342 27074 <<< testapi::send_key(key='c') +15:24:15.7361 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:51 called testapi::type_string +15:24:15.7362 27074 <<< testapi::type_string(string='gfxmode=1024x768; terminal_output console; terminal_output gfxterm +', max_interval=250, wait_screen_changes=0) +15:24:22.2243 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:53 called testapi::send_key +15:24:22.2244 27074 <<< testapi::send_key(key='esc') +15:24:22.4255 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:79 called testapi::send_key +15:24:22.4256 27074 <<< testapi::send_key(key='e') +15:24:22.6264 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:22.6265 27074 <<< testapi::send_key(key='down') +15:24:22.8273 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:22.8274 27074 <<< testapi::send_key(key='down') +15:24:23.0282 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:23.0283 27074 <<< testapi::send_key(key='down') +DIE short read for zlre data 107132 - 995002 at /usr/lib/os-autoinst/consoles/VNC.pm line 978. + + at /usr/lib/os-autoinst/backend/baseclass.pm line 73. +... + +My observation is that it happens only while in Grub, when resolution happened a short while ago. See attached video and log. + +Prior to QEMU 2.8.0 I was able to reproduce a similar issue with vncviewer. I started QEMU with SLES JeOS image pressed several times a 'down' key in Grub and vncviewer (Tiger VNC 1.6.0 from openSUSE Leap 42.2) crashed with rdr::EndOfStream exception. 
This does not happen with QEMU 2.8.0, but I am still able to reproduce similar issue via openQA. + +/usr/bin/qemu-system-x86_64 -name guest=openQA-SUT-20,debug-threads=on -S -machine pc-i440fx-2.6,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 87535fc1-e693-41b9-813e-834d6fc4cb5a -no-user-config -nodefaults -rtc base=utc -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/openQA-SUT-20.img,format=qcow2,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev user,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:12:34:56,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device virtio-tablet-pci,id=input0,bus=pci.0,addr=0x6 -device virtio-keyboard-pci,id=input1,bus=pci.0,addr=0x7 -vnc 0.0.0.0:20,share=force-shared -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on -monitor stdio + +Host: openSUSE Leap 42.2 x86_64 KVM or Xen on x86_64 Intel with QEMU 2.6.0. +Guest: Leap 42.2. + +I can't reproduce the problem with QEMU 2.5.0, but I can with any QEMU version from 2.6 RC1 on. + + + + + +It isn't 100% clear from the info provided, but this is almost certainly fixed in 2.9.0 by + +commit 537848ee62195fc06c328b1cd64f4218f404a7f1 +Author: Michael Tokarev <email address hidden> +Date: Fri Feb 3 12:52:29 2017 +0300 + + vnc: do not disconnect on EAGAIN + + When qemu vnc server is trying to send large update to clients, + there might be a situation when system responds with something + like EAGAIN, indicating that there's no system memory to send + that much data (depending on the network speed, client and server + and what is happening). 
In this case, something like this happens + on qemu side (from strace): + + sendmsg(16, {msg_name(0)=NULL, + msg_iov(1)=[{"\244\"..., 729186}], + msg_controllen=0, msg_flags=0}, 0) = 103950 + sendmsg(16, {msg_name(0)=NULL, + msg_iov(1)=[{"lz\346"..., 1559618}], + msg_controllen=0, msg_flags=0}, 0) = -1 EAGAIN + sendmsg(-1, {msg_name(0)=NULL, + msg_iov(1)=[{"lz\346"..., 1559618}], + msg_controllen=0, msg_flags=0}, 0) = -1 EBADF + + qemu closes the socket before the retry, and obviously it gets EBADF + when trying to send to -1. + + This is because there WAS a special handling for EAGAIN, but now it doesn't + work anymore, after commit 04d2529da27db512dcbd5e99d0e26d333f16efcc, because + now in all error-like cases we initiate vnc disconnect. + + This change were introduced in qemu 2.6, and caused numerous grief for many + people, resulting in their vnc clients reporting sporadic random disconnects + from vnc server. + + Fix that by doing the disconnect only when necessary, i.e. omitting this + very case of EAGAIN. + + Hopefully the existing condition (comparing with QIO_CHANNEL_ERR_BLOCK) + is sufficient, as the original code (before the above commit) were + checking for other errno values too. + + Apparently there's another (semi?)bug exist somewhere here, since the + code tries to write to fd# -1, it probably should check if the connection + is open before. But this isn't important. + + Signed-off-by: Michael Tokarev <email address hidden> + Reviewed-by: Daniel P. Berrange <email address hidden> + Message-id: <email address hidden> + Fixes: 04d2529da27db512dcbd5e99d0e26d333f16efcc + Cc: Daniel P. 
Berrange <email address hidden> + Cc: Gerd Hoffmann <email address hidden> + Cc: <email address hidden> + Signed-off-by: Gerd Hoffmann <email address hidden> + + diff --git a/results/classifier/zero-shot/105/semantic/1677492 b/results/classifier/zero-shot/105/semantic/1677492 new file mode 100644 index 000000000..cc1d28c30 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1677492 @@ -0,0 +1,62 @@ +semantic: 0.781 +device: 0.713 +graphic: 0.560 +mistranslation: 0.500 +other: 0.469 +vnc: 0.455 +instruction: 0.391 +network: 0.367 +boot: 0.273 +socket: 0.147 +assembly: 0.136 +KVM: 0.054 + +block_set_io_throttle complaints Need exactly one of 'device' and 'id' + +All of sudden, after a qemu update, block_set_io_throttle does not work anymore. + +Full command to QEMU monitor -- + +(qemu) block_set_io_throttle db 0 0 0 0 0 0 +Need exactly one of 'device' and 'id' + +The help text still point to the same old syntax, which no longer works. + +On 03/30/2017 02:14 AM, dE wrote: +> Public bug reported: +> +> All of sudden, after a qemu update, block_set_io_throttle does not work +> anymore. +> +> Full command to QEMU monitor -- +> +> (qemu) block_set_io_throttle db 0 0 0 0 0 0 +> Need exactly one of 'device' and 'id' +> +> The help text still point to the same old syntax, which no longer works. + +Broken in 2.8, fixed here (will be in 2.9): + +commit 3f35c3b166c18043596768448e5d91b5d52f8353 +Author: Eric Blake <email address hidden> +Date: Fri Jan 20 17:03:59 2017 -0600 + + hmp: fix block_set_io_throttle + + Commit 7a9877a made the 'device' parameter to BlockIOThrottle + optional, favoring 'id' instead. But it forgot to update the + HMP usage to set has_device, which makes all attempts to change + throttling via HMP fail with "Need exactly one of 'device' and 'id'" + + CC: <email address hidden> + Signed-off-by: Eric Blake <email address hidden> + Message-Id: <email address hidden> + Reviewed-by: Stefan Hajnoczi <email address hidden> + Signed-off-by: Dr. 
David Alan Gilbert <email address hidden> + +-- +Eric Blake eblake redhat com +1-919-301-3266 +Libvirt virtualization library http://libvirt.org + + + diff --git a/results/classifier/zero-shot/105/semantic/1682681 b/results/classifier/zero-shot/105/semantic/1682681 new file mode 100644 index 000000000..d1411327e --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1682681 @@ -0,0 +1,90 @@ +semantic: 0.770 +graphic: 0.769 +network: 0.747 +assembly: 0.742 +instruction: 0.731 +device: 0.716 +other: 0.708 +socket: 0.642 +boot: 0.638 +mistranslation: 0.637 +vnc: 0.577 +KVM: 0.491 + +qemu 2.5 network model rtl8139 collisions Ubuntu 16.04.2 LTS + +When I use NIC model rtl8139, I have a lot collisions and very low transfer. +I tested that with brctl and Open vSwitch, because I thought that was a vSwitch issue. +When I change NIC model to virtio all works as expect. + +Host: Ubuntu 16.04.2 LTS +Guest: Ubuntu 14.04.5 LTS - affected +Guest: Ubuntu 16.04.2 LTS - not affected + +QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.10) + +Thanks Thomas for reassigning, and hi Bartłomeij, +Btw - I'd recommend very much to user virtio over rtl driver anyway [1], but that is not the point here. + +Thanks for retesting with brctl and moving OVS out of the equation already. +The difference certainly is within the guest drivers for that network card between the 14.04 and 16.04 guest. + +I checked the changes we had in between the respective kernels and there were not that much for the drivers themselves at least. Mostly bug fixes and while it could be anything else in the kernels this is certainly worth a quick test. There is one in particular which could be interesting that enabled TSO offloading by default. +You can check in your guests with + $ ethtool -k <device> +what the offloads currently are. +Please check if more differ than just TSO (usually the list grows the newer things get). 
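The comparison of offload settings between the two guests can be done mechanically rather than by eye. A minimal sketch, assuming the usual "feature: on|off [fixed]" line format printed by ethtool -k; the function names and the sample listings are illustrative and are not taken from the affected guests.

```python
def parse_offloads(text):
    """Parse `ethtool -k <device>` output into a {feature: state} dict."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not value.strip():
            continue  # header line ("Features for eth0:") or blank line
        features[name] = value.split()[0]  # drop a trailing "[fixed]"
    return features


def diff_offloads(old_text, new_text):
    """Return {feature: (old_state, new_state)} for every setting that differs."""
    old, new = parse_offloads(old_text), parse_offloads(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    # Hypothetical listings; paste the real output of each guest instead.
    guest_a = "Features for eth0:\nscatter-gather: off\ntcp-segmentation-offload: off\nrx-checksumming: off [fixed]\n"
    guest_b = "Features for eth0:\nscatter-gather: on\ntcp-segmentation-offload: on\nrx-checksumming: off [fixed]\n"
    for feature, (a, b) in diff_offloads(guest_a, guest_b).items():
        print(f"{feature}: {a} -> {b}")
```

Each feature reported this way is a candidate to toggle one at a time with ethtool -K while re-running the workload.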
+Then on the 16.04 guest modify the config one-by-one to match the one you have seen with the 14.04 guest. +If you happen to find a single offload feature that switches good/bad behavior get back here. + +Furthermore we can exclude other packages here by using the HWE kernels [2]. Could you confirm that the 14.04 guest with the HWE-x kernel booted shows the same bad behavior? +That would exclude anything of 16.04 other than the kernel to cause the difference. + +If so it would be great to further shrink the range we are looking at by trying HWE- +To do so in your case take the 14.04 the guest and install the packages +linux-image-virtual-lts-utopic, linux-image-virtual-lts-vivid, linux-image-virtual-lts-wily, linux-image-virtual-lts-xenial. Then modify the boot loader (or interactively select at the prompt) to boot one after the other and check your results as well as the maybe related offload settings of above. + +Also to better reproduce this could you outline what kind/direction of workload you are testing +- Is it Guest-to-Guest or Traffic from the outside? +- What is the network traffic you are using, can it be in archive net tools or only a custom workload in your setup? + +Summary: +- please verify that the same happens in 14.04 + HWE+x kernel (go on with that setup if it shows the issue) +- please check different HWE level which is the first to show the issue (go on with the oldest of the HWE kernels that show the issue) +- please compare and test different offload settings as outlined above +- please describe your workload more in Details so that we can try to reproduce + +[1]: http://www.linux-kvm.org/page/Tuning_KVM +[2]: https://wiki.ubuntu.com/Kernel/LTSEnablementStack + +A quick check on a Trusty Guest modified from the uvt default of virtio to use rtl8139 and then moving into the kernel that is likely the reason shows me this for a trivial 1 connection duplex iperf streaming load to the Host: + +Release Nettype Kernel - Result +1. 
14.04 virtio 3.13.0-116 - ~11 + 8 GBits/s +2. 14.04 rtl8139 3.13.0-116 - 124 + 824 Mbits/s +3. 14.04 rtl8139 4.4.0-72 - 758 + 703 MBits/s +4. 14.04 rtl8139 4.4.0-72 - 115 + 795 MBits/s + +Notes: +On #2: I already see 13k receive drops here +on #3: I can confirm TSO, GSO, SG and IP Checksum offloads on as expected, they help to speed up my load despite now seeing 26k receive drops +On #4: slow again back to ~14k drops + +Note: disabling offloads via: +$ sudo ethtool -K eth0 tx-tcp-segmentation off +$ sudo ethtool -K eth0 tx-checksum-ipv4 off +$ sudo ethtool -K eth0 tx-scatter-gather off + +There is quite a chance that the generally much better behavior of enabling those offloads for your specific case it is a drawback. In that case please check with the disabling of the offloads and help to clarify the details I asked for. + +Yet overall IMHO - as I stated in my first comment - I'd strongly vote to use the virtio driver and be much faster than any rtl8139 based network would be. + +The affected Ubuntu 14.04.5 LTS has kernel HWE 4.4.0-72-generic + +Usually I'm using virtio but that was first time when I used vSwitch and I would rather start with rtl8139. + +Because that is my production env I need to prepare new VMs. It's take me few days. + + +[Expired for qemu (Ubuntu) because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1689499 b/results/classifier/zero-shot/105/semantic/1689499 new file mode 100644 index 000000000..b8207e021 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1689499 @@ -0,0 +1,100 @@ +semantic: 0.802 +mistranslation: 0.773 +vnc: 0.770 +other: 0.757 +network: 0.750 +KVM: 0.713 +graphic: 0.696 +device: 0.628 +socket: 0.627 +assembly: 0.620 +instruction: 0.616 +boot: 0.601 + +copy-storage-all/inc does not easily converge with load going on + +Hi, +for now this is more a report to discuss than a "bug", but I wanted to be sure if there are things I might overlook. 
+ +I'm regularly testing the qemu's we have in Ubuntu which currently are 2.0, 2.5, 2.6.1, 2.8 plus a bunch of patches. And for all sorts of verification upstream every now and then. + +I recently realized that the migration options around --copy-storage-[all/inc] seem to have got worse at converging on migration. Although it is not a hard commit that is to be found, it just seems more likely to occur the newer the qemu versions is. I assume that is partially due to guest performance optimization that keep it busy. +To a user it appears as a hanging migration being locked up. + +But let me outline what actually happens: +- Setup without shared storage +- Migration using --copy-storage-all/--copy-storage-inc +- Working fine with idle guests +- If the guests is busy the migration does take like forever (1 vCPU that are busy with 1 CPU, 1 memory and one disk hogging processes) +- statistically seems to trigger more likely on newer qemu's (might be a red herring) + +The background workloads are most trivial burners: +- cpu: md5sum /dev/urandom +- memory: stress-ng -m 1 --vm-keep --vm-bytes 256M +- disk: while /bin/true; do dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=100; done + +We are talking about ~1-2 minutes on qemu 2.5 (4 tries x 3 architectures) and 2-10+ hours on >=qemu 2.6.1. + +I say it is likely not a bug, but more a discussion as I can easily avoid hanging via either: +- timeouts (--timeout, ...) to abort or suspend to migrate it +- --auto-converge ( I had only one try, but it seemed to help by slowing down the load generators) + +So you might say "that is all as it should be, and the users can use the further options to mitigate" and I'm all fine with that. In that case the bug still serves as a "searchable" document of some kind for others triggering the same case. But if anything comes to your mind that need better handling around this case lets start to discuss more deeply about it. + +Interesting. 
+That's quite a big difference, so if you could bisect it down it would be interesting to figure out where the change occurred. + +What happens if you just make it a 'disk' workload without the memory stress? + +What network interface (1G/10G etc) are you migrating over and what bandwidth limit have you got set? + +Dave + +On Tue, May 9, 2017 at 12:41 PM, Dr. David Alan Gilbert <<email address hidden> +> wrote: + +> Interesting. +> That's quite a big difference, so if you could bisect it down it would be +> interesting to figure out where the change occurred. +> + +Hi David, +if it turns out to stay reproducible enough I can certainly try somewhen +next week. + +What happens if you just make it a 'disk' workload without the memory +> stress? +> + +Will do so along my checks if it triggers reliably enough for a bisect. + + +> What network interface (1G/10G etc) are you migrating over and what +> bandwidth limit have you got set? +> + +No explicit bandwith limit set, the connection itself is only virtual +(migrating two libvirt/qemu stacks in between lxd containers) so other than +networking overhead this is usually really fast. +I quickly sniffed with iperf on a few of the test hosts and speed was +around 30-120 GBit/s which should qualify as "fast enough". + +I'll get back to you once I found the time to verify reproducibility and +hopefully a bisect. +I beg your pardon as this might need a few days (to free up my tasks as +well a system capable to do so). + + +Ok, I found an hour to set up a test environment. +I already had all the bisect script written until the systems were ready, but unfortunately it is not reproducible enough with my build from git as of today's master - out of 8 tries it showed as less of a slowdown once, and never to the hours of wait that I saw when testing on a wider scope. 
+ +If my regular testing triggers these issues again I'll try to sort out what the difference is between this and my bisect environment (I set the latter up with the scripts that do the former), but until then I'm kind of out of options for now :-/. + + + +I'm "incompleting" myself until I was able to provide more :-/ + +OK, it'll be interesting if you get something repeatable, but sometimes these type of bugs go and hide in a dark corner until someone trips over them in a more critical situation. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1695286 b/results/classifier/zero-shot/105/semantic/1695286 new file mode 100644 index 000000000..d4523998d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1695286 @@ -0,0 +1,37 @@ +semantic: 0.685 +graphic: 0.685 +device: 0.622 +network: 0.576 +boot: 0.512 +other: 0.477 +vnc: 0.472 +mistranslation: 0.451 +instruction: 0.437 +socket: 0.432 +KVM: 0.423 +assembly: 0.339 + +Add multiboot2 support + +multiboot2 is a recent specification that resolves some of the issues of multiboot. Multiboot2 is supported by some tools already (e.g. grub). + +It would be great if one can run OS with multiboot2 using '-kernel' option, similar as it is done now with multiboot images. + +Quick googling shows there is a Debian bug and patch that adds multiboot support https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=621529 Would it be possible to integrate it to QEMU upstream? + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". 
Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + + +Marking new to migrate to the new bug tracker. + +It would be really great to see this in QEMU! + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/389 + + diff --git a/results/classifier/zero-shot/105/semantic/1696180 b/results/classifier/zero-shot/105/semantic/1696180 new file mode 100644 index 000000000..647e4b7d0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1696180 @@ -0,0 +1,68 @@ +semantic: 0.912 +other: 0.910 +assembly: 0.898 +mistranslation: 0.895 +KVM: 0.884 +graphic: 0.876 +instruction: 0.871 +device: 0.852 +vnc: 0.842 +boot: 0.815 +socket: 0.805 +network: 0.775 + +Issues with qemu-img, libgfapi, and encryption at rest + +Hi, + +Encryption-at-rest has been supported for some time now. The client is responsible for encrypting the files with a help of a master key file. 
I have a properly setup environment and everything appears to be working fine but when I use qemu-img to move a file to gluster I get the following: + + +# qemu-img convert -f raw -O raw linux.iso gluster://gluster01/virt0/linux.raw +[2017-06-06 16:52:25.489720] E [mem-pool.c:579:mem_put] (-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f30f7a36d35] -->/lib64/libglusterfs.so.0(+0x59f02) [0x7f30f7a32f02] -->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f30f7a24a60] ) 0-mem-pool: mem-pool ptr is NULL +[2017-06-06 16:52:25.490778] E [mem-pool.c:579:mem_put] (-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f30f7a36d35] -->/lib64/libglusterfs.so.0(+0x59f02) [0x7f30f7a32f02] -->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f30f7a24a60] ) 0-mem-pool: mem-pool ptr is NULL +[2017-06-06 16:52:25.492263] E [mem-pool.c:579:mem_put] (-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f30f7a36d35] -->/lib64/libglusterfs.so.0(+0x59f02) [0x7f30f7a32f02] -->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f30f7a24a60] ) 0-mem-pool: mem-pool ptr is NULL +[2017-06-06 16:52:25.497226] E [mem-pool.c:579:mem_put] (-->/lib64/libglusterfs.so.0(syncop_create+0x44d) [0x7f30f7a3cf5d] -->/lib64/libglusterfs.so.0(+0x59f02) [0x7f30f7a32f02] -->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f30f7a24a60] ) 0-mem-pool: mem-pool ptr is NULL + +On and on until I get this message: + +[2017-06-06 17:00:03.467361] E [MSGID: 108006] [afr-common.c:4409:afr_notify] 0-virt0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. +[2017-06-06 17:00:03.467442] E [MSGID: 108006] [afr-common.c:4409:afr_notify] 0-virt0-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up. + +I asked for help assuming it's a problem with glusterfs and was told it appears qemu-img's implementation of libgfapi doesn't call the xlator function correctly. 
+ +I'm using Fedora 24 with version: + +qemu-img 2.6.2 +glusterfs-api-3.8.12 + +When reporting bugs to the QEMU project, please always try with the latest release first (distros are often not shipping the latest version). So can you please try with the latest release of QEMU (currently version 2.9.0)? + +Just upgraded to 2.9.0 and actually I see a different issue: + +# qemu-img convert -O raw fedora.iso gluster://dalpinfglt04/virt0/fedora6.raw +[2017-06-07 16:52:43.300902] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-virt0-client-2: server 172.19.38.42:49152 has not responded in the last 42 seconds, disconnecting. +[2017-06-07 17:02:44.342745] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f78c3e4fe6d] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f78c3c169a1] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f78c3c16abe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x87)[0x7f78c3c18157] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f78c3c18c28] ))))) 0-virt0-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-06-07 16:52:00.618744 (xid=0x1c) +[2017-06-07 17:02:44.342952] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f78c3e4fe6d] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f78c3c169a1] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f78c3c16abe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x87)[0x7f78c3c18157] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f78c3c18c28] ))))) 0-virt0-client-2: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2017-06-07 16:52:00.618753 (xid=0x1d) +[2017-06-07 17:02:44.343415] E [MSGID: 114031] [client-rpc-fops.c:1593:client3_3_finodelk_cbk] 0-virt0-client-2: remote operation failed [Transport endpoint is not connected] +[2017-06-07 17:08:49.367264] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-virt0-client-3: server 
172.19.38.43:49152 has not responded in the last 42 seconds, disconnecting. +[2017-06-07 17:13:29.969206] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f78c3e4fe6d] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f78c3c169a1] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f78c3c16abe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x87)[0x7f78c3c18157] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f78c3c18c28] ))))) 0-virt0-client-3: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-06-07 17:08:06.371259 (xid=0x22) +[2017-06-07 17:13:29.969250] E [MSGID: 114031] [client-rpc-fops.c:1593:client3_3_finodelk_cbk] 0-virt0-client-3: remote operation failed [Transport endpoint is not connected] +[2017-06-07 17:13:29.969355] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7f78c3e4fe6d] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f78c3c169a1] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f78c3c16abe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x87)[0x7f78c3c18157] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f78c3c18c28] ))))) 0-virt0-client-3: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2017-06-07 17:08:06.371268 (xid=0x23) +[2017-06-07 17:13:29.972665] E [MSGID: 108008] [afr-transaction.c:2619:afr_write_txn_refresh_done] 0-virt0-replicate-1: Failing FSETXATTR on gfid 86042280-9ae1-444f-8342-be4442f82111: split-brain observed. [Input/output error] +[2017-06-07 17:13:29.977821] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-virt0-replicate-1: Failing FGETXATTR on gfid 86042280-9ae1-444f-8342-be4442f82111: split-brain observed. 
[Input/output error] +[2017-06-07 17:13:29.981667] E [MSGID: 114031] [client-rpc-fops.c:1593:client3_3_finodelk_cbk] 0-virt0-client-2: remote operation failed [Invalid argument] +[2017-06-07 17:13:30.157560] E [MSGID: 108006] [afr-common.c:4781:afr_notify] 0-virt0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. +[2017-06-07 17:13:30.157904] E [MSGID: 108006] [afr-common.c:4781:afr_notify] 0-virt0-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up. +qemu-img: gluster://dalpinfglt04/virt0/fedora6.raw: error while converting raw: Could not create image: Transport endpoint is not connected + +The file was created but nothing was written to it. Either way, I don't think encryption-at-rest is tested much with qemu integration. + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/145 + + diff --git a/results/classifier/zero-shot/105/semantic/1701974 b/results/classifier/zero-shot/105/semantic/1701974 new file mode 100644 index 000000000..702d3e1cd --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1701974 @@ -0,0 +1,50 @@ +semantic: 0.404 +instruction: 0.385 +device: 0.378 +vnc: 0.293 +network: 0.247 +graphic: 0.227 +socket: 0.226 +mistranslation: 0.211 +other: 0.196 +boot: 0.150 +KVM: 0.135 +assembly: 0.111 + +pwrite does not work right under qemu-sh4 + +The pwrite system call has no effect when writing to a non-zero file position, in a program running under qemu-sh4 (version 2.9.0). + +How to reproduce: +- Compile the program: + sh4-linux-gnu-gcc-5 -O -Wall -static -o test-pwrite test-pwrite.c +- Set environment variable for using qemu-sh4 (actually not needed, since the program is statically linked here). 
+- ~/inst-qemu/2.9.0/bin/qemu-sh4 test-pwrite + +Expected output: +buf = 01W3456789 + +Actual output: +buf = 0123456789 +test-pwrite.c:56: assertion 'strcmp ("01W3456789",buf) == 0' failed +qemu: uncaught target signal 6 (Aborted) - core dumped + + + + + +In case it matters: My host platform is Linux/x86_64. + +The behaviour in qemu-2.10 is the same as in qemu-2.9. + +This might be related to this fix: + +> https://git.qemu.org/?p=qemu.git;a=commit;h=8bf8e9df4a7d82c7a47cc961c9cdee1615595de0 + +FWIW, if you're interested in sh4, please join #debian-ports on OFTC and subscribe to the debian-superh mailing list. We're doing lots of sh4 development and testing QEMU in Debian. + +Works fine in qemu-2.11: +$ ~/inst-qemu/2.11.0/bin/qemu-sh4 test-pwrite +buf = 01W3456789 + + diff --git a/results/classifier/zero-shot/105/semantic/1704658 b/results/classifier/zero-shot/105/semantic/1704658 new file mode 100644 index 000000000..6ef332c89 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1704658 @@ -0,0 +1,69 @@ +semantic: 0.866 +graphic: 0.865 +assembly: 0.827 +device: 0.819 +instruction: 0.817 +socket: 0.783 +other: 0.774 +network: 0.759 +KVM: 0.756 +boot: 0.739 +vnc: 0.715 +mistranslation: 0.595 + +O_CLOEXEC not handled in dup3 system call in user mode + +In qemu user mode, for hppa and sparc64 targets, the parameter of the dup3 is not passed correctly when it contains the O_CLOEXEC flag. + +When the attached program runs, the expected output is: +errno=9=EBADF + +How to reproduce on hppa: +- Compile the program: hppa-linux-gnu-gcc-5 -O -Wall -static testdup3.c -o testdup3-hppa +- Set environment variables for running qemu-hppa. +- ~/inst-qemu/2.9.0/bin/qemu-hppa testdup3-hppa +errno=22=EINVAL +testdup3.c:54: assertion 'errno == EBADF' failed + +How to reproduce on sparc64: +- Compile the program: sparc64-linux-gnu-gcc-5 -O -Wall -static testdup3.c -o testdup3-sparc64 +- Set environment variables for running qemu-sparc64. 
+- ~/inst-qemu/2.9.0/bin/qemu-sparc64 testdup3-sparc64 +errno=22=EINVAL +testdup3.c:54: assertion 'errno == EBADF' failed + + + + + + + +I see this bug for hppa, sparc64. +I don't see it for m68k, mips, mips64, powerpc, powerpc64. +Most likely because the binary values of O_CLOEXEC on hppa and sparc64 are different than on other platforms. Looking in the glibc source code: + +$ grep -r 'define.*O_CLOEXEC' glibc +glibc/bits/fcntl.h:# define O_CLOEXEC 0x00400000 /* Set close_on_exec. */ +glibc/sysdeps/mach/hurd/bits/fcntl.h:# define O_CLOEXEC 0x00400000 /* Set FD_CLOEXEC. */ +glibc/sysdeps/unix/sysv/linux/sparc/bits/fcntl.h:#define __O_CLOEXEC 0x400000 /* Set close_on_exit. */ +glibc/sysdeps/unix/sysv/linux/bits/fcntl-linux.h:# define __O_CLOEXEC 02000000 +glibc/sysdeps/unix/sysv/linux/bits/fcntl-linux.h:# define O_CLOEXEC __O_CLOEXEC /* Set close_on_exec. */ +glibc/sysdeps/unix/sysv/linux/hppa/bits/fcntl.h:#define __O_CLOEXEC 010000000 /* Set close_on_exec. */ +glibc/sysdeps/unix/sysv/linux/microblaze/bits/fcntl.h:#define __O_CLOEXEC 02000000 /* Set close_on_exec. */ +glibc/sysdeps/unix/sysv/linux/alpha/bits/fcntl.h:#define __O_CLOEXEC 010000000 /* Set close_on_exec. */ +glibc/sysdeps/nacl/bits/fcntl.h:# define O_CLOEXEC 02000000 /* Set close_on_exec. */ + +So, what's missing is probably that the O_CLOEXEC of the target platform gets mapped to O_CLOEXEC of the host platform, during the dup3 system call emulation. + +The behaviour in qemu-2.10 is the same as in qemu-2.9. + +The behaviour in qemu-2.11 is the same as in qemu-2.9. + +Should be fixed by http://patchwork.ozlabs.org/patch/849226/ + + +Fix has been included here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=10fa993aae539fa8d0da1d + +Confirmed: It's fixed in qemu-2.12. 
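The missing target-to-host flag mapping suspected in this report can be illustrated as follows. This is only a sketch of the idea, not the code from the QEMU fix: the O_CLOEXEC values are the ones from the glibc grep above, the helper name target_to_host_dup3_flags is hypothetical, and the host is assumed to be x86_64 Linux, which uses the generic value.

```python
# O_CLOEXEC values copied from the glibc grep in this report.
TARGET_O_CLOEXEC = {
    "generic": 0o2000000,   # sysdeps/unix/sysv/linux/bits/fcntl-linux.h
    "hppa":    0o10000000,  # sysdeps/unix/sysv/linux/hppa/bits/fcntl.h
    "sparc":   0x400000,    # sysdeps/unix/sysv/linux/sparc/bits/fcntl.h
}

# Assumption: the host is x86_64 Linux, which uses the generic value.
HOST_O_CLOEXEC = TARGET_O_CLOEXEC["generic"]


def target_to_host_dup3_flags(target, flags):
    """Translate guest dup3() flags to host flags; None models EINVAL."""
    target_cloexec = TARGET_O_CLOEXEC[target]
    if flags & ~target_cloexec:
        return None  # dup3() accepts no flag other than O_CLOEXEC
    return HOST_O_CLOEXEC if flags & target_cloexec else 0


if __name__ == "__main__":
    for target in ("hppa", "sparc"):
        guest_flag = TARGET_O_CLOEXEC[target]
        host_flag = target_to_host_dup3_flags(target, guest_flag)
        print(f"{target}: guest {guest_flag:#o} -> host {host_flag:#o}")
```

Passed through unchanged, the hppa and sparc values contain bits that are not the host's O_CLOEXEC, which would be consistent with the errno=22=EINVAL seen above instead of the expected EBADF.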
+ diff --git a/results/classifier/zero-shot/105/semantic/1725707 b/results/classifier/zero-shot/105/semantic/1725707 new file mode 100644 index 000000000..07b512760 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1725707 @@ -0,0 +1,97 @@ +semantic: 0.909 +graphic: 0.894 +network: 0.889 +other: 0.879 +device: 0.872 +assembly: 0.864 +vnc: 0.858 +instruction: 0.847 +mistranslation: 0.819 +boot: 0.810 +KVM: 0.788 +socket: 0.745 + +QEMU sends excess VNC data to websockify even when network is poor + +Description of problem +------------------------- +In my latest topic, I reported a bug relate to QEMU's websocket: +https://bugs.launchpad.net/qemu/+bug/1718964 + +It has been fixed but someone mentioned that he met the same problem when using QEMU with a standalone websocket proxy. +That makes me confused because in that scenario QEMU will get a "RAW" VNC connection. +So I did a test and found that there indeed existed some problems. The problem is: + +When the client's network is poor (on a low speed WAN), QEMU still sends a lot of data to the websocket proxy, then the client get stuck. It seems that only QEMU has this problem, other VNC servers works fine. + +Environment +------------------------- +All of the following versions have been tested: + +QEMU: 2.8.1.1 / 2.9.1 / 2.10.1 / master (Up to date) +Host OS: Ubuntu 16.04 Server LTS / CentOS 7 x86_64_1611 +Websocket Proxy: websockify 0.6.0 / 0.7.0 / 0.8.0 / master +VNC Web Client: noVNC 0.5.1 / 0.61 / 0.62 / master +Other VNC Servers: TigerVNC 1.8 / x11vnc 0.9.13 / TightVNC 2.8.8 + +Steps to reproduce: +------------------------- +100% reproducible. + +1. Launch a QEMU instance (No need websocket option): +qemu-system-x86_64 -enable-kvm -m 6G ./win_x64.qcow2 -vnc :0 + +2. Launch websockify on a separate host and connect to QEMU's VNC port + +3. Open VNC Web Client (noVNC/vnc.html) in browser and connect to websockify + +4. Play a video (e.g. 
Watch YouTube) on VM (To produce a lot of frame buffer update) + +5. Limit (e.g. Use NetLimiter) the client inbound bandwidth to 300KB/S (To simulate a low speed WAN) + +6. Then client's output gets stuck(less than 1 fps), the cursor is almost impossible to move + +7. Monitor network traffic on the proxy server + +Current result: +------------------------- +Monitor Downlink/Uplink network traffic on the proxy server +(Refer to the attachments for more details). + +1. Used with QEMU +- D: 5.9 MB/s U: 5.7 MB/s (Client on LAN) +- D: 4.3 MB/s U: 334 KB/s (Client on WAN) + +2. Used with other VNC servers +- D: 5.9 MB/s U: 5.6 MB/s (Client on LAN) +- D: 369 KB/s U: 328 KB/s (Client on WAN) + +It is found that when the client's network is poor, all the VNC servers (tigervnc/x11vnc/tightvnc) +will reduce the VNC data send to websocket proxy (uplink and downlink symmetry), but QEMU never drop any frames and still sends a lot of data to websockify, the client has no capacity to accept so much data, more and more data are accumulated in the websockify, then it crashes. + +Expected results: +------------------------- +When the client's network is poor (WAN), QEMU will reduce the VNC data send to websocket proxy. + + + + + + + +This is nothing specific to websockets AFAIK. Even using regular VNC QEMU doesn't try to dynamically throttle data / quality settings. + +NB, if websockify crashes, then that is a serious flaw in websockify - it shouldn't read an unbounded amount of data from QEMU, if it is unable to send it onto the client. If websockify stopped reading data from QEMU, then QEMU would in turn stop sending it once the TCP buffer was full + + +Reference: +https://github.com/novnc/noVNC/issues/431#issuecomment-71883085 + +QEMU uses many more (30x) operations with much smaller amounts of data than other VNC server, perhaps this leads to the different result. + +The QEMU project is currently considering to move its bug tracking to another system. 
For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1735049 b/results/classifier/zero-shot/105/semantic/1735049 new file mode 100644 index 000000000..30d6be288 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1735049 @@ -0,0 +1,66 @@ +semantic: 0.816 +mistranslation: 0.758 +graphic: 0.750 +other: 0.722 +device: 0.692 +instruction: 0.520 +vnc: 0.470 +socket: 0.434 +boot: 0.411 +network: 0.350 +assembly: 0.280 +KVM: 0.263 + +Need MTTCG support for x86 guests + +MTTCG support is notably absent for x86_64 guests. The last Wiki update on MTTCG was back in 2015, and I am having some difficulty determining the current status of the underlying requirements to enable this feature on x86 hosts. + +For instance, has support for strong-on-weak memory consistency been added into QEMU GIT at this point? + +Thanks! + +Patches are now on the list to enable MTTCG for i386 and x86_64 guests. See v2 here: + +https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00237.html + +I'm hoping these patches will be in the next QEMU release. + +Regarding your last question: +> For instance, has support for strong-on-weak memory consistency been added into QEMU GIT at this point? + +Yes, TCG inserts the appropriate barriers around memory accesses since commit b32dc3370a ("tcg: Implement implicit ordering semantics", 2017-09-05) + + + +This feature is in QEMU v3.1, which was released today. 
+ +See the discussion linked below that says that strong on weak is not actually fully supported yet. + +Is that discussion correct? + +=== + +In short they explained to me that since the host arm64 is a weaker memory order than the guest x86 they disabled mttcg because if they would implement it would slow everything down but the good news is that if the guest is the same memory order it is not disabled and if it is weaker memory order it is not disabled also. + +https://github.com/utmapp/UTM/issues/257#issuecomment-612675960 + +=== + +Right, that's what I figured from the code. So basically the launchpad comment was incorrect. There is no MTTCG support for x86 on ARM64. + +https://github.com/utmapp/UTM/issues/257#issuecomment-612689011 + +Looks like support for this was not fully added; my apologies for closing this bug too early. + +Adding full support for strong-on-weak emulation would be simple, at least when it comes to memory ordering. The slowdown would be huge though, see Figure 12 in http://www.cs.columbia.edu/~cota/pubs/cota_cgo17.pdf (i.e. ~2x hmean overhead for SPEC). + +The good news is that with hardware support this overhead is ~0 (see SAO in that figure). + +The other feature that is not yet implemented in upstream QEMU is the correct emulation of LL/SC, although for most code out there this shouldn't be an issue in practice given that most parallel code relies on cmpxchg, not on LL/SC pairs. + +I'm reopening this bug an Cc'ing a few people who are more familiar with the current code than I am in case I missed anything. + +OK, looks like I cannot reopen the bug, probably because the bug tracker moved to gitlab. 
+ +If you care about this feature, please file a bug over there: https://gitlab.com/qemu-project/qemu/-/issues + diff --git a/results/classifier/zero-shot/105/semantic/1738545 b/results/classifier/zero-shot/105/semantic/1738545 new file mode 100644 index 000000000..a10c9843e --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1738545 @@ -0,0 +1,62 @@ +semantic: 0.656 +other: 0.631 +instruction: 0.607 +graphic: 0.565 +mistranslation: 0.475 +device: 0.311 +vnc: 0.261 +network: 0.235 +boot: 0.229 +socket: 0.197 +assembly: 0.183 +KVM: 0.136 + +Go binaries panic with "mmap errno 9" on qemu-user + +Go binaries panic with "mmap errno 9" on qemu-user. + +root@nofan:/# cat hello.go +package main + +import "fmt" + +func main() { + fmt.Println("hello world") +} +root@nofan:/# gccgo-7 hello.go -o hello +root@nofan:/# ./hello +mmap errno 9 +fatal error: mmap + +runtime stack: +mmap errno 9 +fatal error: mmap +panic during panic + +runtime stack: +mmap errno 9 +fatal error: mmap +stack trace unavailable +root@nofan:/# + +Tested with qemu from git master with Debian unstable for armel. + +Same binaries work fine on real hardware. + +With current QEMU (and in particular with 4.1.0 rc3 or later with commit 5bfce0b74fbd5d5308 that fixes sigaltstack) go binaries work OK. I think we must have fixed this mmap issue at some point between when this bug was reported and now (or possibly the go runtime was made a bit more forgiving of QEMU's eccentricities). + + +Oh, that's interesting. I will verify this and if it indeed works, I will enable Go binaries for sh4 in Debian. + +Thanks a lot for the update! + +I haven't tested sh4 specifically, but arm (subject of this bug report) definitely works, as does arm64. + + +I can confirm that the issue has been resolved on arm. Unfortunately, on sh4, the Go binaries are still crashing, albeit differently now. I verified that they work fine on real sh4 hardware. 
+ +Could you file a separate bug for the sh4 case, then, please (with repro instructions)? + + +I'm marking this bug as "fix released" now since the Arm problem has been fixed. If there is something else to do for sh4, please open a new bug as suggested by Peter. + diff --git a/results/classifier/zero-shot/105/semantic/1740219 b/results/classifier/zero-shot/105/semantic/1740219 new file mode 100644 index 000000000..b516c0545 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1740219 @@ -0,0 +1,192 @@ +semantic: 0.891 +assembly: 0.866 +device: 0.864 +instruction: 0.835 +network: 0.828 +graphic: 0.824 +mistranslation: 0.813 +other: 0.803 +KVM: 0.790 +vnc: 0.784 +socket: 0.777 +boot: 0.737 + +static linux-user ARM emulation has several-second startup time + +static linux-user emulation has several-second startup time + +My problem: I'm a Parabola packager, and I'm updating our +qemu-user-static package from 2.8 to 2.11. With my new +statically-linked 2.11, running `qemu-arm /my/arm-chroot/bin/true` +went from taking 0.006s to 3s! This does not happen with the normal +dynamically linked 2.11, or the old static 2.8. + +What happens is it gets stuck in +`linux-user/elfload.c:init_guest_space()`. What `init_guest_space` +does is map 2 parts of the address space: `[base, base+guest_size]` +and `[base+0xffff0000, base+0xffff0000+page_size]`; where it must find +an acceptable `base`. Its strategy is to `mmap(NULL, guest_size, +...)` decide where the first range is, and then check if that ++0xffff0000 is also available. If it isn't, then it starts trying +`mmap(base, ...)` for the entire address space from low-address to +high-address. + +"Normally," it finds an accaptable `base` within the first 2 tries. +With a static 2.11, it's taking thousands of tries. 
+ +---- + +Now, from my understanding, there are 2 factors working together to +cause that in static 2.11 but not the other builds: + + - 2.11 increased the default `guest_size` from 0xf7000000 to 0xffff0000 + - PIE (and thus ASLR) is disabled for static builds + +For some reason that I don't understand, with the smaller +`guest_size` the initial `mmap(NULL, guest_size, ...)` usually +returns an acceptable address range; but larger `guest_size` makes it +consistently return a block of memory that butts right up against +another already mapped chunk of memory. This isn't just true on the +older builds, it's true with the 2.11 builds if I use the `-R` flag to +shrink the `guest_size` back down to 0xf7000000. That is with +linux-hardened 4.13.13 on x86-64. + +So then, it falls back to crawling the entire address space; so it +tries base=0x00001000. With ASLR, that probably succeeds. But with +ASLR being disabled on static builds, the text segment is at +0x60000000; which does not leave room for the needed +0xffff1000-size block before it. So then it tries base=0x00002000. +And so on, more than 6000 times until it finally gets to and passes +the text segment; calling mmap more than 12000 times. + +---- + +I'm not sure what the fix is. Perhaps try to mmap a contiguous chunk +of size 0xffff1000, then munmap it and then mmap the 2 chunks that we +actually need. The disadvantage to that is that it does not support +the sparse address space that the current algorithm supports for +`guest_size < 0xffff0000`. If `guest_size < 0xffff0000` *and* the big +mmap fails, then it could fall back to a sparse search; though I'm not +sure the current algorithm is a good choice for it, as we see in this +bug. Perhaps it should inspect /proc/self/maps to try to find a +suitable range before ever calling mmap? + +Actually, it seems that the `[base+0xffff0000, base+0xffff0000+page_size]` segment is only mapped on 32-bit ARM. So this is 32-bit ARM-specific. 
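The /proc/self/maps idea floated above could look roughly like this. It is a sketch under stated assumptions, not what QEMU does and not the patch that was later submitted: the helper names parse_maps and find_base are hypothetical, the sample maps text is made up (only the 0x60000000 text segment address comes from this report), and the 1 << 47 upper limit assumes an x86-64 user address space.

```python
PAGE = 0x1000
COMMPAGE_OFFSET = 0xFFFF0000      # offset of the extra ARM page, from the report
NEEDED = COMMPAGE_OFFSET + PAGE   # 0xffff1000 bytes of contiguous space


def parse_maps(text):
    """Return sorted (start, end) pairs from /proc/<pid>/maps-style text."""
    ranges = []
    for line in text.splitlines():
        if not line.strip():
            continue
        span = line.split()[0]  # first field looks like "60000000-60400000"
        start, end = (int(x, 16) for x in span.split("-"))
        ranges.append((start, end))
    return sorted(ranges)


def find_base(ranges, needed=NEEDED, lowest=PAGE, limit=1 << 47):
    """Return the lowest base with `needed` free bytes before `limit`, or None."""
    candidate = lowest
    for start, end in ranges:
        if start - candidate >= needed:
            return candidate          # the gap before this mapping is big enough
        candidate = max(candidate, end)
    return candidate if limit - candidate >= needed else None


# Hypothetical layout of a static, non-PIE binary; sizes are invented.
SAMPLE_MAPS = """\
60000000-60400000 r-xp 00000000 08:01 1234 /usr/bin/qemu-arm-static
60600000-60800000 rw-p 00400000 08:01 1234 /usr/bin/qemu-arm-static
7f0000000000-7f0000021000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    base = find_base(parse_maps(SAMPLE_MAPS))
    print(f"candidate base: {base:#x}")
```

With a layout like the sample, the gap below the text segment is rejected in a single comparison instead of thousands of mmap probes, and the first large enough gap above it is returned. The caller would still have to mmap at that address and handle failure, since the maps file can change between reading it and mapping.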
+ +To have a link to it from here, on the 28th I submitted a patchset to fix this: https://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg05237.html + +From Alistair Buxton (a-j-buxton) on bug 1756807: +I just tested the patch from https://bugs.launchpad.net/qemu/+bug/1740219 and it fixes the problem for me. Specifically I only tried the final patch of the series. + +I duped the bugs onto this one since it is older and has a suggested patch on the ML. + +Added a qemu (Ubuntu) task to further track this, keeping it incomplete there until this is resolved upstream. + +Everything except for the final patch (which has the actual fix) is now applied on the master branch. + +This is now fixed on master, as of 3be2e41b3323169852dca11ffe6ff772c33e5aaa. + +The sha above is the merge, thanks Luke. + +The actual change by you is +commit 2a53535af471f4bee9d6cb5b363746b8d5ed21dd +Author: Luke Shumaker <email address hidden> +Date: Thu Dec 28 13:08:13 2017 -0500 + + linux-user: init_guest_space: Try to make ARM space+commpage continuous + +I'll be away for a week but will then look at taking this fix in. + +@Luke - to check in advance, are there dependent changes post 2.11.1 that are needed for this that you know of? + +I don't believe so. The patchset applies cleanly on 2.11.0, and fixes the issue there. + +Oh, but it's worth noting that patch 1/10 had a mistake in it, which was corrected when applied as 8756e1361d177e91dc6d88f37749b809fd2407fb. + +Back again, +my question was more about whether we are able to take JUST 2a53535af471f4bee9d6cb5b363746b8d5ed21dd without the rest.
+ +We are already in Feature Freeze for Ubuntu 18.04, so we can either + +a) wait for the next release and pick it up in full with the new qemu version (well, we will do that anyway) + +b) identify a fix-only patch (without all the cleanup and rework) that will be good for the 2.11.1 in Bionic + +Especially being "just slow" but not broken makes it harder to consider the closer we get to release (I hate that as well being a performance engineer, but minimizing regressions is a target as well :-) ). +Essentially, to some extent, being in feature freeze is as if we are under [1] already. + +So will 2a53535af471f4bee9d6cb5b363746b8d5ed21dd alone be good in your opinion? +Or will it need more, and if so, what would be the minimal set of your changes? + + +[1]: https://wiki.ubuntu.com/StableReleaseUpdates + +Yes, I believe that 2a53535af471f4bee9d6cb5b363746b8d5ed21dd alone is good. + +Considering 2.12-rcX a release, I set the upstream status to that. + +We don't generally mark bugs 'fix released' until the final (non-rc) release is made. + + +I wasn't sure if you'd usually take the interim step to "Fix Committed", thanks Peter. + +For Ubuntu: PPA: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3225 + +Regression test against ppa looked good tonight. + +There are new changes which I need to add for two more bugs. +But testing from the ppa is ok right now already. + +@Luke: Please test against this PPA, as I want to ensure it is working for your case before pushing to Bionic. + +I'm not on a Debian/Ubuntu-ish system, but extracting + + qemu-user-static_2.11+dfsg-1ubuntu6~ppa3_amd64.deb : data.tar.xz : usr/bin/qemu-arm-static + +and testing with that binary: + + $ time usr/bin/qemu-arm-static /var/lib/archbuild/dbscripts@armv7h/luke/usr/bin/ldconfig --help + Usage: ldconfig [OPTION...] + ... + <https://github.com/archlinuxarm/PKGBUILDs/issues>. + + real 0m0.068s + user 0m0.067s + sys 0m0.000s + +That is: LGTM. + +Thanks Luke.
+I tried the same from the deb of libc for arm in bionic. + +Down from +real 0m2.031s +to +real 0m0.002s + +So confirmed as well. + +This bug was fixed in the package qemu - 1:2.11+dfsg-1ubuntu6 + +--------------- +qemu (1:2.11+dfsg-1ubuntu6) bionic; urgency=medium + + * Remove LP: 1752026 changes to d/p/ubuntu/define-ubuntu-machine-types.patch. + The Kernel fixes are preferred and already committed to the kernel. + Therefore remove the default disabling of the HTM feature (LP: #1761175) + * d/p/ubuntu/lp1739665-SSE-AVX-AVX512-cpu-features.patch: Enable new + SSE/AVX/AVX512 cpu features (LP: #1739665) + * d/p/ubuntu/lp1740219-continuous-space-commpage.patch: make Arm + space+commpage continuous which avoids long startup times on + qemu-user-static (LP: #1740219) + * d/p/ubuntu/lp-1761372-*: provide pseries-bionic-2.11-sxxm type as + convenience with all meltdown/spectre workarounds enabled by default. + This is not the default type following upstream and x86 on that. + (LP: #1761372). + * d/p/ubuntu/lp-1704312-1-* provide means to manually handle filesystem-dax + with pmem by backporting align and unarmed options (LP: #1704312). + * d/p/ubuntu/lp-1762315-slirp-Add-domainname.patch: slirp: Add domainname + option to slirp's DHCP server (LP: #1762315) + + -- Christian Ehrhardt <email address hidden> Wed, 04 Apr 2018 15:16:07 +0200 + diff --git a/results/classifier/zero-shot/105/semantic/1743191 b/results/classifier/zero-shot/105/semantic/1743191 new file mode 100644 index 000000000..436643540 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1743191 @@ -0,0 +1,490 @@ +semantic: 0.930 +assembly: 0.917 +device: 0.915 +instruction: 0.906 +boot: 0.897 +other: 0.893 +vnc: 0.864 +mistranslation: 0.852 +socket: 0.851 +graphic: 0.843 +KVM: 0.838 +network: 0.797 + +Interacting with NetBSD serial console boot blocks no longer works + +The NetBSD boot blocks display a menu allowing the user to make a +selection using the keyboard. 
For example, when booting a NetBSD +installation CD-ROM, the menu looks like this: + + 1. Install NetBSD + 2. Install NetBSD (no ACPI) + 3. Install NetBSD (no ACPI, no SMP) + 4. Drop to boot prompt + + Choose an option; RETURN for default; SPACE to stop countdown. + Option 1 will be chosen in 30 seconds. + +When booting NetBSD in a recent qemu using an emulated serial console, +making this menu selection no longer works: when you type the selected +number, the keyboard input is ignored, and the 30-second countdown +continues. In older versions of qemu, it works. + +To reproduce the problem, run: + + wget http://ftp.netbsd.org/pub/NetBSD/NetBSD-7.1.1/amd64/installation/cdrom/boot-com.iso + qemu-system-x86_64 -nographic -cdrom boot-com.iso + +During the 30-second countdown, press 4 + +Expected behavior: The countdown stops and you get a ">" prompt + +Incorrect behavior: The countdown continues + +There may also be some corruption of the terminal output; for example, +"Option 1 will be chosen in 30 seconds" may be displayed as "Option 1 +will be chosen in p0 seconds". + +Using bisection, I have determined that the problem appeared with qemu +commit 083fab0290f2c40d3d04f7f22eed9c8f2d5b6787, in which seabios was +updated to 1.11 prerelease, and the problem is still there as of +commit 7398166ddf7c6dbbc9cae6ac69bb2feda14b40ac. The host operating +system used for the tests was Debian 9 x86_64. + +Credit for discovering this bug goes to Paul Goyette. + +Reverting to Seabios 1.10 (version rel-1.10.3.0-gb76661dd) fixes this problem. 
+ +Steps: + +$ cd && mkdir seabios-test && cd seabios-test +$ git clone -b 1.10-stable https://github.com/coreboot/seabios.git +$ cd seabios +$ make +$ qemu-system-x86_64 \ +-drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +-M q35,accel=kvm -m 350M -cpu host -smp $(nproc) \ +-nic user,model=virtio-net-pci,ipv6=off \ +-nographic -bios /home/oc/seabios-test/seabios/out/bios.bin + +Result: +I can interact with NetBSD boot menu and select one of the available options. + +Host: +Linux e130 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux + +QEMU emulator version 4.2.0 + + + +Possibly related thread: +"Do we need a cpu with TSC support to run SeaBIOS?" +https://<email address hidden>/msg11726.html + +Workaround: add "-vga none" to the qemu command line. + +@kraxel-redhat, + +I guess "-vga none" is implicit when using -nographic? + +However, for the sake of trying, I've added "-vga none" and it won't solve it for me (when using default bios). + +Gerd Hommann wrote: +> Workaround: add "-vga none" to the qemu command line. + +This supposed workaround does not work for me. + + +@kraxel-redhat: This issue bisects to commit d6728f301d7e6e31ba0ee2fa51ed4a24feab8860 ("add serial console support"). seabios.git/master + "[PATCH] sercon: vbe modeset is int 10h function 4f02 not 4f00" still has the issue. + +I'm using the following command-line: + + qemu-system-x86_64 -M accel=kvm -m 1G -cpu host -cdrom ~/Downloads/boot-com.iso -nographic + +Ah, it's a special serial console boot iso. I was trying the normal NetBSD-<version>-amd64.iso. + +So, it seems seabios sercon and bootloader are fighting over the serial line. + +seabios enables sercon for no-graphical guests ("-machine graphics=off", "-nographics" enables this too). + +So one option is to turn off seabios sercon: "qemu -nographic -machine graphics=on". 
+ +The other option is to turn on seabios sercon and use the normal boot.iso (this needs the "-vga none" workaround from comment 3, or the sercon patch). + +On Fri, 6 Mar 2020 at 13:24, Gerd Hoffmann <email address hidden> wrote: +> So one option is to turn off seabios sercon: "qemu -nographic -machine +> graphics=on". + +This works for me, but only if I turn off "q35", therefore changing +from a sata disk to a plain ide: + +qemu-system-x86_64 \ +-drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +-drive if=virtio,file=/home/oc/VM/img/newdisk2.img,index=1,media=disk \ +-m 300M -cpu host -smp $(nproc) \ +-nic user,hostfwd=tcp::6665-:22,model=virtio-net-pci,ipv6=off \ +-nographic -machine accel=kvm,graphics=on + + +Just to clarify my last comment, and in the absence of updates, if I launch the VM as: + +qemu-system-x86_64 \ +-drive if=virtio,file=/home/oc/VM/img/openbsd.image,index=0,media=disk \ +-drive if=virtio,file=/home/oc/VM/img/openbsd.image.old,index=1,media=disk \ +-M q35,accel=kvm,graphics=on -m 250M -cpu host -smp $(nproc) \ +-nic user,hostfwd=tcp::6666-:22,model=virtio-net-pci -nographic + +(note the -M q35,accel=kvm,graphics=on), the problem still persists. + +I'm still on version 4.2 and I haven't updated to 5.0 yet. + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + +This bug was fixed long ago, so long ago that I have no idea when! + +Please close with an appropriate status.
+ + +On Thu, 22 Apr 2021, Thomas Huth wrote: + +> The QEMU project is currently considering to move its bug tracking to +> another system. For this we need to know which bugs are still valid +> and which could be closed already. Thus we are setting older bugs to +> "Incomplete" now. +> +> If you still think this bug report here is valid, then please switch +> the state back to "New" within the next 60 days, otherwise this report +> will be marked as "Expired". Or please mark it as "Fix Released" if +> the problem has been solved with a newer version of QEMU already. +> +> Thank you and sorry for the inconvenience. +> +> ** Changed in: qemu +> Status: New => Incomplete +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1743191 +> +> Title: +> Interacting with NetBSD serial console boot blocks no longer works +> +> Status in QEMU: +> Incomplete +> +> Bug description: +> The NetBSD boot blocks display a menu allowing the user to make a +> selection using the keyboard. For example, when booting a NetBSD +> installation CD-ROM, the menu looks like this: +> +> 1. Install NetBSD +> 2. Install NetBSD (no ACPI) +> 3. Install NetBSD (no ACPI, no SMP) +> 4. Drop to boot prompt +> +> Choose an option; RETURN for default; SPACE to stop countdown. +> Option 1 will be chosen in 30 seconds. +> +> When booting NetBSD in a recent qemu using an emulated serial console, +> making this menu selection no longer works: when you type the selected +> number, the keyboard input is ignored, and the 30-second countdown +> continues. In older versions of qemu, it works. 
+> +> To reproduce the problem, run: +> +> wget http://ftp.netbsd.org/pub/NetBSD/NetBSD-7.1.1/amd64/installation/cdrom/boot-com.iso +> qemu-system-x86_64 -nographic -cdrom boot-com.iso +> +> During the 30-second countdown, press 4 +> +> Expected behavior: The countdown stops and you get a ">" prompt +> +> Incorrect behavior: The countdown continues +> +> There may also be some corruption of the terminal output; for example, +> "Option 1 will be chosen in 30 seconds" may be displayed as "Option 1 +> will be chosen in p0 seconds". +> +> Using bisection, I have determined that the problem appeared with qemu +> commit 083fab0290f2c40d3d04f7f22eed9c8f2d5b6787, in which seabios was +> updated to 1.11 prerelease, and the problem is still there as of +> commit 7398166ddf7c6dbbc9cae6ac69bb2feda14b40ac. The host operating +> system used for the tests was Debian 9 x86_64. +> +> Credit for discovering this bug goes to Paul Goyette. +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1743191/+subscriptions +> +> !DSPAM:60811a8265601949211437! +> +> + ++--------------------+--------------------------+-----------------------+ +| Paul Goyette | PGP Key fingerprint: | E-mail addresses: | +| (Retired) | FA29 0E3B 35AF E8AE 6651 | <email address hidden> | +| Software Developer | 0786 F758 55DE 53BA 7731 | <email address hidden> | ++--------------------+--------------------------+-----------------------+ + + +Paul Goyette wrote: +> This bug was fixed long ago, so long ago that I have no idea when! + +No, it is not fixed, and I did actually check before I switched the +bug state back to "new". + +Perhaps you are specifying "-machine graphics=on" as suggested in one +of the comments? If so, that's a work-around, and an ugly and +nonintuitive one at that, not a fix. 
+-- +Andreas Gustafsson, <email address hidden> + + +On Thu, 22 Apr 2021 at 13:46, Andreas Gustafsson +<email address hidden> wrote: +> +> Paul Goyette wrote: +> > This bug was fixed long ago, so long ago that I have no idea when! +> +> No, it is not fixed, and I did actually check before I switched the +> bug state back to "new". +> +> Perhaps you are specifying "-machine graphics=on" as suggested in one +> of the comments? If so, that's a work-around, and an ugly and +> nonintuitive one at that, not a fix. +> -- +> Andreas Gustafsson, <email address hidden> + +I am currently using: + +$ qemu-system-x86_64 --version +QEMU emulator version 5.2.0 + +And I have no problem selecting from menu in serial console, so I +assume this is fixed for me. This is my command line: + +$ cat opt/bin/boot-netbsd-virtio +#!/bin/sh +qemu-system-x86_64 \ +-drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +-drive if=virtio,file=/home/oc/VM/img/netbsd.image.old,index=1,media=disk \ +-M q35,accel=kvm -m 250M -cpu host -smp $(nproc) \ +-nic user,hostfwd=tcp:127.0.0.1:5555-:22,model=virtio-net-pci,ipv6=off \ +-daemonize -display none -vga none \ +-serial mon:telnet:127.0.0.1:6665,server,nowait \ +-pidfile /home/oc/VM/pid/netbsd-pid -nodefaults + +telnet 127.0.0.1 6665 + + + +-- +Ottavio Caruso + + +On Thu, 22 Apr 2021, Ottavio Caruso wrote: + +> On Thu, 22 Apr 2021 at 13:46, Andreas Gustafsson +> <email address hidden> wrote: +>> +>> Paul Goyette wrote: +>>> This bug was fixed long ago, so long ago that I have no idea when! +>> +>> No, it is not fixed, and I did actually check before I switched the +>> bug state back to "new". +>> +>> Perhaps you are specifying "-machine graphics=on" as suggested in one +>> of the comments? If so, that's a work-around, and an ugly and +>> nonintuitive one at that, not a fix. + +Andreas is correct - I am using the suggested work-around, and the +original bug is NOT fixed. 
+ +I believe Andreas has moved the bug back to New status to reflect +that it is not fixed. (Whether or not it is fixed, _I_ should not +have asked to have _his_ bug closed. It's been so long, I almost +believed it was my bug. :) My apologies to Andreas and everyone +else.) + + +>> -- +>> Andreas Gustafsson, <email address hidden> +> +> I am currently using: +> +> $ qemu-system-x86_64 --version +> QEMU emulator version 5.2.0 +> +> And I have no problem selecting from menu in serial console, so I +> assume this is fixed for me. This is my command line: +> +> $ cat opt/bin/boot-netbsd-virtio +> #!/bin/sh +> qemu-system-x86_64 \ +> -drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +> -drive if=virtio,file=/home/oc/VM/img/netbsd.image.old,index=1,media=disk \ +> -M q35,accel=kvm -m 250M -cpu host -smp $(nproc) \ +> -nic user,hostfwd=tcp:127.0.0.1:5555-:22,model=virtio-net-pci,ipv6=off \ +> -daemonize -display none -vga none \ +> -serial mon:telnet:127.0.0.1:6665,server,nowait \ +> -pidfile /home/oc/VM/pid/netbsd-pid -nodefaults +> +> telnet 127.0.0.1 6665 +> +> +> -- +> Ottavio Caruso +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1743191 +> +> Title: +> Interacting with NetBSD serial console boot blocks no longer works +> +> Status in QEMU: +> New +> +> Bug description: +> The NetBSD boot blocks display a menu allowing the user to make a +> selection using the keyboard. For example, when booting a NetBSD +> installation CD-ROM, the menu looks like this: +> +> 1. Install NetBSD +> 2. Install NetBSD (no ACPI) +> 3. Install NetBSD (no ACPI, no SMP) +> 4. Drop to boot prompt +> +> Choose an option; RETURN for default; SPACE to stop countdown. +> Option 1 will be chosen in 30 seconds. 
+> +> When booting NetBSD in a recent qemu using an emulated serial console, +> making this menu selection no longer works: when you type the selected +> number, the keyboard input is ignored, and the 30-second countdown +> continues. In older versions of qemu, it works. +> +> To reproduce the problem, run: +> +> wget http://ftp.netbsd.org/pub/NetBSD/NetBSD-7.1.1/amd64/installation/cdrom/boot-com.iso +> qemu-system-x86_64 -nographic -cdrom boot-com.iso +> +> During the 30-second countdown, press 4 +> +> Expected behavior: The countdown stops and you get a ">" prompt +> +> Incorrect behavior: The countdown continues +> +> There may also be some corruption of the terminal output; for example, +> "Option 1 will be chosen in 30 seconds" may be displayed as "Option 1 +> will be chosen in p0 seconds". +> +> Using bisection, I have determined that the problem appeared with qemu +> commit 083fab0290f2c40d3d04f7f22eed9c8f2d5b6787, in which seabios was +> updated to 1.11 prerelease, and the problem is still there as of +> commit 7398166ddf7c6dbbc9cae6ac69bb2feda14b40ac. The host operating +> system used for the tests was Debian 9 x86_64. +> +> Credit for discovering this bug goes to Paul Goyette. +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1743191/+subscriptions +> +> !DSPAM:608193ed146681924717040! +> +> + ++--------------------+--------------------------+-----------------------+ +| Paul Goyette | PGP Key fingerprint: | E-mail addresses: | +| (Retired) | FA29 0E3B 35AF E8AE 6651 | <email address hidden> | +| Software Developer | 0786 F758 55DE 53BA 7731 | <email address hidden> | ++--------------------+--------------------------+-----------------------+ + + +Ottavio Caruso wrote: +> I am currently using: +> +> $ qemu-system-x86_64 --version +> QEMU emulator version 5.2.0 +> +> And I have no problem selecting from menu in serial console, so I +> assume this is fixed for me. 
This is my command line: +> +> $ cat opt/bin/boot-netbsd-virtio +> #!/bin/sh +> qemu-system-x86_64 \ +> -drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +> -drive if=virtio,file=/home/oc/VM/img/netbsd.image.old,index=1,media=disk \ +> -M q35,accel=kvm -m 250M -cpu host -smp $(nproc) \ +> -nic user,hostfwd=tcp:127.0.0.1:5555-:22,model=virtio-net-pci,ipv6=off \ +> -daemonize -display none -vga none \ +> -serial mon:telnet:127.0.0.1:6665,server,nowait \ +> -pidfile /home/oc/VM/pid/netbsd-pid -nodefaults +> +> telnet 127.0.0.1 6665 + +Have you tried the test case in the original bug report? +-- +Andreas Gustafsson, <email address hidden> + + +On Thu, 22 Apr 2021 at 18:23, Andreas Gustafsson +<email address hidden> wrote: +> +> Ottavio Caruso wrote: +> > I am currently using: +> > +> > $ qemu-system-x86_64 --version +> > QEMU emulator version 5.2.0 +> > +> > And I have no problem selecting from menu in serial console, so I +> > assume this is fixed for me. This is my command line: +> > +> > $ cat opt/bin/boot-netbsd-virtio +> > #!/bin/sh +> > qemu-system-x86_64 \ +> > -drive if=virtio,file=/home/oc/VM/img/netbsd.image,index=0,media=disk \ +> > -drive if=virtio,file=/home/oc/VM/img/netbsd.image.old,index=1,media=disk \ +> > -M q35,accel=kvm -m 250M -cpu host -smp $(nproc) \ +> > -nic user,hostfwd=tcp:127.0.0.1:5555-:22,model=virtio-net-pci,ipv6=off \ +> > -daemonize -display none -vga none \ +> > -serial mon:telnet:127.0.0.1:6665,server,nowait \ +> > -pidfile /home/oc/VM/pid/netbsd-pid -nodefaults +> > +> > telnet 127.0.0.1 6665 +> +> Have you tried the test case in the original bug report? +> -- +> Andreas Gustafsson, <email address hidden> + +You're right. Using the boot-com install image, the problem persists. + + +-- +Ottavio Caruso + +A: Because it messes up the order in which people normally read text. +Q: Why is top-posting such a bad thing? +A: Top-posting. +Q: What is the most annoying thing in e-mail? + + + +This is an automated cleanup. 
This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/147 + + diff --git a/results/classifier/zero-shot/105/semantic/1745316 b/results/classifier/zero-shot/105/semantic/1745316 new file mode 100644 index 000000000..e325c4e01 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1745316 @@ -0,0 +1,208 @@ +semantic: 0.856 +instruction: 0.827 +assembly: 0.814 +graphic: 0.814 +device: 0.787 +other: 0.772 +vnc: 0.771 +boot: 0.764 +KVM: 0.762 +socket: 0.725 +mistranslation: 0.719 +network: 0.714 + +SDL1.x>SDL2 regressions: non-usbtablet mouse position reporting is broken, and VGA/compatmonitor/serial/etc view switching is unusable + +Hi, + +I almost exclusively use -sdl when I use QEMU. The GTK UI (I'm on Linux) distinctly takes a few extra seconds to start on every boot, and I don't really ever use the extra controls it provides. I hope the SDL-based UI never goes away :) + +The SDL 1.2 > SDL 2.0 update (committed between June 8-20 2017) introduced the following two regressions: + +- PS/2 and serial mouse position reporting became completely broken (only usbtablet works) + +- The compatmonitor/serial/parallel "views" try to open new windows, which does not play well on my system at all + +First of all, here are the bisection details: + +https://github.com/qemu/qemu/commit/269c20b2bbd2aa8531e0cdc741fb166f290d7a2b + (June 8 2017): the last version that works + +https://github.com/qemu/qemu/commit/7e56accdaf35234b69c33c85e4a44a5d56325e53 + (June 20 2017): the first version that fails + +Here are the changelists between these two revisions: + +https://github.com/qemu/qemu/compare/269c20b...7e56acc +(compare direction: OLD to NEW) (Commits: 60 Files changed: 85) + +https://github.com/qemu/qemu/compare/7e56acc...269c20b +(compare direction: NEW to OLD) (Commits: 41 Files changed: 108) + +(Someone else more familiar with 
Git might know why GitHub returns results for both compare directions. I'm including both links just in case.) + +I've found that configuring commit 7e56acc using --with-sdlapi=1.2 completely remedies all issues. Thus, I think the issue is with SDL 2. + +== #1: Broken mouse position reporting ===================== + +This surfaces immediately with older operating systems. I first experienced it when trying to install OS/2 for the first time, and thought there was something wrong with OS/2. Then I experienced the same issues in Windows 3.1 and MS-DOS applications and I knew something was up with QEMU. + +In a nice coincidence, I've recently been playing around with prehistorically ancient Linux systems, and while looking around a Linux 0.99-based SLS system from 1992 I discovered a low-level (console) mouse-testing utility buried in /usr/X386. This utility only works when I configure QEMU to use a serial mouse, but it reveals some very interesing data: the dx and dy values ("d" = "delta", right?) received by the kernel do not contain relative values such as -1, -10, 2, 5, etc, but instead absolute values such as 0, 12, 37, 112, 329, etc. + +Similarly, if I configure Windows 3.1 to use a serial mouse, the mouse position jumps exponentially around the screen relative to my mouse movements (it's very hard to control), consistent with delta values being reported as absolute instead of relative. + +I found that the DOS CuteMouse driver comes with a mouse-testing program. CuteMouse absolutely refuses to detect QEMU's serial mouse for some reason (?!), but when QEMU is running in PS/2 mode, the mouse tester that comes with CuteMouse reports that the mouse at 632,192 no matter how much I move the mouse around the window. If I look carefully I can see the DOS cursor flickering back and forth as I move the mouse and the tester rewrites the line showing the position info, so I don't think the test program is broken. 
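To illustrate why absolute values arriving in the `dx`/`dy` fields would make the pointer race away, here is a toy model. It assumes (this is not taken from any guest driver) that a relative-mode driver simply adds every reported value to the cursor position; `integrate_reports()` is a hypothetical name used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a relative-mode mouse driver on one axis: every reported
 * value is treated as a delta and added to the cursor position.
 */
long integrate_reports(const int *reports, size_t n)
{
    assert(reports != NULL || n == 0);

    long pos = 0;
    for (size_t i = 0; i < n; i++) {
        pos += reports[i];
    }
    return pos;
}
```

Feeding it genuine deltas such as `{1, 1, 1, 1}` leaves the cursor at 4. Feeding it the corresponding absolute positions `{1, 2, 3, 4}` leaves it at 10, and the error grows with every further event, which would match the hard-to-control jumping seen in Windows 3.1.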
+ +I got curious and wondered if this was actually an internal SDL bug. However, the following test program reports perfect values for me: + +--8<-------------------------------------------------------- + +#include <stdio.h> +#include "SDL2/SDL.h" +int main(void) { + SDL_Init(SDL_INIT_VIDEO); + SDL_Window *window = SDL_CreateWindow("Mouse test", + SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, + 640, 480, SDL_WINDOW_RESIZABLE + ); + if (window == NULL) { + perror(SDL_GetError()); + exit(1); + } + for (;;) { + SDL_Event event; + while (SDL_PollEvent(&event)) { + if (event.type == SDL_MOUSEMOTION) { + printf( + "x=%d y=%d xrel=%d yrel=%d\n", + event.motion.x, event.motion.y, + event.motion.xrel, event.motion.yrel + ); + } + if ( + event.type == SDL_KEYDOWN || + event.type == SDL_QUIT + ) { + SDL_DestroyWindow(window); + SDL_Quit(); + exit(0); + } + } + } +} + +(gcc ... -lSDL2) + +------------------------------------------------------------ + +Unfortunately it would seem the issue is QEMU-specific. + +--- + +Finding modern test targets to verify mouse functionality with was actually a small challenge. I tested Ubuntu, Lubuntu and even GParted, but the recent Linux kernels in all three systems automatically loaded the usbtablet module early in the boot process, completely hiding the bug. + +I've found two actively-maintained somewhat-mainstream platforms that make for good tests. These are both ISOs: + +- ReactOS: + - http://www.reactos.org/download + - pick "Download LiveCD" and then "proceed with the download" at + bottom-right of popup + +- Tiny Core Linux: + - http://distro.ibiblio.org/tinycorelinux/downloads.html + - pick TinyCore (16MB) + +ReactOS is a very good example, as it's actively maintained and is probably heavily tested in QEMU (albeit apparently without SDL(2)). + +Tiny Core is a bit of a mixed bag. It's actively maintained and uses a recent kernel (without usbdevice). 
It also uses a resurrected low-memory XFree86 target that was officially dropped decades ago for its graphics (and mouse input). It could be argued that Tiny Core's mild obscurity makes it an even better test target. + +--- + +I've attached to this report the mouse test utilities I've played with. Both are faster to iterate with than waiting for Tiny Core or ReactOS to boot. + +----- FreeDOS + CuteMouse + mousetst ----- + +This boots completely and is ready to look at the mouse position in around a second. It automatically starts the mouse tester on startup and APM-shuts-down QEMU when you exit the mouse tester with ESC. I can highly recommend this version for iteration/verification. + +$ qemu-system-i386 -fda freedos-mousetest.img +$ qemu-system-i386 -fda freedos-mousetest.img -sdl + + +----- Linux-0.99 (SLS) + 'mouse.c' ----- + +This is a heavily stripped-down SLS configuration containing just the mouse testing utility. + +$ qemu-system-i386 -boot a -fda sls-boot.img -hda sls-mousetest.img \ + -serial msmouse + +$ qemu-system-i386 -boot a -fda sls-boot.img -hda sls-mousetest.img \ + -serial msmouse -sdl + +Login as root (no password), and then + +# ./mouse Microsoft /dev/ttyS0 + +Also, the following produces junk results, but I'm including it because it may be interesting anyway: + +$ qemu-system-i386 -boot a -fda sls-boot.img -hda sls-mousetest.img -sdl + +and + +# ./mouse Microsoft /dev/ps2aux + +The reason I think this is noteworthy is that the button state affects the reported values (incorrectly, but they do still change), but moving the mouse does not. So while the interpretation is wrong, the behavior seems to be similar to that reported by CuteMouse's mouse tester. (Unless the position fields are being perfectly skipped over.) + +When you're done with this image - `halt` takes several seconds and there's really no point. Just closing QEMU manually is faster. (This is also why I used the shorter -[hf]da syntax - writes to the disk images are not relevant.) 
+ +(Also - since I can't possibly leave this info out :) - http://www.oldlinux.org/Linux.old/distributions/SLS/, `sls-1.0.tar.bz2`. This is a turnkey disk image that Just Works(TM). `boot.img` attached below is in fact the same as `a.img` in this archive.) + +--- + +In case it's useful, I included a Windows 3.1 test image when I reported https://bugs.launchpad.net/qemu/+bug/1745312. This image (which happens to be configured for PS/2) can be found in this folder: https://drive.google.com/drive/folders/1WdZVBh5Trs9HLC186-nHyKeqaSxfyM2c + +For reference: this image includes a lot of cruft as it's an active test image I've used for a lot of different things. It contains a full copy of the contents of the Win3.1 installation disks in W31INST. You can `deltree c:\windows` and reinstall by running `setup` in that directory (takes 3-5 minutes), or `subst a: c:\w31inst` before starting Windows and you'll be able to use Windows Setup to switch to a serial mouse. + +== #2: SDL view switching unusability ====================== + +This issue is (somewhat) more straightforward to demonstrate. Simply + +$ qemu-system-i386 -sdl + +and then hit CTRL+ALT+2 (or 3 or 4). + +On my system, when I do this, QEMU creates and destroys new windows in an infinite loop for as long as I have CTRL+ALT+n held down. I have to hit `2` really quickly! + +I've also observed that some internal contention can frequently cause the compatmonitor window to become blind to the Enter key. It seems to be that this occurs most often when the windowmanager I'm using (tested using i3 and openbox) has focus-follows-mouse enabled and the mouse is over the area of the screen where the compatmonitor window opens itself. Perhaps this is caused by the CTRL+ALT capture code interacting badly with window focus state? (I'm very very interested to hear if you cannot reproduce this.) 
+ +I initially thought all of these changes/glitches were due to some kind of messed-up "upgrade" / deliberate feature that happened to be broken on my system. But the only (obviously-)UI-related tasks I found in the changelist above just mention upgrading SDL, with no particular UI work (that I can see). It looks like this is actually an SDL thing - unless some (preparatory?) changes occurred in QEMU before the commit window I discovered. I have no idea. + +A couple things about fixing this that I want to mention: + +The way the GTK UI does things, views can be switched inside the same window, or they can be detached into new windows. This provides the best of both worlds - and there are situations where I do want both behaviors. + +If QEMU is not averse to burying an obscure option somewhere that lets me pick whether SDL will open views in new windows or the same window, that could be very nice - and it would bring the SDL UI to perfect feature parity with the GTK UI, too. But I'm not sure what QEMU's stance on obscure options is. + +That said, my preference for the SDL 1.x way of doing things is admittedly very probably biased by the usability issues created by this bug. Incidentally, I've taken to using `-serial null -monitor stdio`. But for that to work I have to remember to add it ahead of time, and I do often forget, heh. + +I'll be very interested to play with QEMU view switching once this is less glitchy. If the fixed implementation still opens new windows, I'll see what I think of that once it works stably. :) + + + + + + + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". 
Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1750899 b/results/classifier/zero-shot/105/semantic/1750899 new file mode 100644 index 000000000..cc013dbc3 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1750899 @@ -0,0 +1,51 @@ +semantic: 0.891 +mistranslation: 0.868 +other: 0.859 +assembly: 0.847 +graphic: 0.846 +vnc: 0.836 +instruction: 0.829 +device: 0.797 +network: 0.762 +socket: 0.738 +KVM: 0.697 +boot: 0.630 + +Mouse cursor sometimes can't pass the invisible border on the right side of the screen + +I'm using qemu 2.11 on Gentoo Linux, with configured GPU passthrough (Radeon RX580) to the guest Windows 10. +This configuration is alive for last 4 years, this time I changed a lot qemu, linux kernel and windows versions, changed GPU and always all was working as expected. I always used standard PS/2 mouse emulation and that was enough for me. + +Now, I bought two new monitors, instead of old one, and setup them as one logical monitor, using technology called Eyefinity - it's a part of standard Radeon software. Now Windows thinks, that I have one monitor with resolution 2160x1920 (I bought Dell monitors with a thin borders and use them in portrait mode). + +Windows uses it without any problems, but mouse become crazy - sometimes (~3 times from each 5) I can't move cursor to the right border of the screen, it looks like the invisible vertical border. I spent really huge amount of time to understand, which component is the root of problem and found, that it's really a mouse. I tried all possible variants (standard, tablet, virtio-mouse-pci, virtio-tablet-pci), and found, that in both mouse variants bug is reproducing, and in both tablet variants - cursor stuck near all real borders and corners, so it's not a variant too. 
+The only working variant becomes passing real USB port to my VM and insert second mouse to this port. So, now it's working, but I have two mice on my working place, which doesn't seems very useful. + +Here is my command line: + +QEMU_AUDIO_DRV=pa QEMU_PA_SAMPLES=4096 qemu-system-x86_64 -enable-kvm -M q35 -m 12168 -cpu host,kvm=off -smp 4,sockets=1,cores=4 \ +-bios /usr/share/qemu/bios.bin -rtc base=localtime -vga none -device secondary-vga \ +-drive id=virtiocd,if=none,format=raw,file=/home/akushsky/virtio-win-0.1.141.iso \ +-device driver=ide-cd,bus=ide.1,drive=virtiocd \ +-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \ +-device vfio-pci,host=05:00.0,bus=root.1,addr=00.0,multifunction=on,romfile=/opt/kvm/images/Sapphire.RX580.8192.170320_1.bin,x-vga=on \ +-device virtio-scsi-pci,id=scsi \ +-drive file=/dev/sdb,id=disk,format=raw,if=none,discard=on,cache=none,aio=native,detect-zeroes=unmap -device scsi-hd,drive=disk,id=scsi0 \ +-device ich9-intel-hda,bus=pcie.0,addr=1b.0,id=sound0 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \ +-usb -usbdevice host:046d:c52b + +All in all, I checked on Windows 7 and Windows 10, and on qemu 2.10 and 2.11 - bug is always reproducible. + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/76 + + diff --git a/results/classifier/zero-shot/105/semantic/1751264 b/results/classifier/zero-shot/105/semantic/1751264 new file mode 100644 index 000000000..9f1db05d2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1751264 @@ -0,0 +1,83 @@ +semantic: 0.825 +other: 0.767 +instruction: 0.760 +graphic: 0.758 +device: 0.708 +assembly: 0.707 +network: 0.482 +boot: 0.478 +mistranslation: 0.468 +vnc: 0.433 +socket: 0.407 +KVM: 0.375 + +qemu-img convert issue in a tmpfs partition + +qemu-img convert command is slow when the file to convert is located in a tmpfs formatted partition. + +v2.1.0 on debian/jessie x64, ext4: 10m14s +v2.1.0 on debian/jessie x64, tmpfs: 10m15s + +v2.1.0 on debian/stretch x64, ext4: 11m9s +v2.1.0 on debian/stretch x64, tmpfs: 10m21.362s + +v2.8.0 on debian/jessie x64, ext4: 10m21s +v2.8.0 on debian/jessie x64, tmpfs: Too long + +v2.8.0 on debian/stretch x64, ext4: 10m42s +v2.8.0 on debian/stretch x64, tmpfs: Too long + +It seems that the issue is caused by this commit : https://github.com/qemu/qemu/commit/690c7301600162421b928c7f26fd488fd8fa464e + +In order to reproduce this bug : + +1/ mount a tmpfs partition : mount -t tmpfs tmpfs /tmp +2/ get a vmdk file (we used a 15GB image) and put it on /tmp +3/ run the 'qemu-img convert -O qcow2 /tmp/file.vmdk /path/to/destination' command + +When we trace the process, we can see that there's a lseek loop which is very slow (compare to outside a tmpfs partition). + +Hi, + +This is a combination of (in our opinion) a bug in tmpfs (...and I think maybe btrfs as well?), the fact that the vmdk block driver is not very well optimized, and qemu-img convert assuming that the filesystem works as it thinks it does or that at least the block driver can work around this. + +So what happens is that qemu-img convert tries to find out which data it needs to copy. 
For this, it queries which parts of the image are allocated. This involves querying both the format level (vmdk in this case) and the protocol level (tmpfs in this case). + +Now the vmdk block driver is not very well optimized, so it only allows querying on cluster boundaries (64 kB by default, as far as I can tell). qcow2 OTOH allows greater areas (I just created a 512 MB image and it can query the whole image at once). + +So the requests go down to the protocol level. We expect that to respond very quickly to an allocation request (the lseek() you are seeing) -- but tmpfs (and I think btrfs, too) don't do that. They take a rather long time. + +For an example, the attached program seeks through a file (in 64 kB steps) with SEEK_DATA/SEEK_HOLE. This is what happens: +$ cd /tmp +$ gcc test.c -std=c11 -Wall -Wextra -pedantic -O3 +$ qemu-img create -f raw -o preallocation=falloc empty 512M +$ qemu-img create -f raw -o preallocation=falloc ~/empty 512M +$ time ./a.out empty +./a.out empty 0,01s user 23,10s system 99% cpu 23,166 total +$ time ./a.out ~/empty +./a.out ~/empty 0,01s user 0,03s system 96% cpu 0,041 total + +So there's a huge difference and that is (in my opinion) a bug in tmpfs. + +(When converting from qcow2 you don't notice this, because qcow2 allows performing a single allocation request for the whole image, so it doesn't matter much whether that's slow.) + + +There are three ways around this: +(1) tmpfs (and probably btrfs? -- although I can't reproduce it myself right now) should be fixed. If they can't tell allocated areas quickly, they should just report the whole file as allocated. + +(2) Our vmdk driver could be optimized. Sure, but that wouldn't solve the real issue and someone would have to do it first (and we don't have a strong interest in this, because all format drivers but qcow2 and raw are there mainly just for reading other formats and converting them to qcow2). 
+ +(3a) qemu-img convert could poll for allocation information less insistently. One way would be to add a switch to disable this behavior completely and force it to just read everything. We already have -S 0 which could do this; but just reading all data and then doing zero detection over it kind of defeats the purpose. If read() + memcmp() is faster than lseek(SEEK_DATA), then the FS is just doing something wrong. + +(3b) Eric Blake has recently added support for a less insisting way to query allocation status that should only go to the format layer (e.g. vmdk) and ignore the protocol layer (e.g. tmpfs). Maybe qemu-img convert should use that. + + +But in any case, I claim the main issue is in tmpfs. + +Max + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] 
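For readers without the attachment, the kind of probe described in the comment above (stepping through a file in 64 kB increments with SEEK_DATA/SEEK_HOLE) can be sketched roughly as follows. This is not the original test.c; the function name `probe_allocation` and the fallback constants are assumptions made for illustration, and the sketch only counts the lseek() calls rather than timing them.

```c
#define _GNU_SOURCE            /* asks glibc to expose SEEK_DATA / SEEK_HOLE */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA              /* Linux values, used only if the headers hide them */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

#define STEP (64 * 1024)       /* 64 kB, the vmdk cluster size mentioned above */

/*
 * Hypothetical helper (not the attached test.c): query the allocation status
 * at every 64 kB boundary of `path`, similar to a cluster-by-cluster query.
 * Returns the number of lseek() probes issued, or -1 on error.
 */
long probe_allocation(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0) {
        close(fd);
        return -1;
    }

    long calls = 0;
    for (off_t pos = 0; pos < size; pos += STEP) {
        /* Where does the next data region start at or after pos? */
        off_t data = lseek(fd, pos, SEEK_DATA);
        calls++;
        if (data < 0 && errno != ENXIO) {   /* ENXIO only means "no more data" */
            close(fd);
            return -1;
        }

        /* Where does the next hole start at or after pos? */
        if (lseek(fd, pos, SEEK_HOLE) < 0) {
            close(fd);
            return -1;
        }
        calls++;
    }

    close(fd);
    return calls;
}
```

Calling `probe_allocation()` on the same file placed once on tmpfs and once on ext4, under `time`, should show whether the per-call cost differs in the way the comment describes.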
+ diff --git a/results/classifier/zero-shot/105/semantic/1751674 b/results/classifier/zero-shot/105/semantic/1751674 new file mode 100644 index 000000000..0abf71aef --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1751674 @@ -0,0 +1,81 @@ +semantic: 0.773 +graphic: 0.770 +other: 0.737 +mistranslation: 0.642 +instruction: 0.605 +device: 0.590 +assembly: 0.554 +network: 0.550 +socket: 0.521 +boot: 0.469 +KVM: 0.317 +vnc: 0.297 + +qemu-system-arm segmentation fault using pmemsave on the interrupt controller registers + +Qemu segfaults trying to generate a VM memory dump: + +$ QEMU_AUDIO_DRV=none qemu-git-src/arm-softmmu/qemu-system-arm -M vexpress-a9 -smp 4 -m 1024 -machine secure=off,dump-guest-core=on -kernel linux-4.9.75/arch/arm/boot/zImage -append "root=/dev/mmcblk0 rw rootfstype=ext4 mem=1024M net.ifnames=0 console=ttyAMA0" -dtb vexpress-v2p-ca9.dtb -sd armv7-hd.qcow2 -netdev tap,ifname=tap_armv7,script=no,downscript=no,id=net0 -device virtio-net-device,mac=00:DE:AD:BE:FF:02,netdev=net0 -monitor stdio -serial vc -loadvm SS0 +QEMU 2.11.50 monitor - type 'help' for more information +(qemu) pmemsave 0 0x3FFFFFFF memory.dmp +Segmentation fault (core dumped) + +$ git rev-parse HEAD +b384cd95eb9c6f73ad84ed1bb0717a26e29cc78f + +It's the second time I try to submit this bug, I think last time it failed because the attached core dump size (400M compressed). Have a look if you can get that file, otherwise I will try to update this ticket once it's created: + +(Error ID: OOPS-65553b72bc14be693eb1e37814ff9267) + +Yeah, the page fails uploading the code dump file. Actually it seems to upload the whole file but then it shows a "Timeout error" error. Anyway, let me know if you need that file and if so how can I send it to you. + +What's happening here is that the memory range you're asking to dump (physaddrs 0 to 0x3fffffff) includes a lot of devices, including the interrupt controller, which is at 0x1e000000. 
There's a longstanding bug in the GIC code where it will crash if you try to access its per-CPU register bank from some context that isn't a guest CPU (including the monitor or the QEMU gdb stub), because it doesn't know which CPU's version of the registers you wanted. That's what you've run into here. + +However, I suspect you didn't really want to try to take a memory dump of a pile of devices. The RAM in the vexpress-a9 board starts at 0x60000000, so if you wanted the RAM then try + pmemsave 0x60000000 0x9fffffff memory.dmp + + +LP:1602247 is the bug for the similar issue when using the gdb stub. + + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +Still valid. + + + +This is an automated cleanup. 
This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/247 + + diff --git a/results/classifier/zero-shot/105/semantic/1753309 b/results/classifier/zero-shot/105/semantic/1753309 new file mode 100644 index 000000000..0ddfb6e9f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1753309 @@ -0,0 +1,118 @@ +semantic: 0.903 +other: 0.885 +assembly: 0.882 +network: 0.863 +graphic: 0.860 +device: 0.843 +instruction: 0.835 +vnc: 0.738 +KVM: 0.702 +mistranslation: 0.680 +boot: 0.644 +socket: 0.627 + +Ethernet interrupt vectors for sabrelite machine are defined backwards + +The sabrelite machine model used by qemu-system-arm is based on the Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet controller which is supported in QEMU using the imx_fec.c module (actually called imx.enet for this model.) + +The include/hw/arm/fsl-imx6.h file defines the interrupt vectors for the imx.enet device like this: + +#define FSL_IMX6_ENET_MAC_1588_IRQ 118 +#define FSL_IMX6_ENET_MAC_IRQ 119 + +However, this is backwards. The reference manual for the i.MX6D/Q devices can be found here: + +https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf + +On page 225, in Table 3-1. 
ARM Cortex A9 domain interrupt summary, it shows the following: + +150 ENET +MAC 0 IRQ, Logical OR of: +MAC 0 Periodic Timer Overflow +MAC 0 Time Stamp Available +MAC 0 Payload Receive Error +MAC 0 Transmit FIFO Underrun +MAC 0 Collision Retry Limit +MAC 0 Late Collision +MAC 0 Ethernet Bus Error +MAC 0 MII Data Transfer Done +MAC 0 Receive Buffer Done +MAC 0 Receive Frame Done +MAC 0 Transmit Buffer Done +MAC 0 Transmit Frame Done +MAC 0 Graceful Stop +MAC 0 Babbling Transmit Error +MAC 0 Babbling Receive Error +MAC 0 Wakeup Request [synchronous] + +151 ENET +MAC 0 1588 Timer interrupt [synchronous] request + +Note: +150 - 32 == 118 +151 - 32 == 119 + +In other words, the vector definitions in the fsl-imx6.h file are reversed. The correct definition is: + +#define FSL_IMX6_ENET_MAC_IRQ 118 +#define FSL_IMX6_ENET_MAC_1588_IRQ 119 + +I tested the sabrelite simulation using VxWorks 7 (which supports the SabreLite board) and found that while I was able to send and receive packet data via the simulated ethernet interface, the VxWorks i.MX6 ethernet driver failed to receive any interrupts. When I corrected the interrupt vector definitions as shown above and recompiled QEMU, everything worked as expected. I was able to exchange ICMP packets with the simulated target and telnet to/from the VxWorks instance running in the virtual machine. I used the tap interface for this. + +As a workaround I was also able to make the ethernet work by modifying the VxWorks imx6q-sabrelite.dts file to change the ethernet interrupt property from 150 to 151. + +This problem was observed with the following environment: + +Host: FreeBSD/amd64 11.1-RELEASE +QEMU version: 2.11.0 and 2.11.1 built from source code + +Swapping the interrupt pins fixes the problem on Linux v4.13 and later. Older kernels start failing as follows. 
+ + On v4.12 and earlier, the Ethernet interface fails to instantiate with + fec 2188000.ethernet (unnamed net_device) (uninitialized): MDIO read timeout + fec: probe of 2188000.ethernet failed with error -5 + I have not found the reason yet. Unmodified qemu works fine. +- v4.1 and earlier crash. The crash is due to a bad error path and fixed by commit + 32cba57ba74be ("net: fec: introduce fec_ptp_stop and use in probe fail path"). + + +Followup on #1: The relevant upstream commit is 4c8777892e80b ("ARM: dts: imx6qdl-sabrelite: remove erratum ERR006687 workaround"). + +Test results with various kernel versions: +4.14+: Both versions of qemu (as-is and interrupts reverted) work fine +4.9.y: Requires cherry-pick of 4c8777892e80b for both versions of qemu to work +4.4.y: Requires backport of 4c8777892e80b for both versions of qemu to work +4.1.y: Requires backport of 4c8777892e80b for both versions of qemu to work + +I didn't test older kernels. + +Now the big question is if this matches the experience with real hardware. + + +"4.14+: Both versions of qemu (as-is and interrupts reverted) work fine" + +Hm. I really wonder how it can be possible that Linux works with the interrupt vectors reversed, though to be fair I have not looked at the Linux i.MX6 ENET driver code. I suppose it's possible that the driver is binding the same interrupt service routine to both interrupt vectors. If so, then it works by accident. :) + +I think U-Boot uses polling so it wouldn't care if the interrupt vectors are wrong. + +We have several SabreLite boards in house. We also have NXP Sabre SD reference boards which use the same i.MX6Q SoC and the exact same ethernet driver with the same interrupt configuration. 
I have always used VxWorks with them rather than Linux, and I can say for a fact that the VxWorks ENET driver only binds an ISR to vector 150 (118) (VxWorks doesn't currently support the IEEE 1588 feature with this interface so it never uses vector 151) and it works as expected -- network interrupt events are indeed received via vector 150. + +The same VxWorks image that works with real hardware does not work with QEMU unless I fix the vectors in fsl-imx6.h. + +In short, both the hardware and the manual seem to agree. QEMU is doing it wrong. :) + +Also, the errata sheet for the i.MX6 is here: + +https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf + +Apparently erratum 6687 is related to power management and wakeup events. I'm not sure how that factors in to how Linux behaves. + +#3: Correct, Linux version 4.14 and older registers two interrupt lines, both the correct and the wrong one. With qemu version, the kernel receives interrupts on irq 151, with the other on 150. So, yes, I guess it works by accident. My question is what to do with older (pre-4.14) kernels. Presumably those worked (?) with real hardware, so I am a bit concerned about the impact of applying 4c8777892e80b to those kernels. + + +Submitted https://patchwork.kernel.org/patch/10264615/ + +This is now fixed in git master by commit 6461d7e2678fe4, which updates the defines and also has a workaround for older guest kernels (which we can remove if/when we model the IOMUX). 
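The vector arithmetic from this report (150 - 32 == 118, 151 - 32 == 119) can be written out as a small sketch. The offset of 32 reflects the ARM GIC convention that interrupt IDs 0-31 are per-CPU (SGIs and PPIs), so shared peripheral interrupts start at ID 32. The helper name `manual_id_to_spi` and the `GIC_SPI_BASE` macro are made up for illustration and are not taken from QEMU; the two defines use the corrected values proposed in the report.

```c
/* Interrupt IDs 0-31 on the GIC are per-CPU (SGIs and PPIs), so shared
 * peripheral interrupts (SPIs) start at ID 32. Illustrative name only. */
#define GIC_SPI_BASE 32

/* Corrected values proposed in the report (150 - 32 and 151 - 32). */
#define FSL_IMX6_ENET_MAC_IRQ      118
#define FSL_IMX6_ENET_MAC_1588_IRQ 119

/* Hypothetical helper: convert an interrupt ID as listed in Table 3-1 of the
 * reference manual into the zero-based SPI number used by the defines. */
static inline int manual_id_to_spi(int manual_id)
{
    return manual_id - GIC_SPI_BASE;
}
```

With this mapping, ID 150 (the general ENET MAC interrupt) corresponds to FSL_IMX6_ENET_MAC_IRQ and ID 151 (the 1588 timer) to FSL_IMX6_ENET_MAC_1588_IRQ, which is the ordering the manual and the real hardware agree on.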
+ diff --git a/results/classifier/zero-shot/105/semantic/1760262 b/results/classifier/zero-shot/105/semantic/1760262 new file mode 100644 index 000000000..88bde41c6 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1760262 @@ -0,0 +1,72 @@ +semantic: 0.928 +instruction: 0.913 +graphic: 0.905 +other: 0.903 +mistranslation: 0.876 +network: 0.810 +KVM: 0.805 +device: 0.799 +socket: 0.793 +assembly: 0.790 +boot: 0.724 +vnc: 0.694 + +cmsdk-apb-uart doesn't appear to clear interrupt flags + +I have been writing a small operating system and using QEMU emulating the mps2-an385 board for some of my testing. + +During development of the uart driver I observed some odd behaviour with the TX interrupt -- writing a '1' to bit 0 of the INTCLEAR register doesn't clear the TX interrupt flag, and the interrupt fires continuously. + +It's possible that I have an error somewhere in my code, but after inspecting the QEMU source it does appear to be a QEMU bug. I applied the following patch and it solved my issue: + +From 9875839c144fa60a3772f16ae44d32685f9328aa Mon Sep 17 00:00:00 2001 +From: Patrick Oppenlander <email address hidden> +Date: Sat, 31 Mar 2018 15:10:28 +1100 +Subject: [PATCH] hw/char/cmsdk-apb-uart: fix clearing of interrupt flags + +--- + hw/char/cmsdk-apb-uart.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/hw/char/cmsdk-apb-uart.c b/hw/char/cmsdk-apb-uart.c +index 1ad1e14295..64991bd9d7 100644 +--- a/hw/char/cmsdk-apb-uart.c ++++ b/hw/char/cmsdk-apb-uart.c +@@ -274,6 +274,7 @@ static void uart_write(void *opaque, hwaddr offset, uint64_t value, + * is then reflected into the intstatus value by the update function). + */ + s->state &= ~(value & (R_INTSTATUS_TXO_MASK | R_INTSTATUS_RXO_MASK)); ++ s->intstatus &= ~(value & ~(R_INTSTATUS_TXO_MASK | R_INTSTATUS_RXO_MASK)); + cmsdk_apb_uart_update(s); + break; + case A_BAUDDIV: +-- +2.16.2 + + + +Found in v2.12.0-rc1. 
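The behaviour the reporter expected from INTCLEAR is write-1-to-clear: each bit written as 1 clears the matching pending flag, and bits written as 0 leave their flags alone. A minimal stand-alone model of that semantic is sketched below. It is not QEMU's device code (which, per the comment in the patch context, keeps the overrun bits in a separate state field and reflects them into intstatus), the struct and function names are hypothetical, and only the position of the TX bit (bit 0) is confirmed by the report; the other bit positions are an assumption.

```c
#include <stdint.h>

/* Only bit 0 (TX) is confirmed by the report; the remaining positions are
 * assumed for illustration. */
#define INT_TX   (1u << 0)
#define INT_RX   (1u << 1)
#define INT_TXO  (1u << 2)   /* TX overrun */
#define INT_RXO  (1u << 3)   /* RX overrun */
#define INT_ALL  (INT_TX | INT_RX | INT_TXO | INT_RXO)

/* Hypothetical stand-alone model, not QEMU's device state struct. */
struct uart_model {
    uint32_t intstatus;      /* pending interrupt flags */
};

/* Write-1-to-clear: set bits in `value` clear the matching pending flags,
 * zero bits leave their flags untouched. */
static void uart_write_intclear(struct uart_model *u, uint32_t value)
{
    u->intstatus &= ~(value & INT_ALL);
}
```

In this model, writing INT_TX while both TX and RX are pending leaves only RX pending, which is what a guest driver relies on to stop the TX interrupt from firing continuously.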
+ +Thanks for the bug report; I've submitted this patch (which is similar to but not quite the same as your fix): +https://patchwork.ozlabs.org/patch/896715/ + +Hopefully this will get into 2.12, but we're quite close to release now so it will depend on whether we need to spin an extra release candidate for some other reason. + + +On Tue, Apr 10, 2018 at 11:45 PM, Peter Maydell +<email address hidden> wrote: +> +> Thanks for the bug report; I've submitted this patch (which is similar to but not quite the same as your fix): +> https://patchwork.ozlabs.org/patch/896715/ +> +> Hopefully this will get into 2.12, but we're quite close to release now +> so it will depend on whether we need to spin an extra release candidate +> for some other reason. + +Thanks for looking into it. + +Patrick + + +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=6670b494fdb23f74ecd9b + diff --git a/results/classifier/zero-shot/105/semantic/1761027 b/results/classifier/zero-shot/105/semantic/1761027 new file mode 100644 index 000000000..b8cfd34f0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1761027 @@ -0,0 +1,49 @@ +semantic: 0.835 +device: 0.738 +instruction: 0.675 +socket: 0.617 +graphic: 0.610 +mistranslation: 0.591 +vnc: 0.572 +boot: 0.533 +network: 0.529 +other: 0.433 +KVM: 0.283 +assembly: 0.254 + +Unexpected error: "AioContext polling is not implemented on Windows" + +When run it this error happens: +Unexpected error in aio_context_set_poll_params() at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/util/aio-win32.c:413: +C:\Program Files\qemu\qemu-system-x86_64.exe: AioContext polling is not implemented on Windows + +This application has requested the Runtime to terminate it in an unusual way. +Please contact the application's support team for more information. + + + +System: +Windows 10 x64 + +Which version of QEMU are you using? And which parameters are you using when you start it? 
+ +I have that message too with this version: + +c:\Tools\QEMU>qemu-system-aarch64.exe -version +QEMU emulator version 2.11.90 (v2.12.0-rc0-11704-g30195e9d53-dirty) + +My launch params are: +C:\TOOLS\QEMU\qemu-system-aarch64.exe -M raspi3 -kernel D:\QEMU-img\2017-12-04-pcudev01l.img + +My system is Windows 7 64bit +The qemu package downloaded is the 64bit version. + + +Fixed in qemu.git/master and due to be released in QEMU 2.12: + +commit 90c558beca0c0ef26db1ed77d1eb8f24a5ea02a1 +Author: Peter Xu <email address hidden> +Date: Thu Mar 22 16:56:30 2018 +0800 + + iothread: fix breakage on windows + diff --git a/results/classifier/zero-shot/105/semantic/1777672 b/results/classifier/zero-shot/105/semantic/1777672 new file mode 100644 index 000000000..b14063a73 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1777672 @@ -0,0 +1,123 @@ +semantic: 0.748 +other: 0.736 +graphic: 0.734 +assembly: 0.712 +device: 0.708 +mistranslation: 0.707 +instruction: 0.678 +vnc: 0.645 +boot: 0.643 +network: 0.620 +KVM: 0.613 +socket: 0.500 + +QEMU raspi virtual/physical frame buffer not implemented + +I fully recognize that the error here could be mine, but the code is pretty simple and straightforward; When emulating a Raspberry PI 3 using aarch64 and allocating a virtual framebuffer larger than the physical frambuffer (for double-buffering purposes), the QEMU window shows the full size of the *virtual* framebuffer rather than the size of the *physical* framebuffer. 
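To make the distinction concrete, here is a rough sketch of how a display could derive its visible region from such a setup, using the terms as the reporter uses them (physical = what is shown, virtual = what is allocated). The struct and function names are hypothetical and do not come from QEMU or the Raspberry Pi firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framebuffer description; field names are illustrative. */
struct fb_geometry {
    uint32_t phys_width, phys_height;  /* size of the region that is displayed */
    uint32_t virt_width, virt_height;  /* size of the buffer allocated in memory */
    uint32_t x_offset, y_offset;       /* top-left corner of the visible viewport */
    uint32_t pitch;                    /* bytes per row of the virtual buffer */
    uint32_t bytes_per_pixel;
};

/* Byte offset of the first visible pixel inside the virtual buffer. */
static size_t fb_visible_start(const struct fb_geometry *fb)
{
    return (size_t)fb->y_offset * fb->pitch
         + (size_t)fb->x_offset * fb->bytes_per_pixel;
}

/* Number of rows a display window should show: the physical height,
 * regardless of how tall the virtual buffer is. */
static uint32_t fb_visible_rows(const struct fb_geometry *fb)
{
    return fb->phys_height;
}
```

For a 1024x768 display backed by a 1024x1536 buffer at 32 bits per pixel, setting y_offset to 768 would select the second half of the buffer while the window stays 768 rows tall.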
+ +You can replicate this with code such as: + + +#define FBWIDTH 1024 +#define FBHEIGHT 768 + +void lfb_init() +{ + uart_puts("Initializing Framebuffer\n"); + mbox[0] = 35*4; + mbox[1] = MBOX_REQUEST; + + mbox[2] = 0x48003; //set phy wh + mbox[3] = 8; + mbox[4] = 8; + mbox[5] = FBWIDTH; //FrameBufferInfo.width + mbox[6] = FBHEIGHT; //FrameBufferInfo.height + + mbox[7] = 0x48004; //set virt wh + mbox[8] = 8; + mbox[9] = 8; + mbox[10] = FBWIDTH; //FrameBufferInfo.virtual_width + mbox[11] = FBHEIGHT * 2; //FrameBufferInfo.virtual_height + + mbox[12] = 0x48009; //set virt offset + mbox[13] = 8; + mbox[14] = 8; + mbox[15] = 0; //FrameBufferInfo.x_offset + mbox[16] = 0; //FrameBufferInfo.y.offset + + mbox[17] = 0x48005; //set depth + mbox[18] = 4; + mbox[19] = 4; + mbox[20] = 32; //FrameBufferInfo.depth + + mbox[21] = 0x48006; //set pixel order + mbox[22] = 4; + mbox[23] = 4; + mbox[24] = 1; //RGB, not BGR preferably + + mbox[25] = 0x40001; //get framebuffer, gets alignment on request + mbox[26] = 8; + mbox[27] = 8; + mbox[28] = 4096; //FrameBufferInfo.pointer + mbox[29] = 0; //FrameBufferInfo.size + + mbox[30] = 0x40008; //get pitch + mbox[31] = 4; + mbox[32] = 4; + mbox[33] = 0; //FrameBufferInfo.pitch + + mbox[34] = MBOX_TAG_LAST; + + if(mbox_call(MBOX_CH_PROP) && mbox[20]==32 && mbox[28]!=0) { + mbox[28]&=0x3FFFFFFF; + fbwidth=mbox[5]; + fbheight=mbox[6]; + pitch=mbox[33]; + lfb=(void*)((unsigned long)mbox[28]); + } +} + +I will assume, for the sake of this posting, that the reader understands the mailbox architecture and the appropriate address definitions for them. The key point is that allocating a virtual buffer twice the height of the physical buffer results in QEMU improperly displaying a double-height window. + +Can you provide a test binary and QEMU command line that reproduce this, please ? + + +Certainly! Attached. + + +If you start the attached on a piece of hardware, it will start and display fine.. 
If you start it in QEMU, it will start but display a double-height screen rather than limiting the physical screen to the specified dimensions. + +(The virtual display is double-height in preparation for double buffering) + +Whoops.. Forgot to include the QEMU command line: + +qemu-system-aarch64 -M raspi3 --kernel kernel8.img -serial stdio + + +Thanks for the test case. I'm having difficulty matching up your guest code with the documentation of the fb mbox tags in https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface ... + +Your code sets the physical height to FBHEIGHT via tag 0x48003, and the virtual height to FBHEIGHT * 2 via tag 0x48004. The documentation in the wiki link agrees that 48003 is phys w/h and 48004 is virt w/h, but it says that the physical size is the size of the buffer in memory, and the virtual size is the size of the viewport sent to the display device, ie the virtual size should be smaller than the physical, not vice-versa. Which is correct ? + + +The virtual size must be at least the size of the physical display. One approach toward double-buffering is to make the virtual height twice the physical height. To "flip" the displays you simply change the start of the visible view port. + +See these: + +https://lb.raspberrypi.org/forums/viewtopic.php?t=47329 +https://www.raspberrypi.org/forums/viewtopic.php?f=67&t=19073&p=324866#p324866 + + +Mmm. I guess the wiki page is just wrong, then. 
I have some prototype patches that work, but I need to check somehow what the real hardware's response to various edge cases is: + * trying to set a virtual screen size that's smaller than the physical screen size + * trying to set the virtual x/y offsets to values that put the physical viewport partially outside the virtual screen (eg setting height = vheight = 480, width = vwidth = 640, xoffset = yoffset = 50) + +There's a mechanism in the mbox API for saying "can't do that" but I don't know whether that sort of thing actually does result in failure to set the height or the offset or whatever... + + +Just submitted this patchset to the list which should fix this bug: +https://patchwork.ozlabs.org/project/qemu-devel/list/?series=60775 + + +This should now be fixed in git master as of commit f4e8428b9a6ea440bb. + + diff --git a/results/classifier/zero-shot/105/semantic/1779634 b/results/classifier/zero-shot/105/semantic/1779634 new file mode 100644 index 000000000..d75c1a7b0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1779634 @@ -0,0 +1,91 @@ +semantic: 0.901 +other: 0.893 +network: 0.887 +graphic: 0.885 +instruction: 0.876 +assembly: 0.866 +device: 0.863 +mistranslation: 0.835 +socket: 0.817 +vnc: 0.704 +KVM: 0.698 +boot: 0.612 + +qemu-x86_64 on aarch64 reports "Synchronous External Abort" + +Purpose: to run x86_64 utilities on aarch64 platform (Intel/Dell network adapters' firmware upgrade tools) +System: aarch64 server platform, with ubuntu 16.04 (xenial) Linux 4.13.0-45-generic #50~16.04.1-Ubuntu SMP Wed May 30 11:14:25 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux + +Reproduce: +1) build linux-user qemu-x86_64 static from source (tried both version 1.12.0 & 1.11.02) + ./configure --target-list=x86_64-linux-user --disable-system --static --enable-linux-user + +2) install the interpreter into binfmt_misc filesystem + $ cat /proc/sys/fs/binfmt_misc/qemu-x86_64 + enabled + interpreter /usr/local/bin/qemu-x86_64 + flags: + offset 0 + magic 
7f454c4602010100000000000000000002003e00 + mask fffffffffffefefcfffffffffffffffffeffffff + +3) packaging Intel/Dell upgrade utilities into docker images, I've published two on docker hub: + REPOSITORY TAG IMAGE ID CREATED SIZE + heyi/dellupdate latest 8e013f5511cd 6 hours ago 210MB + heyi/nvmupdate64e latest 9d2de9d0edaa 3 days ago 451MB + +4) run the docker container on aarch64 server platform: + docker run -it --privileged --network host --volume /usr/local/bin/qemu-x86_64:/usr/local/bin/qemu-x86_64 heyi/dellupdate:latest + +5) finally, within docker container run the upgrade tool: + # ./Network_Firmware_T6VN9_LN_18.5.17_A00.BIN + +Errors: in dmesg it reports excessive 'Synchronous External Abort': + +kernel: [242850.159893] Synchronous External Abort: synchronous external abort (0x92000610) at 0x0000000000429958 +kernel: [242850.169199] Unhandled fault: synchronous external abort (0x92000610) at 0x0000000000429958 + +thanks and best regards, Yi + +qemu-x86_64 is just a userspace program. If the kernel is getting Synchronous External Aborts then this is not a QEMU problem. Either there's a bug in the host kernel, or the guest binary is attempting to mmap /dev/mem and do wrong things to it because it's expecting it to be an x86 system. I suspect the latter (it's probably trying to do userspace writes directly to the network controller hardware). This sort of binary is never going to be runnable via QEMU. + + +You could confirm this hypothesis by using strace and looking for whether it's doing mmap() of /dev/mem or /dev/kmem. If it's true, then the program would not work even if you had the source and recompiled it for aarch64 -- it would require bugfixes (code changes) to achieve whatever it's trying to do. + + +Thanks very much @Peter Maydell, when invoking these tools through docker/qemu-user I really saw syscall disorders, even strace fails. 
You are right these tools have x86_64 syscall numbers & perhaps mmaps of /dev/mem to allocate contiguous memory region for DMA transactions. + +Then the goal cannot be achieved by docker/qemu-user method (although it is quite convenient :) + +strace ./Network_Firmware_T6VN9_LN_18.5.17_A00.BIN +qemu: Unsupported syscall: 101 +qemu: Unsupported syscall: 101 +/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented ++++ exited with 1 +++ + +I'll downgrade this bug to a question, and try full qemu-system emulation with PCI device passthrough assignment to achieve the goal of running Intel/Dell firmware upgrading tools on aarch64 servers. + + +> /usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented + +This indicates that you're trying to run an x86 strace under QEMU. That won't work. You want to either (a) run QEMU + guest binary under the host strace or (b) run QEMU + guest binary with the QEMU -strace option. + + +thanks Peter, yes I tried to run an x86 strace under QEMU. + +I'll stop this experiment since you are right this won't work for utilities with device-level I/O and memory operations, I will raise this requirement to Intel support website firstly. + +Best Regards, Yi + +Hi, + +if of interest to anyone... +we were able to successfully upgrade firmware of Intel XL710 on aarch64 platform. + +Two major items were required: +- small qemu change: https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00553.html +- hack in Linux kernel /dev/mem driver in mmap function to catch wrong addresses nvmupdate64e asked for thru qemu. For some reason only lower 32-bit portion of actual physical address came to linux kernel. 
/dev/mem driver in kernel was changed to add missing upper 32 bits of address + +best regards, +Matevz + diff --git a/results/classifier/zero-shot/105/semantic/1790617 b/results/classifier/zero-shot/105/semantic/1790617 new file mode 100644 index 000000000..2440c8411 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1790617 @@ -0,0 +1,76 @@ +semantic: 0.604 +other: 0.553 +instruction: 0.550 +mistranslation: 0.533 +assembly: 0.497 +graphic: 0.481 +device: 0.475 +KVM: 0.471 +socket: 0.309 +boot: 0.277 +network: 0.250 +vnc: 0.216 + +Version of disk image format not exhibited using the 'qemu-img info' command + +OS: Fedora (64 bits) – Linux –. Last available component: qemu-img.x86_64 2:2.11.2-2.fc28 + +Description: version of disk image format not exhibited using the 'qemu-img info' command. + +Command 'qemu-img info qcow2 [image-file-name.img]' produces this stderr message: +qemu-img: Expecting one image file name +Try 'qemu-img --help' for more information + +How to reproduce in terminal: +1. Create a VM using Raw disk image format +2. Run either of the commands 'qemu-img info -f raw [image-file-name.img]', 'qemu-img info [image-file-name.img]'. +3. Run either of the commands 'qemu-img info -f qcow2 [image-file-name.qcow2]', 'qemu-img info [image-file-name.qcow2]'. + +Actual result: the output from step 2 shows the following information: +image: image-file-name.img +file format: raw +virtual size: _G (_ bytes) +disk size: _G + +The output from step 3 shows the following information: +image: image-file-name.qcow2 +file format: qcow2 +virtual size: _G (_ bytes) +disk size: _G +cluster_size: _ +Snapshot list: +ID TAG VM SIZE DATE VM CLOCK +1 snapshot1 _G 2018-07-31 18:27:49 00:03:45.890 + +Format specific information: + compat: 1.1 + lazy refcounts: true + refcount bits: 16 + corrupt: false + +Expected result: the respective versions of the raw and qcow2 formats (likely labelled "version") are shown.
+ +Additional information: in documentation last updated 2018-08-23 at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/index it is stated in chapters: + +14.10 – 'images in raw format can be resized in both directions, whereas qcow2 version 2 or qcow2 version 3 images can be grown'; +14.12 – 'the qcow2 version supplied with Red Hat Enterprise Linux 7 is 1.1', 'To know which version you are using, run qemu-img info qcow2 [imagefilename.img] command.'. + +There's no concept of versions for "raw" images as there's no metadata at all - it's just raw disk content. + +For the qcow2 format, the version 2 vs version 3 distinction is not something that is intended to be exposed externally. A version 3 format file is still handled by the qcow2 format driver, and there is no qcow3 driver. + +At an end user level you can specify a compatibility level "0.10" vs "1.1" which indicates what QEMU version your image should be compatible with - from 'man qemu-img': + + "compat" + Determines the qcow2 version to use. "compat=0.10" uses the traditional image format + that can be read by any QEMU since 0.10. "compat=1.1" enables image format extensions + that only QEMU 1.1 and newer understand (this is the default). Amongst others, this + includes zero clusters, which allow efficient copy-on-read for sparse images. + +Internally, compat=1.1 will cause qemu to use qcow2 version 3, but that's not something users should be concerned with. + +Docs that talk about version 2 vs version 3 should be fixed as that's not something users should be exposed to. They should just talk about the compat level. + + +This is a lot of useful information that would be worth including in the documentation at the above-mentioned link.
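To illustrate the compat level vs. on-disk version mapping described above, here is a minimal C sketch that reads the version field from the start of a qcow2 header. It assumes the qcow2 header layout (a 4-byte magic "QFI\xfb" followed by a big-endian 32-bit version field); the helper name qcow2_header_version is hypothetical and is not part of QEMU or qemu-img.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a QEMU API): return the qcow2 on-disk version
 * stored in the first 8 header bytes, or -1 if the buffer is too short
 * or does not start with the qcow2 magic "QFI\xfb". */
static int qcow2_header_version(const unsigned char *hdr, size_t len)
{
    static const unsigned char magic[4] = { 'Q', 'F', 'I', 0xfb };

    if (len < 8 || memcmp(hdr, magic, sizeof(magic)) != 0) {
        return -1;
    }
    /* Assumed layout: big-endian 32-bit version at byte offset 4. */
    uint32_t version = ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
                       ((uint32_t)hdr[6] << 8) | (uint32_t)hdr[7];
    return (int)version;
}
```

Going by the explanation in the thread, an image reported as "compat: 1.1" would be expected to yield 3 here, and a "compat=0.10" image 2; a raw image has no header at all, so the helper would return -1.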
+ diff --git a/results/classifier/zero-shot/105/semantic/1791680 b/results/classifier/zero-shot/105/semantic/1791680 new file mode 100644 index 000000000..1aa139753 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1791680 @@ -0,0 +1,88 @@ +semantic: 0.878 +mistranslation: 0.848 +network: 0.829 +instruction: 0.800 +graphic: 0.785 +assembly: 0.772 +other: 0.763 +device: 0.751 +vnc: 0.661 +boot: 0.645 +socket: 0.619 +KVM: 0.600 + +network bridge does not work + +hi there + +the network bridge does not seem to work described as here: https://en.wikibooks.org/wiki/QEMU/Networking + +When i add that parameters in a 192.168.80.x subnet, my emulated raspbian ARM gets the IP 10.0.2.15.... While all other computers get 192.168.80.x + +The command i use is: + + +qemu-system-arm.exe -M versatilepb -cpu arm1176 -hda 2018-09-03_stretch_inkl_phalcon.img -kernel kernel-qemu-4.4.34-jessie -m 192 -append "root=/dev/sda2 panic=1 rootfstype=ext4 rw" -no-reboot -net nic -net user -device e1000,mac=52:54:00:12:34:56 & + + +Does not build up a network bridge to 192.168.80.x... + +The host system i use is win10 x64 v1803 + +Best regards, +Jan + +J:\Tools\qemu>qemu-system-arm.exe -M versatilepb -cpu arm1176 -hda 2018-09-03_stretch_inkl_phalcon.img -kernel kernel-qe +mu-4.4.34-jessie -m 192 -append "root=/dev/sda2 panic=1 rootfstype=ext4 rw" -no-reboot -net nic -net user -device e1000, +mac=52:54:00:12:34:56 +WARNING: Image format was not specified for '2018-09-03_stretch_inkl_phalcon.img' and probing guessed raw. + Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted. + + Specify the 'raw' format explicitly to remove the restrictions. 
+dsound: Could not initialize DirectSoundCapture +dsound: Reason: No sound driver is available for use, or the given GUID is not a valid DirectSound device ID +qemu-system-arm.exe: warning: nic e1000.0 has no peer + + +10.0.2.15 is neither a ip in our dhcp range nor an apipa address - strange + +but google is pingable, so i have internet. + +must be nat, right?? + +Yes, looks like nat - 10.10.2.15 is not pingable from 192.168.80.x but vice versa... + +but wqhat they write here is not nat: "If no network options are specified, QEMU will default to emulating a single Intel e1000 PCI card with a user-mode network stack that bridges to the host's network. The following three command lines are equivalent:" + +And i think my params are right? + +... -net nic -net user -device e1000,mac=52:54:00:12:34:56 & + +That comment about e1000 is only true for qemu-system-i386. For ARM machines, there are other default NICs. You should also not mix "-net" and "-device", see https://www.qemu.org/2018/05/31/nic-parameter/ for some details. And concerning NAT, yes the "user" backend is using NAT, see https://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29 for details about that. + +OK thx. + +"The -device option can only be used for pluggable NICs. Boards (e.g. embedded boards) which feature an on-board NIC cannot be configured with -device yet, so -net nic,netdev=<id> must be used here instead." + +when i only use "-net nic", i get an apipa address + +what do i need for netdev id? n1 as described in your links does not work. messsage: "netdev 'n1' not found" + +currently, only one nic adapter is enabled on my win10 host: the ethernet controller. + +the other 2, 1x internal wlan and 1x usb wlan is disabled.. + +"That comment about e1000 is only true for qemu-system-i386. For ARM machines, there are other default NICs." + +but why im able to ping google with that config?? 
+ +"-nic tap,model=e1000" + +-> "Device 'tap' colud not be found + +incompatible with windows, right? so i need a linux machine with ethx?? + +https://bugs.launchpad.net/qemu/+bug/1404278 + +problem solved! :-) + diff --git a/results/classifier/zero-shot/105/semantic/1798659 b/results/classifier/zero-shot/105/semantic/1798659 new file mode 100644 index 000000000..5b9edc49a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1798659 @@ -0,0 +1,53 @@ +semantic: 0.775 +mistranslation: 0.640 +graphic: 0.498 +assembly: 0.449 +other: 0.431 +instruction: 0.393 +device: 0.352 +vnc: 0.337 +socket: 0.304 +network: 0.201 +boot: 0.153 +KVM: 0.050 + +Replace comma with semicolon in trace/simple.c + +In the master branch in trace/simple.c in writeout_thread (https://github.com/qemu/qemu/blob/master/trace/simple.c#L174) we currently have: + dropped.rec.length = sizeof(TraceRecord) + sizeof(uint64_t), + dropped.rec.pid = trace_pid; + +It seems to me like a typo that the first line ends with a comma. +Currently this causes no harm, but I think this should be fixed. + +It's perfect valid C to terminate a statement with "," instead of ";" - it just has a different meaning. Consider this: + +#include <stdio.h> + +int main() +{ + if (0) + printf("Hello!\n"), + + printf("Good bye!\n"); + + return 0; +} + +At a first glance, you'd expect this program to print "Good bye!" - but it does not. Actually, the "," is used here to put the two printf statements into the same block, so this program is the same as: + + if (0) { + printf("Hello!\n"); + printf("Good bye!\n"); + } + +Thus, there is no real bug in simple.c here, but of course it would be better style to clean this up and use ";" instead. + +By the way, two lines earlier there is another line ending in ",": + + dropped.rec.event = DROPPED_EVENT_ID, + +Fixed in commit 7ff5920717d413d8b7c3ba13d9, which will be in the upcoming 4.0 release. 
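As a complement to the if (0) example above, the following sketch shows why the comma-terminated assignments in the report are harmless: outside of an unbraced if or loop body, assignments chained with the comma operator run in the same order as separate statements. The struct and function names are illustrative stand-ins, not the actual definitions in trace/simple.c.

```c
#include <stdint.h>

/* Stand-in for the record being filled in; field names are illustrative. */
struct rec {
    uint64_t event;
    uint32_t length;
    uint32_t pid;
};

/* Assignments chained with the comma operator, in the style of the report. */
static struct rec fill_with_commas(uint32_t pid)
{
    struct rec r;
    r.event = 1,
    r.length = sizeof(struct rec) + sizeof(uint64_t),
    r.pid = pid;
    return r;
}

/* The same assignments written as three separate statements. */
static struct rec fill_with_semicolons(uint32_t pid)
{
    struct rec r;
    r.event = 1;
    r.length = sizeof(struct rec) + sizeof(uint64_t);
    r.pid = pid;
    return r;
}
```

Both functions return identical records, so replacing "," with ";" in such a sequence is purely a style cleanup.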
+ + + diff --git a/results/classifier/zero-shot/105/semantic/1798780 b/results/classifier/zero-shot/105/semantic/1798780 new file mode 100644 index 000000000..cb4fb496e --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1798780 @@ -0,0 +1,104 @@ +semantic: 0.468 +graphic: 0.416 +vnc: 0.381 +other: 0.366 +instruction: 0.348 +KVM: 0.329 +device: 0.322 +assembly: 0.309 +boot: 0.301 +network: 0.278 +socket: 0.268 +mistranslation: 0.227 + +hw/usb/dev-mtp.c:1616: bad test ? + +hw/usb/dev-mtp.c:1616:52: warning: logical ‘or’ of collectively exhaustive tests is always true [-Wlogical-op] + +Source code is + + if ((ret == -1) && (errno != EINTR || errno != EAGAIN || + errno != EWOULDBLOCK)) { + +Maybe better code + + if ((ret == -1) && (errno != EINTR && errno != EAGAIN && + errno != EWOULDBLOCK)) { + +On Fri, 19 Oct 2018 at 10:22, dcb <email address hidden> wrote: +> hw/usb/dev-mtp.c:1616:52: warning: logical ‘or’ of collectively +> exhaustive tests is always true [-Wlogical-op] +> +> Source code is +> +> if ((ret == -1) && (errno != EINTR || errno != EAGAIN || +> errno != EWOULDBLOCK)) { +> +> Maybe better code +> +> if ((ret == -1) && (errno != EINTR && errno != EAGAIN && +> errno != EWOULDBLOCK)) { + +Hi Gerd, Bandan -- I was going through older launchpad bugs and +noticed that this one about a dubious conditional in dev-mtp.c is +still unfixed. + +Is the file descriptor being used here one that's in non-blocking +mode? If so, then busy-waiting in a loop while the write() returns +EWOULDBLOCK is probably not what you wanted. If it's not then +there's no need to check for EAGAIN or EWOULDBLOCK, I think. +Consider using qemu_write_full() instead of open-coding +the retry loop ? 
+ +thanks +-- PMM + + +On Tue, Jan 22, 2019 at 07:41:16AM -0500, Bandan Das wrote: +> +> qemu_write_full takes care of partial blocking writes, +> as in cases of larger file sizes +> +> Suggested-by: Peter Maydell <email address hidden> +> Signed-off-by: Bandan <email address hidden> + +Hmm, doesn't apply, and git fails to do a 3way merge too due to unknown +sha1. + +cheers, + Gerd + + + +On Thu, Jan 24, 2019 at 03:19:03AM -0500, Bandan Das wrote: +> Gerd Hoffmann <email address hidden> writes: +> +> > On Tue, Jan 22, 2019 at 07:41:16AM -0500, Bandan Das wrote: +> >> +> >> qemu_write_full takes care of partial blocking writes, +> >> as in cases of larger file sizes +> >> +> >> Suggested-by: Peter Maydell <email address hidden> +> >> Signed-off-by: Bandan <email address hidden> +> > +> > Hmm, doesn't apply, and git fails to do a 3way merge too due to unknown +> > sha1. +> +> Oops, sorry, I realize now this is on top of the write buffer breakup patches. + +Hmm, they are queued up already, so that should have worked. + +> Should I resend a v2 on top of master and send a v3 for the write buffer breakup +> patches ? + +Can you just send a single series with both this fix and the breakup +patches, against latest master? + +thanks, + Gerd + + + +Fixed by commit 49f9e8d660d4 which will be in QEMU 4.0. 
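For reference, the -Wlogical-op warning from the original report can be checked with a small standalone sketch. The two helpers below are hypothetical and only mirror the two conditions quoted in the report; they are not the code in hw/usb/dev-mtp.c, and the thread itself suggests qemu_write_full() rather than either condition.

```c
#include <errno.h>
#include <stdbool.h>

/* Condition as quoted in the report: the "!=" tests are joined with "||",
 * so at least one of them holds for every possible errno value and the
 * result depends only on ret. */
static bool is_fatal_original(int ret, int err)
{
    return (ret == -1) &&
           (err != EINTR || err != EAGAIN || err != EWOULDBLOCK);
}

/* Condition as proposed in the report: fatal only if errno is none of the
 * retryable values. */
static bool is_fatal_proposed(int ret, int err)
{
    return (ret == -1) &&
           (err != EINTR && err != EAGAIN && err != EWOULDBLOCK);
}
```

With the original condition an interrupted write (EINTR) is still treated as fatal, which is what the compiler warning is pointing at.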
+ + diff --git a/results/classifier/zero-shot/105/semantic/1805913 b/results/classifier/zero-shot/105/semantic/1805913 new file mode 100644 index 000000000..819f6a0a0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1805913 @@ -0,0 +1,154 @@ +semantic: 0.829 +assembly: 0.794 +network: 0.791 +other: 0.779 +instruction: 0.737 +graphic: 0.729 +device: 0.725 +socket: 0.709 +boot: 0.680 +mistranslation: 0.664 +vnc: 0.650 +KVM: 0.589 + +readdir() returns NULL (errno=EOVERFLOW) for 32-bit user-static qemu on 64-bit host + +This can be simply reproduced by compiling and running the attached C code (readdir-bug.c) under 32-bit user-static qemu, such as qemu-arm-static: + +# Setup docker for user-static binfmt +docker run --rm --privileged multiarch/qemu-user-static:register --reset +# Compile the code and run (readdir for / is fine, so create a new directory /test). +docker run -v /path/to/qemu-arm-static:/usr/bin/qemu-arm-static -v /path/to/readdir-bug.c:/tmp/readdir-bug.c -it --rm arm32v7/ubuntu:18.10 bash -c '{ apt update && apt install -y gcc; } >&/dev/null && mkdir -p /test && cd /test && gcc /tmp/readdir-bug.c && ./a.out' +dir=0xff5b4150 +readdir(dir)=(nil) +errno=75: Value too large for defined data type + +Remember to replace /path/to/qemu-arm-static and /path/to/readdir-bug.c with the actual paths of the files. + +The root cause is in glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdents.c;h=6d09a5be7057e2792be9150d3a2c7b293cf6fc34;hb=a5275ba5378c9256d18e582572b4315e8edfcbfb#l87 + +Per POSIX, readdir() returns a struct dirent *, in which the inode number and offset are 32-bit integers on a 32-bit target. glibc therefore calls getdents64(), checks whether the inode number and offset fit the 32-bit range, and reports EOVERFLOW if not.
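The attached readdir-bug.c is not reproduced in this thread. As a rough sketch of an equivalent check (the helper name count_dir_entries is hypothetical), the key point is that readdir() returns NULL both at end of directory and on error, so errno has to be cleared before each call to tell the two apart:

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: count the entries readdir() yields for `path`.
 * Returns the count, or -1 on failure with *err_out set to the errno
 * value seen. */
static int count_dir_entries(const char *path, int *err_out)
{
    DIR *dir = opendir(path);
    int count = 0;

    *err_out = 0;
    if (dir == NULL) {
        *err_out = errno;
        return -1;
    }
    for (;;) {
        errno = 0;  /* distinguish "end of directory" from "error" */
        struct dirent *ent = readdir(dir);
        if (ent == NULL) {
            *err_out = errno;  /* EOVERFLOW here would match the report */
            break;
        }
        count++;
    }
    closedir(dir);
    return *err_out ? -1 : count;
}
```

On a native host this should simply return the number of entries; in the setup described above, the expectation is that it would fail with *err_out == EOVERFLOW.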
+ +The problem here is for 32-bit user-static qemu running on 64-bit host, getdents64 simply passing through the inode number and offset from underlying getdents64 syscall (from 64-bit kernel), which is very likely to not fit into 32-bit range. On real hardware, the 32-bit kernel creates 32-bit inode numbers, therefore works properly. + +The glibc code makes sense to do the check to be conformant with C standard, therefore ideally it should be a fix on qemu side. I admit this is difficult because qemu has to maintain a mapping between underlying 64-bit inode numbers and 32-bit inode numbers, which would severely hurt the performance. I don't expect this could be fix anytime soon (or even there would be a fix), but it would be worthwhile to surface this issue. + + + +More notes: this bug hits glibc-2.28 and later. It works on glibc-2.27. Therefore to reproduce it it needs ubuntu 18.10 or later. Seems like it works for 18.04. + +This bug affects all Java programs that (implicitly) uses File.list() or File.listFiles(). Also it makes dash not expanding wildcard /some/directory/* . However, bash works because it uses glob() instead of readdir(). + + +The bug also affects shared-mime-info. update-mime-database uses readdir and ends up generating an empty database without reporting any errors, causing pixbuf and anything else that relies on the mime database not to work properly. + +Same things happens with update-ca-certificates. It calls c_rehash through openssl, which ends up doing nothing. As a result, curl with https and probably anything else that uses SSL fails to work. + +This probably makes the issue fairly critical for tools that create 32bit environments through qemu-debootstrap or build packages in said environment. + +I was also hit by this on Gentoo with a 64bit host running 32bit static chroot (arm). If it matters at all, I saw it after upgrading the 32bit arm chroot to glibc-2.28, while the host was still on 2.27. + +Downgrading again hides the issue. 
Upgrading the host to glibc 2.28, but keeping the chroot at 2.27 seems to not hit it either. + +https://lkml.org/lkml/2018/12/27/155 + +After studying linux-user/syscall.c a bit, would it be possible to work around this issue by doing something like the following: + +Add a new #define EMULATE_GETDENTS64_WITH_GETDENTS, and enable this iff we have getdents, and the target is 32, while the host is 64 bits. Something similar, but complementary is done with EMULATE_GETDENTS_WITH_GETDENTS64. + +In that case, when userspace calls getdents64, we implement a "conversion" (similar to getdents #if logic), which calls the host's getdents and converts the data structures back to their 64-bit variants before handing back to user-space. + +I'm likely over-simplifying a problem that I don't fully understand, but would happily work on a patch if someone higher up the food chain could fill in the gaps. + +Unfortunately there is no kernel API which we can use on the host to say "give me inodes and offsets which will fit into a 32 bit field". The 'getdents' syscall uses the "unsigned long" type for the d_ino and d_off fields, so on a 64-bit host these will be the same size as the ino64_t and off64_t used by 'getdents64', and you will still have the "trying to fit a quart into a pint pot" problem. + +The only way to fix this is to fix the host kernel to provide the API QEMU needs for this (see discussion in the kernel thread linked to in comment #5). + + +Is there a workaround for this? I tried: + +- Building on an XFS partition. +- Building from ubuntu:16.04 so the host has glib <2.27. + +It looks like the only way is to have the chroot with glib <2.27, and in alpine images glib is at minimum 2.56. + +If the bug is fixed in glib maybe I can install glib from master? I'm trying to build multi-arch docker images and this bug is what prevents me from providing arm/v7 images for the raspberry pi. + +Sorry, meant `< 2.28` above. + +There has been some motion on this by Aladjev Andrew. 
I will butcher the explanation of his approach if I try, but it is described in the following bugs. I have no idea of the schedule, or even possibility of adoption; it seems to still be in proof-of-concept phase. + +GLIBC bug (see last several posts) +https://sourceware.org/bugzilla/show_bug.cgi?id=23960 + +Kernel bugzilla (last two posts) +https://bugzilla.kernel.org/show_bug.cgi?id=205957 + +Ah, great thanks. It looks like there are patches that fix qemu, although the setup looks a bit complex. I'll report if I get something going. + +This problem affected my virtual environment which I used (via qemu-static) to build my project for RaspberryPI platform. After I upgraded my virtual Raspbian to buster release `readdir` stopped working (as described in this thread) due to mapping of 64 inode numbers to qemu 32bit ARM land. I needed this builder working and I found a workaround in some obscure (2nd page of google result) blog. + +Before the work around my virtual Raspbian was just a directory on one of my ext4 partitions. To fix the issue I created image file with dd, formatted with mkfs.ext4 it with `dir_index` option disabled and moved my virual Raspbian onto that newly created filesystem. This fixed the issue for me and my builder started again. + +I am posting it here so `dir_index` trick can be easier to found for others in this situation. + + +Thanks Marcin. I tested your solution but by me it still gets stuck at the same point. 
Here's what I did: + +$ tune2fs -O ^dir_index /dev/sda1 +$ tune2fs -l /dev/sda1 +tune2fs 1.44.2 (14-May-2018) +Filesystem volume name: <none> +Last mounted on: / +Filesystem UUID: c8fee0cb-a610-4fa5-aab8-c5c765678133 +Filesystem magic number: 0xEF53 +Filesystem revision #: 1 (dynamic) +Filesystem features: has_journal ext_attr resize_inode filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum +Filesystem flags: signed_directory_hash +Default mount options: user_xattr acl +Filesystem state: clean +(snip) + +But then my build still get stuck on: + +clock_gettime(CLOCK_REALTIME, {tv_sec=1580996038, tv_nsec=781126598}) = 0 +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 144 +tgkill(29974, 29977, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) +clock_gettime(CLOCK_REALTIME, {tv_sec=1580996038, tv_nsec=781461434}) = 0 +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 144 +tgkill(29974, 29977, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable + + + +I seem to have found another workaround. Knowing now what causes this my guess was: If I make the qemu-arm-static on the host a 32 bit binary and get "multilib" running to make my 64 bit Linux installation run this, then in theory this incompatibility should not happen. If it would, then 32 bit x86 applications whould run into the same problem. + +And at least according to my tries, I did so far, this seems to be the case. I was able to reproduce this with svn (no checkout possible from 32 bit armv7h). If the qemu-arm-static binary is a 32 bit x86 application, then SVN checkouts work well now. + +So until there is a better solution it seems to be a good idea to make the emulation layer run through multilib for 32 bit target architectures, so the host kernel can switch to its 32 bit backwards compatibility mode. 
+ +Yes, using a 32-bit host QEMU process will also work. You might run into a few guest programs that don't work with that -- a 64-bit QEMU process allows us to give the guest the full address space it might need, while a 32-bit QEMU process means that QEMU itself must share with the guest, so if the guest uses a lot of virtual memory or is picky about where it maps things then it might fail to mmap() things where it wants them. But it's probably overall the least-bad workaround at the current time. + + +After reading through the discussion on the mailing list, as it's all about ext4, I got curious... +I'm testing with qemu-user-static and regulary build arm images in a tmpfs. This show similar behaviour and readdir() fails. However, running in the same root copied onto a btrfs, it seems fine. +Maybe this is an even less bad workaround for some folks? + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +This is still a bug, and still blocked on the kernel providing APIs to QEMU to request 32-bit directory entries. Linus Walleij proposed a kernel patch to add a suitable fcntl flag but as far as I'm aware it didn't get in so far: + https://<email address hidden>/ + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/263 + + diff --git a/results/classifier/zero-shot/105/semantic/1809252 b/results/classifier/zero-shot/105/semantic/1809252 new file mode 100644 index 000000000..8e40776b4 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1809252 @@ -0,0 +1,60 @@ +semantic: 0.906 +other: 0.894 +device: 0.862 +mistranslation: 0.842 +instruction: 0.785 +network: 0.783 +vnc: 0.776 +socket: 0.701 +graphic: 0.653 +assembly: 0.541 +KVM: 0.539 +boot: 0.515 + +Password authentication in FIPS-compliant mode + +The documentation states, that: + +"The VNC protocol has limited support for password based authentication. (...) Password authentication is not supported when operating in FIPS 140-2 compliance mode as it requires the use of the DES cipher." + +Would it be possible for qemu to use a different cipher and re-enable password as an option in VNC console? Is there a technical reason for not using a stronger cipher? + +On 12/20/18 6:59 AM, Tomasz Barański wrote: +> Public bug reported: +> +> The documentation states, that: +> +> "The VNC protocol has limited support for password based authentication. +> (...) Password authentication is not supported when operating in FIPS +> 140-2 compliance mode as it requires the use of the DES cipher." +> +> Would it be possible for qemu to use a different cipher and re-enable +> password as an option in VNC console? Is there a technical reason for +> not using a stronger cipher? + +The technical reason is that there are no other VNC endpoints out there +that support a different cipher. The VNC protocol itself declares what +all compliant servers/clients must use - and that spec is what makes the +non-FIPS-compliant requirement. You wouldn't have to patch just qemu, +but every other VNC endpoint out there that you want to interoperate +with a patched qemu. But it's really not worth doing that when there +are already better solutions available. 
That is, rather than trying to +fix VNC, just use an alternative protocol that doesn't have a baked-in +authentication limitation in the first place - namely, Spice. + +-- +Eric Blake, Principal Software Engineer +Red Hat, Inc. +1-919-301-3266 +Virtualization: qemu.org | libvirt.org + + +The VNC password authentication scheme is not extensible. It is unfixably broken by design. + +QEMU provides the SASL authentication scheme for VNC which allows for strong authentication, when combined with the VeNCrypt authentication scheme that uses TLS. + +These extensions are supported by the gtk-vnc client used by remote-viewer, virt-viewer, virt-manager, GNOME Boxes and more. Other VNC clients are also known to implement VeNCrypt, though SASL support is less widespread. + +From a QEMU POV, there's nothing more we need to do really - any remaining gaps are client side. + +I understand. Thank you, guys! + diff --git a/results/classifier/zero-shot/105/semantic/1809546 b/results/classifier/zero-shot/105/semantic/1809546 new file mode 100644 index 000000000..dfeb2f575 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1809546 @@ -0,0 +1,90 @@ +semantic: 0.913 +instruction: 0.876 +device: 0.860 +other: 0.855 +boot: 0.852 +assembly: 0.850 +graphic: 0.848 +network: 0.811 +mistranslation: 0.754 +vnc: 0.735 +socket: 0.612 +KVM: 0.608 + +Writing a byte to a pl011 SFR overwrites the whole SFR + +The bug is present in QEMU 2.8.1 and, if my analysis is correct, also on master. + +I first noticed that a PL011 UART driver, which is fine on real hardware, fails to enable the RX interrupt in the IMSC register when running in QEMU. However, the problem only comes up if the code is compiled without optimizations. I think I've narrowed it down to a minimal example that will exhibit the problem if run as a bare-metal application.
+ +Given: + +pl011_addr: .word 0x10009000 + +The following snippet will be problematic: + + ldr r3, pl011_addr + ldrb r2, [r3, #0x38] // IMSC + mov r2, #0 + orr r2, r2, #0x10 // R2 == 0x10 + strb r2, [r3, #0x38] // Whole word reads correctly after this + ldrb r2, [r3, #0x39] + mov r2, #0 + strb r2, [r3, #0x39] // Problem here! Overwrites offset 0x38 as well + +After the first strb instruction, which writes to 0x10009038, everything is fine. It can be seen in the QEMU monitor: + +(qemu) xp 0x10009038 +0000000010009038: 0x00000010 + +After the second strb instruction, the write to 0x10009039 clears the entire word: + +(qemu) xp 0x10009038 +0000000010009038: 0x00000000 + +QEMU command-line, using the vexpress-a9 which has the PL011 at 0x10009000: + +qemu-system-arm -S -M vexpress-a9 -m 32M -no-reboot -nographic -monitor telnet:127.0.0.1:1234,server,nowait -kernel pl011-sfr.bin -gdb tcp::2159 -serial mon:stdio + +Compiling the original C code with optimizations makes the driver work. It compiles down to assembly that only does a single write: + + ldr r3, pl011_addr + mov r2, #0x10 + str r2, [r3, #0x38] + +Attached is an assembly file, and link script, that shows the problem, and also includes the working code. + +I haven't debugged inside of QEMU itself but it seems to me that the problem is in pl011_write in pl011.c - the function looks at which offset is being written, and then writes the entire SFR that offset falls under, which means that changing a single byte will change the whole SFR. + + + +Adding the link script. + +Yes, our PL011 implementation assumes that you only ever access the 32-bit registers with full width 32-bit word reads and writes. Don't try to do byte accesses to them.
The PL011 data sheet doesn't specifically say that partial-width accesses to registers are permitted, so I think that trying to access offset 0x39 falls under the general note in section 3.1 that attempting to access reserved or unused address locations can result in unpredictable behaviour. + +You need to make sure you write your C code in a manner which enforces that accesses to device registers are done as single 32-bit accesses, and the compiler does not silently break them down into multiple reads and writes, or you will be in for a lot of pain trying to figure out what is going on if the compiler ever does it with registers that are write-to-clear or similar behaviour. Linux, for instance, does this by having readl() and writel() functions that end up doing inline asm of ldr/str instructions. + + +Thanks for the response. + +I don't think section 3.1 applies to 8-bit accesses. That is specifically about reserved locations, and neither offset 0x38 nor 0x39 are reserved, so I think it's a matter of whether 32-bit access is required or not. + +From what I usually see in ARM documentation, 32-bit access is explicitly mentioned when required. For the PL011, it's mentioned for the UARTPeriphID_n registers, for instance. In many other cases access size depends on the implementation and the corresponding memory mapping of that implementation. + +I understand that *in practice* you should ensure single-access writes unless doing otherwise is explicitly allowed. However, in cases like the PL011 it seems ambiguous whether that is actually required, so it seems like the best choice would be to explicitly document it for the QEMU implementation. That would save some guesswork. + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. 
+ +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1811711 b/results/classifier/zero-shot/105/semantic/1811711 new file mode 100644 index 000000000..50e572eaf --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1811711 @@ -0,0 +1,101 @@ +semantic: 0.867 +graphic: 0.863 +other: 0.854 +mistranslation: 0.834 +assembly: 0.832 +KVM: 0.793 +instruction: 0.790 +device: 0.790 +vnc: 0.772 +socket: 0.732 +boot: 0.709 +network: 0.684 + +qemu-img can not convert virtualbox virtual disk formats qcow + +Hello, I'm working with QEMU on macOS, and am experiencing issues working with the `qemu-img` command. + +Info +---- +$ sw_vers +ProductName: Mac OS X +ProductVersion: 10.13.6 +BuildVersion: 17G4015 + +VirtualBox +---------- +$ VBoxManage --version +6.0.0r127566 + +$ qemu-system-x86_64 --version +QEMU emulator version 3.1.50 (v3.1.0-rc2-745-g147923b1a9-dirty) +Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers + +$ qemu-img --version +qemu-img version 3.1.50 (v3.1.0-rc2-745-g147923b1a9-dirty) +Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers + +Steps to reproduce +------------------ + +> Prereq VirtualBox needs to be installed to run the `VBoxManage` command + +$ VBoxManage createmedium disk --filename vbox-vdisk-exp.qcow --format qcow --size 5 +0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100% +Medium created. 
UUID: e2b36955-3791-4c0e-93d4-913669b1d9fb + +$ file vbox-vdisk-exp.qcow +vbox-vdisk-exp.qcow: QEMU QCOW Image (v1), 5242880 bytes + +$ qemu-img info vbox-vdisk-exp.qcow +image: vbox-vdisk-exp.qcow +file format: qcow +virtual size: 5.0M (5242880 bytes) +disk size: 8.0K +cluster_size: 4096 + +# Convert vbox virtualdisk to qcow2 format using `qemu-img` +$ qemu-img convert -f qcow vbox-vdisk-exp.qcow -O qcow2 vbox-vdisk-exp-convert.qcow2 + +$ file vbox-vdisk-exp-convert.qcow2 +vbox-vdisk-exp-convert.qcow2: QEMU QCOW Image (v3), 5242880 bytes + +# Print info about qemu-img converted image from vbox created qcow image +$ qemu-img info vbox-vdisk-exp-convert.qcow2 +image: vbox-vdisk-exp-convert.qcow2 +file format: qcow2 +virtual size: 5.0M (5242880 bytes) +disk size: 196K +cluster_size: 65536 +Format specific information: + compat: 1.1 + lazy refcounts: false + refcount bits: 16 + corrupt: false + +# Print info about vbox created qcow image +$ qemu-img info vbox-vdisk-exp.qcow +image: vbox-vdisk-exp.qcow +file format: qcow +virtual size: 5.0M (5242880 bytes) +disk size: 8.0K +cluster_size: 4096 + +I've attached a zip file containing the vbox created qcow image along with the image that `qemu-img` converted. + + + +Hi, + +What exactly is the issue? All of that looks rather OK to me. + +Max + +This bug was related to an IRC discussion on Jan 14th but this bug description is not showing the problem that was raised on IRC. The IRC discussion showed a source image with a 9GB Windows 10 installation, turning into an image with only 8 MB of data present. The images in this bug description don't have any data written so are not illustrating the data conversion issue. + + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.
+If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1813305 b/results/classifier/zero-shot/105/semantic/1813305 new file mode 100644 index 000000000..c01815f65 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1813305 @@ -0,0 +1,75 @@ +semantic: 0.858 +mistranslation: 0.826 +device: 0.796 +socket: 0.735 +KVM: 0.735 +instruction: 0.726 +vnc: 0.713 +boot: 0.707 +network: 0.686 +assembly: 0.657 +other: 0.588 +graphic: 0.566 + +trace-root.h is not regenerated after re-configure + +Hi, + +I've just realized that after I reconfigured my qemu with +../configure --target-list=arm-softmmu,arm-linux-user,aarch64-softmmu,aarch64-linux-user --enable-trace-backends=simple + +$ make +did rebuild some stuff for the 'simple' trace, but it did not update trace-root.h until after I +$ make clean + + +It took me a while to understand why I didn't get the traces I wanted (my trace-root.h still thought it was configured for the default 'log'). + +I didn't check how easy it is to fix this in the build system. + +Thanks + +On Fri, Jan 25, 2019 at 02:03:39PM -0000, Christophe Lyon wrote: +> I've just realized that after I reconfigured my qemu with +> ../configure --target-list=arm-softmmu,arm-linux-user,aarch64-softmmu,aarch64-linux-user --enable-trace-backends=simple +> +> $ make +> did rebuild some stuff for the 'simple' trace, but it did not update trace-root.h until after I +> $ make clean +> +> +> It took me a while to understand why I didn't get the traces I wanted (my trace-root.h still thought it was configured for the default 'log').
+> +> I didn't check how easy it is to fix this in the build system. + +Thank you for reporting this. I have sent a patch to fix the makefile. + +Stefan + + +On Tue, 29 Jan 2019 at 03:55, Stefan Hajnoczi <email address hidden> wrote: +> +> On Fri, Jan 25, 2019 at 02:03:39PM -0000, Christophe Lyon wrote: +> > I've just realized that after I reconfigured my qemu with +> > ../configure --target-list=arm-softmmu,arm-linux-user,aarch64-softmmu,aarch64-linux-user --enable-trace-backends=simple +> > +> > $ make +> > did rebuild some stuff for the 'simple' trace, but it did not update trace-root.h until after I +> > $ make clean +> > +> > +> > It took me a while to understand why I didn't get the traces I wanted (my trace-root.h still thought it was configured for the default 'log'). +> > +> > I didn't check how easy it is to fix this in the build system. +> +> Thank you for reporting this. I have sent a patch to fix the makefile. +> + +Thanks for the quick patch. + +> Stefan + + +This was fixed by commit 57b7bdf426445d83561, which will be in the 4.0 release. + + diff --git a/results/classifier/zero-shot/105/semantic/1828508 b/results/classifier/zero-shot/105/semantic/1828508 new file mode 100644 index 000000000..44142187f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1828508 @@ -0,0 +1,108 @@ +semantic: 0.880 +assembly: 0.866 +other: 0.863 +graphic: 0.860 +mistranslation: 0.846 +socket: 0.836 +KVM: 0.833 +device: 0.831 +instruction: 0.828 +vnc: 0.809 +network: 0.792 +boot: 0.768 + +qemu-img created VMDK files lead to "Unsupported or invalid disk type 7" + +Using qemu-img version 3.1.50 (v3.1.0-13607-geb2db0f7ba-dirty) on a Windows 10 machine. + +Converting a VHD to VMDK.
+qemu-img.exe convert "c:\test\AppD-VM01.vhd" -O vmdk -o adapter_type=buslogic -p "c:\test\AppD-VM01.vmdk" + +I have also tried: +qemu-img.exe convert "c:\test\AppD-VM01.vhd" -O vmdk -o adapter_type=buslogic,hwversion=6 -p "c:\test\AppD-VM01.vmdk" + +Attaching the VMDK to a VM in VMware produces the following error when powering on. + +Power On virtual machine:Failed to open disk scsi0:1: Unsupported or invalid disk type 7. Ensure that the disk has been imported. +Target: MyVM1 +vCenter Server: VCENTER +Error Stack +An error was received from the ESX host while powering on VM MyVM1. +Failed to start the virtual machine. +Module DevicePowerOn power on failed. +Unable to create virtual SCSI device for scsi0:1, '/vmfs/volumes/5cca0155-bdddf31d-2714-00215acbeb1e/AppD-VM01/AppDdisk1-VM01.vmdk' +Failed to open disk scsi0:1: Unsupported or invalid disk type 7. Ensure that the disk has been imported. + + +If I do not specify the adapter type, it creates an IDE VMDK which works perfectly. + +On Fri, May 10, 2019 at 06:06:32AM -0000, Jake Mikelson wrote: +> Public bug reported: +> +> Using qemu-img version 3.1.50 (v3.1.0-13607-geb2db0f7ba-dirty) on a +> Windows 10 machine. +> +> Converting a VHD to VMDK. +> qemu-img.exe convert "c:\test\AppD-VM01.vhd" -O vmdk -o adapter_type=buslogic -p "c:\test\AppD-VM01.vmdk" +> +> I have also tried: +> qemu-img.exe convert "c:\test\AppD-VM01.vhd" -O vmdk -o adapter_type=buslogic,hwversion=6 -p "c:\test\AppD-VM01.vmdk" +> +> Attaching the VMDK to a VM in VMware produces the following error when +> powering on. +> +> Power On virtual machine:Failed to open disk scsi0:1: Unsupported or invalid disk type 7. Ensure that the disk has been imported. +> Target: MyVM1 +> vCenter Server: VCENTER +> Error Stack +> An error was received from the ESX host while powering on VM MyVM1. +> Failed to start the virtual machine. +> Module DevicePowerOn power on failed. 
+> Unable to create virtual SCSI device for scsi0:1, '/vmfs/volumes/5cca0155-bdddf31d-2714-00215acbeb1e/AppD-VM01/AppDdisk1-VM01.vmdk' +> Failed to open disk scsi0:1: Unsupported or invalid disk type 7. Ensure that the disk has been imported. +> +> +> If I do not specify the adapter type, it creates an IDE VMDK which works perfectly. +> +> ** Affects: qemu +> Importance: Undecided +> Status: New +> +> -- +> You received this bug notification because you are a member of qemu- +> devel-ml, which is subscribed to QEMU. +> https://bugs.launchpad.net/bugs/1828508 + +Which version of VMware are you running? + +Stefan + + +Hi, I'm running 5.5. + +I've been playing around with some of the options, and if I run the command below, I end up with 2 files. + +qemu-img.exe convert "c:\test\AppD-VM01.vhd" -O vmdk -o adapter_type=lsilogic,subformat=monolithicFlat -p "c:\test\AppD-VM01.vmdk" + +The files I get are: +AppD-VM01.vmdk (which is always 12kb) +AppD-VM01-flat.vmdk (which is the full size of the disk, e.g. 30GB). + +If I then upload both of these files to the datastore, they somehow merge into 1 and I can attach the disk and power on the VM. If you don't upload both files into the datastore, VMware does not recognise the disk. + +This is the only method that seems to work for me. + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.]
+ diff --git a/results/classifier/zero-shot/105/semantic/1829964 b/results/classifier/zero-shot/105/semantic/1829964 new file mode 100644 index 000000000..4418dec4e --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1829964 @@ -0,0 +1,101 @@ +semantic: 0.922 +other: 0.913 +graphic: 0.892 +mistranslation: 0.887 +device: 0.874 +assembly: 0.865 +boot: 0.845 +instruction: 0.807 +KVM: 0.749 +network: 0.736 +socket: 0.691 +vnc: 0.597 + +HOST VRAM leak when performing android-x86 window rotation with Virt-GPU + +I want to report a strange host VRAM leak after android-x86 window rotation when it runs with virt-gpu (+ virgl-renderer). + +Please watch the video linked below. + +https://www.youtube.com/watch?v=mJIbGZLWF1s&feature=youtu.be + +(original video file: https://drive.google.com/file/d/1lkdTx_8yTbSVjKXlnxnnk96fWe-w6Mxb/view?usp=sharing) + +I'm not sure what the problem is... + +Here is my test history +-------------------------------------------------------------------------------------------------- +Install android-x86 on I7 desktop PCs with intel UHD GPU - No leak. +Install android-x86 on I7 desktop PCs with NVIDIA GTX GPU series - No leak. +Install android-x86 on guest machine emulated skylake cpu with QEMU(+virt-gpu, virgl-renderer) - Leak +(HOST CPU - I5, INTEL UHD GPU) +Install android-x86 on guest machine emulated skylake cpu with QEMU(+virt-gpu, virgl-renderer) - Leak +(HOST CPU - I7, NVIDIA GTX GPU) + +COMMON: +In case of NVIDIA GPU : check vram using nvidia-smi +In case of intel UHD GPU : check shared-vram using free cmd + +We confirmed that the guest android-x86 system goes down when vram is full after performing many rotations +------------------------------------------------------------------------------------------- + +Is it a virt-gpu driver problem? + +I hope someone can help me... + +Thanks in advance!! + +Here are the qemu options I used...
+ +-machine type=q35,accel=kvm -cpu host --enable-kvm \ +-smp cpus=4,cores=4,threads=1 -m 4096 \ +-drive file=ctb0319.qcow2,format=qcow2,if=virtio,aio=threads \ +-device virtio-vga,virgl=on \ +-device qemu-xhci,id=xhci -device usb-mouse,bus=xhci.0 -device usb-kbd,bus=xhci.0 \ +-soundhw hda -display sdl,gl=on -netdev user,id=qemunet0,hostfwd=tcp::4000-:7000,hostfwd=tcp::5555-:5555,hostfwd=tcp::4012-:7012,hostfwd=tcp::4013-:7013 -device virtio-net,netdev=qemunet0 -boot menu=on + +This is the *upstream* QEMU bug tracker here. If you've got a problem with the android emulator, please report these problems to the android emulator project instead. Thanks. + +To Thomas Huth, + +This is not an android problem; it is a qemu or virt-gpu problem. +-------------------- our test log -------------------------------------- +Running android-x86 on I7 bare metal desktop PCs with intel UHD GPU - No leak. +Running android-x86 on QEMU(+virt-gpu, virgl-renderer) - Leak +------------------------------------------------------------------------ + +Also, a Linux guest has the same leak after window manager rotation. + +Ok, sorry, got that wrong - we sometimes get bug reports about the android emulator (which is a fork of QEMU) here, and at a first glance, your bug report looked like one of these misguided bug tickets, too. + +Anyway, please provide some more information: Which version of QEMU are you using? Which operating system are you running in QEMU? + +I tested many qemu & linux versions.... + +in case of qemu, +2.12 +3.10 +3.12 +4.0.0 +All versions I tested have the same problem.... + +also I tested many versions of linux +ubuntu 18.04 18.10 +centos 7 +fedora 18 19 +rhel + +Actually it is not only a problem with window rotation; if the home launcher is refreshed, vram usage also goes up... + +I think it is related to gl functions... + +so I'm not sure whether it is a qemu virt-gpu problem or a virt-gpu driver problem...
+ +That is why I have already reported this problem to the android-x86 devel forum and the author of the virt-gpu driver... + + + + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1834 b/results/classifier/zero-shot/105/semantic/1834 new file mode 100644 index 000000000..6965702f6 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1834 @@ -0,0 +1,197 @@ +semantic: 0.842 +other: 0.832 +device: 0.802 +assembly: 0.793 +boot: 0.772 +instruction: 0.761 +network: 0.728 +vnc: 0.707 +graphic: 0.707 +KVM: 0.698 +socket: 0.681 +mistranslation: 0.607 + +qemu-system-x86_64: ../hw/pci/msix.c:227: msix_table_mmio_write: Assertion `addr + size <= dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE' failed. +Description of problem: + +Steps to reproduce: +1. Run qemu using the provided command line +2. The Linux kernel boots and qemu crashes at the PCI bus scan step +3. +Additional information: +``` +SeaBIOS (version rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org +iPXE (http://ipxe.org) 00:02.0 CA00 PCI2.10 PnP PMM+3EFD0CE0+3EF30CE0 CA00 +iPXE (http://ipxe.org) 00:05.0 CB00 PCI2.10 PnP PMM+3EF1FCE0 3EF30CE0 CB00 +Booting from ROM...
+[ 0.000000] Linux version 6.1.38-yocto-standard (oe-user@oe-host) (x86_64-poky-linux-gcc (GCC) 12.3.0, GNU ld (GNU Binutils) 2.40.0.20230620) #1 SMP PREEMPT_DYNAMIC Thu Jul 6 18:52:54 UTC 2023 +[ 0.000000] Command line: console=ttyS0 +[ 0.000000] x86/fpu: x87 FPU will use FXSAVE +[ 0.000000] signal: max sigframe size: 1040 +[ 0.000000] BIOS-provided physical RAM map: +[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable +[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved +[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved +[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003ffdefff] usable +[ 0.000000] BIOS-e820: [mem 0x000000003ffdf000-0x000000003fffffff] reserved +[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved +[ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved +[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved +[ 0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved +[ 0.000000] NX (Execute Disable) protection: active +[ 0.000000] SMBIOS 3.0.0 present. 
+[ 0.000000] DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 +[ 0.000000] last_pfn = 0x3ffdf max_arch_pfn = 0x400000000 +[ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT +[ 0.000000] found SMP MP-table at [mem 0x000f5b80-0x000f5b8f] +[ 0.000000] ACPI: Early table checksum verification disabled +[ 0.000000] ACPI: RSDP 0x00000000000F59A0 000014 (v00 BOCHS ) +[ 0.000000] ACPI: RSDT 0x000000003FFE238A 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: FACP 0x000000003FFE217A 0000F4 (v03 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: DSDT 0x000000003FFE0040 00213A (v01 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: FACS 0x000000003FFE0000 000040 +[ 0.000000] ACPI: APIC 0x000000003FFE226E 000080 (v03 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: HPET 0x000000003FFE22EE 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: MCFG 0x000000003FFE2326 00003C (v01 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: WAET 0x000000003FFE2362 000028 (v01 BOCHS BXPC 00000001 BXPC 00000001) +[ 0.000000] ACPI: Reserving FACP table memory at [mem 0x3ffe217a-0x3ffe226d] +[ 0.000000] ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe2179] +[ 0.000000] ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f] +[ 0.000000] ACPI: Reserving APIC table memory at [mem 0x3ffe226e-0x3ffe22ed] +[ 0.000000] ACPI: Reserving HPET table memory at [mem 0x3ffe22ee-0x3ffe2325] +[ 0.000000] ACPI: Reserving MCFG table memory at [mem 0x3ffe2326-0x3ffe2361] +[ 0.000000] ACPI: Reserving WAET table memory at [mem 0x3ffe2362-0x3ffe2389] +[ 0.000000] Zone ranges: +[ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff] +[ 0.000000] DMA32 [mem 0x0000000001000000-0x000000003ffdefff] +[ 0.000000] Normal empty +[ 0.000000] Device empty
+[ 0.000000] Movable zone start for each node +[ 0.000000] Early memory node ranges +[ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009efff] +[ 0.000000] node 0: [mem 0x0000000000100000-0x000000003ffdefff] +[ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000003ffdefff] +[ 0.000000] On node 0, zone DMA: 1 pages in unavailable ranges +[ 0.000000] On node 0, zone DMA: 97 pages in unavailable ranges +[ 0.000000] On node 0, zone DMA32: 33 pages in unavailable ranges +[ 0.000000] ACPI: PM-Timer IO Port: 0x608 +[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1]) +[ 0.000000] IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23 +[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) +[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) +[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) +[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) +[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) +[ 0.000000] ACPI: Using ACPI (MADT) for SMP configuration information +[ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 +[ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs +[ 0.000000] [mem 0x40000000-0xafffffff] available for PCI devices +[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns +[ 0.000000] setup_percpu: NR_CPUS:8 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1 +[ 0.000000] percpu: Embedded 52 pages/cpu s173288 r8192 d31512 u1048576 +[ 0.000000] Built 1 zonelists, mobility grouping on. 
Total pages: 257759 +[ 0.000000] Kernel command line: console=ttyS0 +[ 0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear) +[ 0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear) +[ 0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off +[ 0.000000] Memory: 1002116K/1048052K available (12294K kernel code, 1469K rwdata, 2600K rodata, 1488K init, 2040K bss, 45680K reserved, 0K cma-reserved) +[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1 +[ 0.000000] ftrace: allocating 31276 entries in 123 pages +[ 0.000000] ftrace: allocated 123 pages with 6 groups +[ 0.000000] Dynamic Preempt: none +[ 0.000000] rcu: Preemptible hierarchical RCU implementation. +[ 0.000000] rcu: RCU event tracing is enabled. +[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=2. +[ 0.000000] Trampoline variant of Tasks RCU enabled. +[ 0.000000] Rude variant of Tasks RCU enabled. +[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies. +[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2 +[ 0.000000] NR_IRQS: 4352, nr_irqs: 440, preallocated irqs: 16 +[ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
+[ 0.000000] Console: colour VGA+ 80x25 +[ 0.000000] printk: console [ttyS0] enabled +[ 0.000000] ACPI: Core revision 20220331 +[ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns +[ 0.020000] APIC: Switch to symmetric I/O mode setup +[ 0.040000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 +[ 0.120000] tsc: Unable to calibrate against PIT +[ 0.120000] tsc: using HPET reference calibration +[ 0.120000] tsc: Detected 2299.960 MHz processor +[ 0.001362] tsc: Marking TSC unstable due to TSCs unsynchronized +[ 0.002851] Calibrating delay loop (skipped), value calculated using timer frequency.. 4599.92 BogoMIPS (lpj=22999600) +[ 0.004441] pid_max: default: 32768 minimum: 301 +[ 0.019780] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear) +[ 0.020332] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear) +[ 0.078474] process: using AMD E400 aware idle routine +[ 0.079221] Last level iTLB entries: 4KB 512, 2MB 255, 4MB 127 +[ 0.079631] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0 +[ 0.081092] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization +[ 0.082698] Spectre V2 : Mitigation: Retpolines +[ 0.083053] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch +[ 0.083616] Spectre V2 : Spectre v2 / SpectreRSB : Filling RSB on VMEXIT +[ 0.348864] Freeing SMP alternatives memory: 32K +[ 0.514732] smpboot: CPU0: AMD QEMU Virtual CPU version 2.5+ (family: 0xf, model: 0x6b, stepping: 0x1) +[ 0.536546] cblist_init_generic: Setting adjustable number of callback queues. +[ 0.537604] cblist_init_generic: Setting shift to 1 and lim to 1. +[ 0.538995] cblist_init_generic: Setting shift to 1 and lim to 1. +[ 0.541338] Performance Events: PMU not available due to virtualization, using software events only. +[ 0.548504] rcu: Hierarchical SRCU implementation. +[ 0.548986] rcu: Max phase no-delay instances is 1000. 
+[ 0.563842] smp: Bringing up secondary CPUs ... +[ 0.583950] x86: Booting SMP configuration: +[ 0.584395] .... node #0, CPUs: #1 +[ 0.802667] smp: Brought up 1 node, 2 CPUs +[ 0.803300] smpboot: Max logical packages: 1 +[ 0.803821] smpboot: Total of 2 processors activated (9202.49 BogoMIPS) +[ 0.864556] devtmpfs: initialized +[ 0.897545] x86/mm: Memory block size: 128MB +[ 0.936982] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns +[ 0.938878] futex hash table entries: 512 (order: 3, 32768 bytes, linear) +[ 0.980994] NET: Registered PF_NETLINK/PF_ROUTE protocol family +[ 1.004001] thermal_sys: Registered thermal governor 'step_wise' +[ 1.004143] thermal_sys: Registered thermal governor 'user_space' +[ 1.009528] cpuidle: using governor menu +[ 1.022723] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 +[ 1.043717] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xb0000000-0xbfffffff] (base 0xb0000000) +[ 1.050546] PCI: MMCONFIG at [mem 0xb0000000-0xbfffffff] reserved in E820 +[ 1.060576] PCI: Using configuration type 1 for base access +[ 1.074215] mtrr: your CPUs had inconsistent fixed MTRR settings +[ 1.075157] mtrr: your CPUs had inconsistent variable MTRR settings +[ 1.076043] mtrr: your CPUs had inconsistent MTRRdefType settings +[ 1.076840] mtrr: probably your BIOS does not setup all CPUs. +[ 1.077612] mtrr: corrected configuration. 
+[ 1.453630] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages +[ 1.454286] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page +[ 1.467152] raid6: skipped pq benchmark and selected sse2x4 +[ 1.467152] raid6: using intx1 recovery algorithm +[ 1.485004] ACPI: Added _OSI(Module Device) +[ 1.485539] ACPI: Added _OSI(Processor Device) +[ 1.485909] ACPI: Added _OSI(3.0 _SCP Extensions) +[ 1.486309] ACPI: Added _OSI(Processor Aggregator Device) +[ 1.578101] ACPI: 1 ACPI AML tables successfully acquired and loaded +[ 1.670966] ACPI: Interpreter enabled +[ 1.676848] ACPI: PM: (supports S0 S3 S5) +[ 1.677404] ACPI: Using IOAPIC for interrupt routing +[ 1.683268] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug +[ 1.684107] PCI: Using E820 reservations for host bridge windows +[ 1.691382] ACPI: Enabled 2 GPEs in block 00 to 3F +[ 1.828171] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff]) +[ 1.831923] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] +[ 1.839401] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug LTR DPC] +[ 1.843631] acpi PNP0A08:00: _OSC: OS now controls [SHPCHotplug PME AER PCIeCapability] +[ 1.867627] PCI host bridge to bus 0000:00 +[ 1.868866] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window] +[ 1.870044] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window] +[ 1.870572] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window] +[ 1.871151] pci_bus 0000:00: root bus resource [mem 0x40000000-0xafffffff window] +[ 1.871719] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window] +[ 1.872269] pci_bus 0000:00: root bus resource [mem 0x100000000-0x8ffffffff window] +[ 1.873668] pci_bus 0000:00: root bus resource [bus 00-ff] +[ 1.880983] pci 0000:00:00.0: [8086:29c0] type 00 class 0x060000 +[ 1.898659] pci 0000:00:01.0: [1234:1111] type 00 class 0x030000 +qemu-system-x86_64: ../hw/pci/msix.c:227: 
msix_table_mmio_write: Assertion `addr + size <= dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE' failed. +``` diff --git a/results/classifier/zero-shot/105/semantic/1843151 b/results/classifier/zero-shot/105/semantic/1843151 new file mode 100644 index 000000000..637891d5c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1843151 @@ -0,0 +1,237 @@ +semantic: 0.773 +other: 0.744 +mistranslation: 0.733 +KVM: 0.709 +device: 0.693 +network: 0.684 +vnc: 0.647 +graphic: 0.601 +assembly: 0.584 +instruction: 0.563 +boot: 0.457 +socket: 0.361 + +Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10 + +Host is Arch Linux. linux 5.2.13, qemu 4.1.0. + +Guest is Arch Linux Sept 2019 ISO. linux 5.2.11. + +Have replicated this both on a system using amdgpu and one using integrated ASPEED graphics. + +Downgrading from 4.1.0 to 4.0.0 works as usual, see: https://www.youtube.com/watch?v=NyMdcYwOCvY + +Going back to 4.1.0 reproduces the problem, see: https://www.youtube.com/watch?v=H3nGG2Mk6i0 + +4.1.0 displays fine until KMS kicks in. + +Hi, + Can you give the full qemu commandline you're using on 4.1.0 please? +(If you're starting it using libvirt/virsh then please include the xml description file for the VM). + + +Finding a minimal case did shed some light on this.
+ +Using QEMU's native graphics window, this works fine: + +$ /usr/bin/qemu-system-x86_64 \ + -m 1G \ + -blockdev raw,node-name=install_iso,read-only=on,file.driver=file,file.filename=/mnt/losable/ISOs/archlinux-2019.09.01-x86_64.iso \ + -device ide-cd,drive=install_iso,bus=ide.0,bootindex=0 + +But, introducing spice reproduces the problem: + +$ /usr/bin/qemu-system-x86_64 \ + -m 1G \ + -blockdev raw,node-name=install_iso,read-only=on,file.driver=file,file.filename=/mnt/losable/ISOs/archlinux-2019.09.01-x86_64.iso \ + -spice unix,addr=/tmp/spice.qxl.sock,disable-ticketing \ + -device ide-cd,drive=install_iso,bus=ide.0,bootindex=0 \ + -vga qxl + +$ remote-viewer "spice+unix:///tmp/spice.qxl.sock" + +I've been running remote-viewer (from the virt-viewer package) version 8.0 since around March 13. The problem only appears after upgrading QEMU from 4.0.0 to 4.1.0. + +Running remote-viewer this way also shows that it outputs these, right when KMS changes resolution: + +(remote-viewer:15090): GLib-GObject-WARNING **: 23:56:03.914: value "64" of type 'gint' is invalid or out of range for property 'desktop-width' of type 'gint' + +(remote-viewer:15090): GLib-GObject-WARNING **: 23:56:03.915: value "64" of type 'gint' is invalid or out of range for property 'desktop-height' of type 'gint' + +When downgrading to QEMU 4.0.0, remote-viewer STILL outputs these lines regarding desktop-width and height, when KMS changes resolution. + +In case it helps, below are spice-debug logs from remote-viewer. I've included the whole log, but also added a bunch of spacing and a header showing the second's worth of output correlating with the KMS resolution change. + +QEMU 4.0.0 without the bug: http://ix.io/1USn + +QEMU 4.1.0 with the bug: http://ix.io/1USo + +So, it's always possible the fix might need to be in remote-viewer, but at minimum, the case it would need to handle properly wasn't being given to it until QEMU 4.1.0.
+ +Comparing the spice debug logs, where I see this with QEMU 4.0.0 without the bug: + +(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1979 display-2:0: received new monitors config from guest: n: 1/4 +(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1997 display-2:0: monitor id: 0, surface id: 0, +0+0-1024x768 + +I see this with QEMU 4.1.0 with the bug: + +(remote-viewer:19896): GSpice-DEBUG: 00:07:40.019: channel-display.c:1975 display-2:0: received empty monitor config +(remote-viewer:19896): GSpice-DEBUG: 00:07:40.049: channel-cursor.c:542 cursor-4:0: cursor_handle_reset, init_done: 1 +(remote-viewer:19896): GSpice-DEBUG: 00:07:40.049: channel-display.c:1951 display-2:0: 0: FIXME primary destroy, but is display really disabled? + +Sorry, in comment #2 for the native graphics window command line, I copied from the wrong trial. The argument for QXL should have been included, because that works with a native graphics window: + + (...bootindex=0) \ + -vga qxl + +Hi James, + OK, thanks - some questions: + a) What version of spice-server have you got on your host? + b) Does swapping the '-vga qxl' for '-device qxl-vga,max_outputs=1' help? (try with and without the max_outputs=1) + c) Are you able to do bisect builds to try and track down which commit broke it? + + + +a) spice 0.14.2. Also spice-gtk 0.37, and spice-protocol 0.14.0. + +b) Swapping with "-device qxl-vga,max_outputs=1" does fix the problem. Swapping with "-device qxl-vga" still has the bug. + +c) Knowing b, would the bisect still help? If needed, sure, I will. + +OK that's interesting - I've got another bug I've been following that's also fixed by (b). + +A bisect would still be interesting; but one place to start might be to try before and after commit +be812c0 + +Bisection is not going well at all with this code base! + +Before your last reply, I started, and the first between 4.0.0 and 4.1.0 is aae6500972 which fails compilation: + +========== + +... 
+ CC stubs/pci-host-piix.o + CC stubs/ram-block.o + CC stubs/ramfb.o + CC stubs/fw_cfg.o + CC stubs/semihost.o + CC qemu-keymap.o + CC util/filemonitor-stub.o + +Warning, treated as error: +/build/qemu-bisect/src/qemu/docs/interop/bitmaps.rst:202:Could not lex literal_block as "json". Highlighting skipped. + CC ui/input-keymap.o + CC contrib/elf2dmp/main.o + CC contrib/elf2dmp/addrspace.o + CC contrib/elf2dmp/download.o + CC contrib/elf2dmp/pdb.o + CC contrib/elf2dmp/qemu_elf.o + CC contrib/ivshmem-client/ivshmem-client.o + CC contrib/ivshmem-client/main.o + CC contrib/ivshmem-server/ivshmem-server.o + +========== + +I tried just marking it as good and hoping it was a more recent regression, instead of even doing a skip, but efa85a4d1a fails with the same error. I double checked that 4.0.0 and 4.1.0 still get past that spot for me, and they do. + +I tried your suggestion, be812c0, but that compiled with this error: + +========== + + CC crypto/cipher.o + CC crypto/tlscreds.o + CC crypto/tlscredsanon.o +/build/qemu-bisect/src/qemu/block/gluster.c: In function ‘qemu_gluster_co_pwrite_zeroes’: +/build/qemu-bisect/src/qemu/block/gluster.c:994:52: warning: passing argument 4 of ‘glfs_zerofill_async’ from incompatible pointer type [-Wincompatible-pointer +-types] + 994 | ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, &acb); + | ^~~~~~~~~~~~~~~~~~~~ + | | + | void (*)(struct glfs_fd *, ssize_t, void *) {aka void (*)(struct glfs_fd *, long int, void *)} +In file included from /build/qemu-bisect/src/qemu/block/gluster.c:12: +/usr/include/glusterfs/api/glfs.h:993:73: note: expected ‘glfs_io_cbk’ {aka ‘void (*)(struct glfs_fd *, long int, struct glfs_stat *, struct glfs_stat *, void + *)’} but argument is of type ‘void (*)(struct glfs_fd *, ssize_t, void *)’ {aka ‘void (*)(struct glfs_fd *, long int, void *)’} + 993 | glfs_zerofill_async(glfs_fd_t *fd, off_t length, off_t len, glfs_io_cbk fn, + | ~~~~~~~~~~~~^~ 
+/build/qemu-bisect/src/qemu/block/gluster.c: In function ‘qemu_gluster_do_truncate’: +/build/qemu-bisect/src/qemu/block/gluster.c:1035:13: error: too few arguments to function ‘glfs_ftruncate’ + 1035 | if (glfs_ftruncate(fd, offset)) { + | ^~~~~~~~~~~~~~ +In file included from /build/qemu-bisect/src/qemu/block/gluster.c:12: +/usr/include/glusterfs/api/glfs.h:768:1: note: declared here + 768 | glfs_ftruncate(glfs_fd_t *fd, off_t length, struct glfs_stat *prestat, + | ^~~~~~~~~~~~~~ +/build/qemu-bisect/src/qemu/block/gluster.c:1046:13: error: too few arguments to function ‘glfs_ftruncate’ + 1046 | if (glfs_ftruncate(fd, offset)) { + | ^~~~~~~~~~~~~~ + +========== + +So, I looked at configure and saw a "--disable-glusterfs" option, and tried it. It still failed with: + +========== + + GEN it.mo + GEN bg.mo + GEN fr_FR.mo + GEN zh_CN.mo + GEN de_DE.mo + GEN hu.mo + GEN tr.mo +for obj in hu.mo tr.mo it.mo bg.mo fr_FR.mo zh_CN.mo de_DE.mo; do \ + base=$(basename $obj .mo); \ + install -d /build/qemu-bisect/pkg/qemu-bisect/usr/share/locale/$base/LC_MESSAGES; \ + install -m644 $obj /build/qemu-bisect/pkg/qemu-bisect/usr/share/locale/$base/LC_MESSAGES/qemu.mo; \ +done +make[1]: Leaving directory '/build/qemu-bisect/src/build-full/po' +install -d -m 0755 "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/keymaps" +set -e; for x in da en-gb et fr fr-ch is lt no pt-br sv ar de en-us fi fr-be hr it lv nl pl ru th de-ch es fo fr-ca hu ja mk pt sl tr bepo cz; do \ + install -c -m 0644 /build/qemu-bisect/src/qemu/pc-bios/keymaps/$x "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/keymaps"; \ +done +install -c -m 0644 /build/qemu-bisect/src/build-full/trace-events-all "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/trace-events-all" +for d in aarch64-softmmu alpha-softmmu arm-softmmu cris-softmmu hppa-softmmu i386-softmmu lm32-softmmu m68k-softmmu microblazeel-softmmu microblaze-softmmu mips64el-softmmu mips64-softmmu mipsel-softmmu mips-softmmu moxie-softmmu nios2-softmmu 
or1k-softmmu ppc64-softmmu ppc-softmmu riscv32-softmmu riscv64-softmmu s390x-softmmu sh4eb-softmmu sh4-softmmu sparc64-softmmu sparc-softmmu tricore-softmmu unicore32-softmmu x86_64-softmmu xtensaeb-softmmu xtensa-softmmu aarch64_be-linux-user aarch64-linux-user alpha-linux-user armeb-linux-user arm-linux-user cris-linux-user hppa-linux-user i386-linux-user m68k-linux-user microblazeel-linux-user microblaze-linux-user mips64el-linux-user mips64-linux-user mipsel-linux-user mips-linux-user mipsn32el-linux-user mipsn32-linux-user nios2-linux-user +or1k-linux-user ppc64abi32-linux-user ppc64le-linux-user ppc64-linux-user ppc-linux-user riscv32-linux-user riscv64-linux-user s390x-linux-user sh4eb-linux-user sh4-linux-user sparc32plus-linux-user sparc64-linux-user sparc-linux-user tilegx-linux-user x86_64-linux-user xtensaeb-linux-user xtensa-linux-user; do \ +make --no-print-directory --quiet BUILD_DIR=/build/qemu-bisect/src/build-full TARGET_DIR=$d/ -C $d install || exit 1 ; \ + done +make: Leaving directory '/build/qemu-bisect/src/build-full' +rm: cannot remove 'qemu/block-gluster.so': No such file or directory + +========== + +All of these builds are in clean chroot environments, so they're starting from source scratch builds without interference between previous attempts. + +P.S. Looks like I can use --disable-docs to hopefully get around the json parsing error, but that still doesn't help with the gluster error or that something is still looking the .so given --disable-glusterfs. + +hmm, disable-glusterfs *should* work around that; sometimes it's worth nuking the build directory and trying a fresh configure with these things. + + +Sorry, my #8 was really long. All builds I've done were in clean chroots, so starting from scratch with just git source, with no interference from other builds. Also later in #8, I show that --disable-glusterfs doesn't work because some part of the build looks for the .so that was never built. 
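A minimal sketch of how unbuildable commits could be skipped rather than marked good during the bisect discussed above. It relies on `git bisect run` treating exit code 125 as "skip this commit". The configure flags are the ones mentioned in this thread; the script layout, the function names and the idea of passing a test command on the command line are assumptions for illustration, and the spice symptom itself is visual, so the test step may still have to be done by hand.

```python
#!/usr/bin/env python3
"""Hypothetical helper for `git bisect run`: skip commits that do not build."""
import subprocess
import sys

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Flags taken from this thread; adjust them for your own environment.
CONFIGURE = ["./configure", "--target-list=x86_64-softmmu",
             "--disable-docs", "--disable-glusterfs"]


def verdict(build_ok, bug_present):
    """Map a build result and a test result to a bisect exit code."""
    if not build_ok:
        return SKIP  # cannot judge this commit, let bisect try a neighbour
    return BAD if bug_present else GOOD


def build():
    """Configure and build the current checkout; return True on success."""
    for cmd in (CONFIGURE, ["make", "-j8"]):
        if subprocess.run(cmd).returncode != 0:
            return False
    return True


def main(test_cmd):
    """Build, then run the caller-supplied test (non-zero exit = bug seen)."""
    if not build():
        return verdict(False, False)
    bug_present = subprocess.run(test_cmd).returncode != 0
    return verdict(True, bug_present)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: bisect_step.py TEST_COMMAND [ARGS...]", file=sys.stderr)
        sys.exit(255)  # codes above 127 make `git bisect run` stop
    sys.exit(main(sys.argv[1:]))
```

It would be used after `git bisect start v4.1.0 v4.0.0`, as `git bisect run python3 bisect_step.py ./my-test.sh`, where `my-test.sh` is a placeholder for whatever check detects the regression.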
+ +Luckily, be812c0 was easy enough to just manually revert on top of 4.1.0. + +And, good news. (I hope!) 4.1.0 with be812c0 manually reverted on top of it prevents the bug, even WITHOUT "max_outputs=1". + +James: Freddy proposed a spice fix for the bug I was looking at: + https://lists.freedesktop.org/archives/spice-devel/2019-September/050859.html +That's in the spice-server package. + +If you can check that it also fixes your bug that would be great. + +Yes, I first replicated the issue by removing "max_outputs=1", then patched spice server, and the issue no longer happens. + +QEMU 4.1.0 still changed something. If I understand correctly, it's now in some circumstances saying there are 0 monitors, even though there's a graphics card? + +Fixing this in spice to effectively ignore being told 0, and go with 1 instead, gets around the bug, but still makes me think there's something wrong in QEMU 4.1.0. Granted, perhaps with this spice fix, it might not cause any negative effects anymore. + +But, I don't know if there are any third party applications especially on Windows that don't use upstream spice-server and might be thrown off by this in a similar way. So, I wonder if QEMU 4.1.0 should still have something fixed. + +Hi James, + The change in QEMU in 4.1 is that it's using a newer spice interface; Freddy is on our spice team and we chatted about whether to change QEMU but they thought it best to fix Spice to be more tolerant; so I'm happy to go with that recommendation. + +Has this ever been fixed on the QEMU or Spice side? + +Anyway, the QEMU project is currently considering moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". 
Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1843941 b/results/classifier/zero-shot/105/semantic/1843941 new file mode 100644 index 000000000..33d021deb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1843941 @@ -0,0 +1,28 @@ +semantic: 0.708 +device: 0.667 +graphic: 0.607 +network: 0.594 +mistranslation: 0.550 +other: 0.461 +instruction: 0.455 +socket: 0.453 +vnc: 0.416 +boot: 0.364 +assembly: 0.195 +KVM: 0.123 + +RBD Namespaces are not supported + +Ceph Nautilus (v14.2.0) introduced the Namespaces concept for RADOS Block Devices. This provides a logical separation within a RADOS Pool for RBD images which enables granular access control. See https://docs.ceph.com/docs/nautilus/releases/nautilus/ for additional details. + +librados and librbd support this, however qemu does not. The rbd man page defines how rbd images within a namespace can be referenced. https://docs.ceph.com/docs/nautilus/man/8/rbd/#image-snap-group-and-journal-specs + +Adding support for RBD namespaces would be beneficial for security and reducing the impact of a hypervisor being compromised and putting an entire Ceph pool or cluster at risk. + +I just posted a patch today on the qemu-devel mailing list, you can find it there : https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04344.html + +Thanks for adding the support. I was actually already play-testing your patch. I'll respond to the mailing list soon. 
+ +Patch had been included here: +https://gitlab.com/qemu-project/qemu/-/commit/19ae9ae01471552 + diff --git a/results/classifier/zero-shot/105/semantic/1846 b/results/classifier/zero-shot/105/semantic/1846 new file mode 100644 index 000000000..45c0142cb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1846 @@ -0,0 +1,38 @@ +semantic: 0.875 +mistranslation: 0.851 +graphic: 0.818 +other: 0.817 +device: 0.729 +assembly: 0.723 +instruction: 0.684 +network: 0.531 +vnc: 0.512 +boot: 0.471 +socket: 0.382 +KVM: 0.354 + +Regression in q35 avocado tests due to fix for misaligned IO access +Description of problem: +Generally I'm seeing intermittent hangs, somewhere after the clock initialisation. + +``` +[ 4.137020] ALSA device list: +[ 4.137861] No soundcards found. +[ 4.634128] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3 +[ 24.085574] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: +[ 24.085712] rcu: 0-...!: (0 ticks this GP) idle=4d18/0/0x0 softirq=54/54 fqs=0 (false positive?) +[ 24.085712] (detected by 1, t=21004 jiffies, g=-1003, q=2151 ncpus=2) +[ 24.085712] Sending NMI from CPU 1 to CPUs 0: +[ 4.647507] NMI backtrace for cpu 0 +[ 4.647507] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.11 #5 +[ 4.647507] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 +[ 4.647507] RIP: 0010:amd_e400_idle+0x39/0x40 +[ 4.647507] Code: 00 e8 fb ab 0d 00 eb 07 0f 00 2d c2 7d 1d 01 fb f4 fa 31 ff e8 e8 ab 0d 00 fb c3 cc cc cc cc eb 07 0f 00 2d a9 7d 1d 01 fb f4 <c3> cc cc cc cc 66 90 bf +01 00 00 00 e8 a6 e4 06 00 65 48 8b 04 25 +``` + +In avocado the hang generally times out and the test fails. +Steps to reproduce: +See above command line. It's racy. 
+Additional information: + diff --git a/results/classifier/zero-shot/105/semantic/1846427 b/results/classifier/zero-shot/105/semantic/1846427 new file mode 100644 index 000000000..56273c94d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1846427 @@ -0,0 +1,721 @@ +semantic: 0.793 +other: 0.700 +graphic: 0.683 +instruction: 0.627 +device: 0.616 +network: 0.615 +assembly: 0.569 +vnc: 0.550 +mistranslation: 0.543 +boot: 0.532 +socket: 0.491 +KVM: 0.321 + +4.1.0: qcow2 corruption on savevm/quit/loadvm cycle + +I'm seeing massive corruption of qcow2 images with qemu 4.1.0 and git master as of 7f21573c822805a8e6be379d9bcf3ad9effef3dc after a few savevm/quit/loadvm cycles. I've narrowed it down to the following reproducer (further notes below): + +# qemu-img check debian.qcow2 +No errors were found on the image. +251601/327680 = 76.78% allocated, 1.63% fragmented, 0.00% compressed clusters +Image end offset: 18340446208 +# bin/qemu/bin/qemu-system-x86_64 -machine pc-q35-4.0.1,accel=kvm -m 4096 -chardev stdio,id=charmonitor -mon chardev=charmonitor -drive file=debian.qcow2,id=d -S +qemu-system-x86_64: warning: dbind: Couldn't register with accessibility bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. +QEMU 4.1.50 monitor - type 'help' for more information +(qemu) loadvm foo +(qemu) c +(qemu) qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +quit +[m@nargothrond:~] qemu-img check debian.qcow2 +Leaked cluster 85179 refcount=2 reference=1 +Leaked cluster 85180 refcount=2 reference=1 +ERROR cluster 266150 refcount=0 reference=2 +[...] +ERROR OFLAG_COPIED data cluster: l2_entry=422840000 refcount=1 + +9493 errors were found on the image. 
+Data may be corrupted, or further writes to the image may corrupt it. + +2 leaked clusters were found on the image. +This means waste of disk space, but no harm to data. +259266/327680 = 79.12% allocated, 1.67% fragmented, 0.00% compressed clusters +Image end offset: 18340446208 + +This is on an x86_64 Linux 5.3.1 Gentoo host with qemu-system-x86_64 and accel=kvm. The compiler is gcc-9.2.0 with the rest of the system similarly current. + +Reproduced with qemu-4.1.0 from distribution package as well as vanilla git checkout of tag v4.1.0 and commit 7f21573c822805a8e6be379d9bcf3ad9effef3dc (today's master). Does not happen with qemu compiled from vanilla checkout of tag v4.0.0. Build sequence: + +./configure --prefix=$HOME/bin/qemu-bisect --target-list=x86_64-softmmu --disable-werror --disable-docs +[...] +CFLAGS -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g +[...] (can provide full configure output if helpful) +make -j8 install + +The kind of guest OS does not matter: seen with Debian testing 64bit, Windows 7 x86/x64 BIOS and Windows 7 x64 EFI. + +The virtual storage controller does not seem to matter: seen with VirtIO SCSI, emulated SCSI and emulated SATA AHCI. + +Caching modes (none, directsync, writeback), aio mode (threads, native), discard (ignore, unmap) and detect-zeroes (off, unmap) do not influence occurrence either. + +Having more RAM in the guest seems to increase the odds of corruption: with 512MB given to the Debian guest the problem hardly occurs at all, with 4GB RAM it happens almost instantly. + +An automated reproducer works as follows: + +- the guest *does* mount its root fs and swap with option discard and my testing leaves me with the impression that file deletion rather than reading is causing the issue + +- foo is a snapshot of the running Debian VM which is already running the command + +# while true ; do dd if=/dev/zero of=foo bs=10240k count=400 ; done + +to produce some I/O to the disk (4GB file with 4GB of RAM). 
+ +- on the host a loop continuously resumes and saves the guest state and quits qemu in between: + +# while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | bin/qemu-bisect/bin/qemu-system-x86_64 -machine pc-q35-3.1,accel=kvm -m 4096 -chardev stdio,id=charmonitor -mon chardev=charmonitor -drive file=debian.qcow2,id=d -S -display none ; done + +- quitting qemu in between saves and loads seems to be necessary for the problem to occur. Just continuously saving and loading guest state in one session does not trigger it. + +- For me, after about 2 to 6 iterations of the above loop the image is corrupted. + +- corruption manifests with other messages from qemu as well, e.g.: + +(qemu) loadvm foo +Error: Device 'd' does not have the requested snapshot 'foo' + +Using the above reproducer I have, to the best of my ability, bisected the introduction of the problem to commit 69f47505ee66afaa513305de0c1895a224e52c45 (block: avoid recursive block_status call if possible). qemu compiled from the commit before does not exhibit the issue, from that commit on it does and reverting the commit off of current master makes it disappear. + +cc'd in kwolf since he signed off on that change. + +> I'm seeing massive corruption of qcow2 images with qemu 4.1.0 and git master +> as of 7f21573c822805a8e6be379d9bcf3ad9effef3dc after a few +> savevm/quit/loadvm cycles. +[...] +> bisected the introduction of the problem to commit +> 69f47505ee66afaa513305de0c1895a224e52c45 +> (block: avoid recursive block_status call if possible). + +In case it got lost in all the blurb: qemu 4.1.0 is essentially eating VMs by corrupting their images in very short order. Assuming no aggravating circumstances on my end I'd expect this to have the potential to hit a lot of users very hard once qemu 4.1.0 starts appearing in distros. + +Having downgraded to 4.0.0 works around the problem for me for now. + +Just let me know if there's anything I can do to assist. 
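For anyone trying to confirm the report, the host-side loop described above could be sketched roughly as follows. The qemu command line and monitor commands mirror the ones in the report; the function names, the 50-iteration limit and stopping on a non-zero `qemu-img check` exit status are assumptions added for illustration (`qemu-img check` exits with 0 only when it finds no inconsistencies, so leaks also stop the loop).

```python
import subprocess
import time


def monitor_phases(snapshot="foo"):
    """Monitor commands sent before and after the guest is left running."""
    before = ["loadvm " + snapshot, "c"]
    after = ["stop", "savevm " + snapshot, "quit"]
    return before, after


def qemu_argv(qemu, image):
    """Command line mirroring the one used in the report."""
    return [qemu, "-machine", "pc-q35-3.1,accel=kvm", "-m", "4096",
            "-chardev", "stdio,id=charmonitor",
            "-mon", "chardev=charmonitor",
            "-drive", "file=%s,id=d" % image,
            "-S", "-display", "none"]


def run_cycle(qemu, image, run_seconds=10):
    """One loadvm / run / savevm / quit cycle driven over stdin."""
    before, after = monitor_phases()
    proc = subprocess.Popen(qemu_argv(qemu, image),
                            stdin=subprocess.PIPE, text=True)
    proc.stdin.write("\n".join(before) + "\n")
    proc.stdin.flush()
    time.sleep(run_seconds)  # let the guest produce I/O
    proc.stdin.write("\n".join(after) + "\n")
    proc.stdin.close()
    proc.wait()


def image_is_clean(qemu_img, image):
    """True when `qemu-img check` exits with status 0."""
    result = subprocess.run([qemu_img, "check", image],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def iterations_until_corrupt(qemu, qemu_img, image, limit=50):
    """Return the iteration at which the check first fails, else None."""
    for n in range(1, limit + 1):
        run_cycle(qemu, image)
        if not image_is_clean(qemu_img, image):
            return n
    return None
```

Checking the image after every cycle, instead of waiting for qemu to print an error, gives a per-build iteration count that is easier to compare across bisect steps.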
+ +Hi Michael, + How sure are you that it's that commit - have you checked the commit before it? + + +Yes. As said: + +> qemu compiled from the commit before does not exhibit the issue, from that +> commit on it does and reverting the commit off of current master makes it +> disappear. + +In my tests the problem only occurs with that commit in the code. I used git bisect to narrow it down to that commit. Even just reverting it off of current master made it go away with my automated reproducer. + +If helpful I can retest manually with a real-world VM. OTOH it would certainly be helpful if someone else said they can or cannot reproduce the problem based on my description of the reproducer. + +I just quickly retested with today's master (commit 69b81893bc28feb678188fbcdce52eff1609bdad) and the automated reproducer. With the attached revert patch applied the loadvm/sleep 10/savevm/quit loop ran 50 times without problem. As soon as I removed the patch, recompiled and replaced the qemu binary with the unpatched, newly compiled one it took another two runs of the loop to produce this output: + +QEMU 4.1.50 monitor - type 'help' for more information +(qemu) loadvm foo +(qemu) c +(qemu) stop +(qemu) savevm foo +(qemu) quit +QEMU 4.1.50 monitor - type 'help' for more information +(qemu) loadvm foo +(qemu) c +(qemu) stop +(qemu) savevm foo +Error: Error while deleting snapshot on device 'd': Failed to free the cluster and L1 table: Invalid argument +(qemu) quit +QEMU 4.1.50 monitor - type 'help' for more information +(qemu) loadvm foo +Error: Device 'd' does not have the requested snapshot 'foo' +(qemu) c +(qemu) qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +qcow2_free_clusters failed: Invalid argument +^Cqemu-system-x86_64: terminating on signal 2 + +qemu-img check then reports: + +48857 errors 
were found on the image. +Data may be corrupted, or further writes to the image may corrupt it. + +115210 leaked clusters were found on the image. +This means waste of disk space, but no harm to data. +259259/327680 = 79.12% allocated, 2.51% fragmented, 0.00% compressed clusters +Image end offset: 17942052864 + +$ qemu-img check debian.qcow2 2>&1 | grep OFLAG_COPIED | wc -l +17592 +$ qemu-img check debian.qcow2 2>&1 | grep ERROR | wc -l +48857 +$ qemu-img check debian.qcow2 2>&1 | grep Leaked | wc -l +115210 + + +I haven't done any sort of "narrowing down", but recent QEMUs (built from the master branch, post-v4.1) have corrupted at least two VM disk images (qcow2) for me as well. I had to reinstall both VMs. + +I didn't make any noise because I was sure that, if I wasn't seeing ghosts, then others must have encountered the symptom earlier than I did, and file bug reports with more details than I had time for. + +Perhaps relevant: my use case lacks savevm/loadvm. I only boot and shutdown VMs. + +My symptoms have been: +- qemu refusing to start, due to the qcow2 image being corrupt +- qemu-img reporting the image as corrupt +- applications in guests that checksum data reporting problems (such as RPM complaining about RPMDB corruption) + +I think the affected qcow2 images may have had compressed clusters. (I no longer have the images.) + +I can confirm exactly the same issue on Arch linux running qemu-4.1.0. + +After downgrading from 4.1.0 => 4.0.0 everything is running normal again, no corruption detected and all qcow2 images stays healthy. + + +After reading the message of commit 69f47505ee66 ("block: avoid +recursive block_status call if possible", 2019-06-04), I'm none the +wiser. But, I can at least confirm that all my qcow2 images are +pre-allocated, as a norm. 
I create them with the following command: + +qemu-img create \ + -f qcow2 \ + -o compat=1.1 \ + -o cluster_size=65536 \ + -o preallocation=metadata \ + -o lazy_refcounts=on \ + $FILENAME \ + 100G + +Perhaps this helps reproducing the issue. The commit message says, +"However, lseek is needed when we have metadata-preallocated image", so +that might be a special case that I've hit with some frequency. + +I do have a vague suspicion that the following idea: + + The idea is to compare allocation size in POV of filesystem with + allocations size in POV of Qcow2 (by refcounts). If allocation in fs is + significantly lower, consider it as metadata-preallocation case. + +is not robust enough. From the description, the "metadata-preallocation +case" appears to be determined with *heuristics*, but then again, "lseek +is needed when we have metadata-preallocated image". So if there is a +clear requirement to behave differently / particularly for +metadata-preallocated images, why is it safe to (basically) *guess* +whether a given image had its metadata pre-allocated? + ++ threshold = MAX(real_clusters * 10 / 9, real_clusters + 2); + +Where do those constants come from? + +... Not sure if it matters: the host filesystem holding my qcow2 images +is "ext4". Filesystem features (dumped with the fs being mounted r/w at +the moment): has_journal, ext_attr resize_inode, dir_index, filetype, +needs_recovery, extent, flex_bg, sparse_super, large_file, huge_file, +uninit_bg, dir_nlink, extra_isize. Filesystem flags: +signed_directory_hash. + +Thanks. + + +(See also / possible duplicate: <https://bugs.launchpad.net/qemu/+bug/1847793>.) + +My qcow2 images also reside on an ext4 with features "has_journal ext_attr dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file dir_nlink extra_isize metadata_csum" on a luks-encrypt(ed|ing) device mapper device backed by a partition on an NVMe SSD. 
The setup is rock solid and I had no other indications of it causing corruption or being corrupted. + +I did a quick test with a 32GB USB3 flash drive formatted as a super floppy (without partitions nor encryption) as XFS and saw the same corruption though less heavily so, likely because the drive is much slower (~ 60MB/s write instead of ~600MB/s write for the NVMe SSD). + +The savevm/loadvm cycle was basically the first reliable and fast reproducer I was able to find. I have a dim recollection that some of my corruptions also did not involve any loadvm/savevm but were much rarer and not as easily reproducible. + + +Not sure if i have exactly the same problem, as my qcow2 corruption seems to be limited to windows10 guests - win2019 and debian10 guests with the same virtio-scsi setup are fine (as are various virtio-blk or ide/sata images from linux/solaris/macos guests). + +I find that i randomly have disk image corruption from little more than boot/shutdown cycles - no heavy usage or anything is required. "qemu-img check -r all" usually makes things worse, as does chkdsk. + +host filesystem is an ssd with ext4 on top of luks, discard not used (fstrim.timer instead) with features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum + +Reported to redhat as assumed it was a virtio-win bug: https://bugzilla.redhat.com/show_bug.cgi?id=1762944 - includes virt-install method to reproduce my test vm's (i don't use qemu directly). + +Host is debian sid running qemu version 4.1.0 (Debian 1:4.1-1+b3), libvirt 5.6.0-2, kernel 5.2.0-3 (5.2.17-1) + +Can't seem to reproduce if I convert the qcow2 image to raw+sparse. + +After reading some related code, I have more questions than before, but let's see... As more qcow2 code was merged since, I would suggest that we debug the problem on commit 69f4750 (the bisection result) rather than on anything newer. 
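As a reading aid for the heuristic questioned earlier in this thread, here is a small sketch of the quoted threshold line, `threshold = MAX(real_clusters * 10 / 9, real_clusters + 2)`. Only that formula comes from the commit; the comparison in `looks_metadata_preallocated` is an assumption derived from the commit message ("if allocation in fs is significantly lower, consider it as metadata-preallocation case"), and the cluster counts in the usage lines are hypothetical.

```python
def preallocation_threshold(real_clusters):
    """The quoted formula, using integer division as the C code would."""
    return max(real_clusters * 10 // 9, real_clusters + 2)


def looks_metadata_preallocated(refcounted_clusters, real_clusters):
    """Assumed comparison: qcow2 refcounts claim clearly more clusters
    than the host filesystem has actually allocated for the file."""
    return refcounted_clusters > preallocation_threshold(real_clusters)


# Hypothetical image with 327680 refcounted clusters.
sparse = looks_metadata_preallocated(327680, 100)        # few clusters on disk
nearly_full = looks_metadata_preallocated(327680, 300000)  # mostly backed
print(sparse, nearly_full)
```

Under these assumptions a sparsely populated preallocated image is classified as preallocated, while one whose filesystem allocation is within about 10% of the refcounted size is not, which matches the remark that an almost fully allocated image is no longer detected.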
+ + +First of all: Michael, you didn't specify explicitly how your images were created, but can I assume that the test image is not preallocated (in contrast to Laszlo's)? + +I find Laszlo's case with a preallocated image particularly surprising because the behaviour isn't supposed to have changed at all for preallocated images, at least if the heuristics still detects them as such. Once a preallocated image becomes almost fully allocated, it's expected that we won't detect it any more. So, Laszlo, do you know how much of your images was allocated? 'qemu-img check' prints the allocation statistics. + + +The next mystery is why bdrv_co_block_status() is even called. I found only a single call that happens with normal guest I/O and savevm/loadvm, and that's the one in handle_alloc_space(). This function is suspicious because it's relatively new, but commit 69f4750 shouldn't have any effect on it because BDRV_BLOCK_ALLOCATED is set independently of BDRV_BLOCK_RECURSE - and even if the change had an effect, it would be that the function is used less, so if anything, a bug could be expected to be hidden rather than become visible. + +I think it might be worth a try reproducing with the handle_alloc_space() call commented out. If that doesn't fix/hide the bug, it would be interesting to see what else calls qcow2_detect_metadata_preallocation(), e.g. by setting a breakpoint there in gdb and getting the stack backtrace when it triggers. + + +Another caller I see in the code, but didn't get run in my guest, is qcow2_co_pwrite_zeroes(). This is not discard, but maybe the discard mount option does cause a write_zeroes call (WRITE SAME in SCSI) sometimes? But then, your reproducer seems to use AHCI and I can't see a write_zeroes call in the AHCI or IDE device emulation. + +The possible (intended) effect of commit 69f4750 is that a block that was previously detected as containing only zeros (BDRV_BLOCK_ZERO) doesn't get this flag any more. 
This could cause unaligned qcow2_co_pwrite_zeroes() to fail, but then we'd just get a fallback to a normal write, which wouldn't explain any metadata-level corruption. + + +Michael, would you like to give it a try and figure out in which code path qcow2_detect_metadata_preallocation() is even called in your reproducer and if handle_alloc_space() is linked to this bug somehow? + +In reply to Kevin's comment#13: + +> I find Laszlo's case with a preallocated image particularly surprising +> because the behaviour isn't supposed to have changed at all for +> preallocated images, at least if the heuristics still detects them as +> such. + +But isn't that "if" at the core of this problem? What happens if the +detection misfires? (This is not a loaded question, I'm not implying any +particular circumstances; I'm just surprised that heuristics could be +considered at all.) + +> Once a preallocated image becomes almost fully allocated, it's +> expected that we won't detect it any more. So, Laszlo, do you know how +> much of your images was allocated? 'qemu-img check' prints the +> allocation statistics. + +I don't have the images any longer, and since then, I've been running +qemu 4.0 (for my upstream QEMU binaries). + +However, I can say some things (with both affected VMs being Fedora +installations): + +- As noted earlier, the images were formatted for 100GB, with + preallocation=metadata. + +- I always install Fedora from Live ISOs (never starting with + pre-installed images), and right after installation, "du" on the host + side always reports 5-8 GB usage. Definitely never more than 10GB. So + I'd say these images were very sparsely populated. + +- I always use qcow2 images like this, in the domain XMLs: + + <driver name='qemu' type='qcow2' cache='writeback' + error_policy='enospace' discard='unmap'/> + + and I always use virtio-scsi so that discard='unmap' actually have an + effect. 
+ +- I occasionally run "fstrim" in the guest, and / or "virsh domfstrim" + on the host. (And re-run "du" on the host side in every such case.) + +- Right after installation (with the VM powered down), I might compress + the image with "qemu-img convert -c"; but I don't believe I've done + that too recently. + +- The general idea on my end is that I'd like to limit guest disk usage + by the *host* disk's free space, and not by an arbitrary pre-set disk + image size. Hence 100GB stands for "infinity" (I might have used 1TB + just as well), and error_policy='enospace' lets me act, should a guest + actually run out of space, on the host disk. Finally, discard='unmap' + prevents waste. I use "preallocation=metadata" because the initial + size cost is negligible, but I perceive writes to be faster. + +Hopefully this helps at least a tiny bit... Thanks! + +> After reading some related code, I have more questions than before, but +> let's see... As more qcow2 code was merged since, I would suggest that +> we debug the problem on commit 69f4750 (the bisection result) rather +> than on anything newer. + +Okay, for all of the following I did a fresh compile of qemu 69f4750 and +ran all commands in this version. + +> First of all: Michael, you didn't specify explicitly how your images +> were created, but can I assume that the test image is not preallocated +> (in contrast to Laszlo's)? + +Actually these were converted from vmdk files using qemu-img and +previously VMware Fusion VMs. 
To avoid any suspicion as to what that +may have brought with it in breakage I just created a fresh image using +this command: + +$ bin/qemu-bisect/bin/qemu-img create -f qcow2 qtest.qcow2 20G +Formatting 'qtest.qcow2', fmt=qcow2 size=21474836480 cluster_size=65536 lazy_refcounts=off refcount_bits=16 +$ bin/qemu-bisect/bin/qemu-img info qtest.qcow2 +image: qtest.qcow2 +file format: qcow2 +virtual size: 20 GiB (21474836480 bytes) +disk size: 196 KiB +cluster_size: 65536 +Format specific information: + compat: 1.1 + lazy refcounts: false + refcount bits: 16 + corrupt: false +$ ls -la qtest.qcow2 +-rw-r--r-- 1 m m 196928 Oct 21 22:43 qtest.qcow2 +$ du -sk qtest.qcow2 +196 qtest.qcow2 + +So I guess that means its not preallocated. + +Then I installed a minimal Debian buster into it by just entering +default values: + +$ bin/qemu-bisect/bin/qemu-system-x86_64 -machine pc-q35-3.1,accel=kvm -m 4096 -chardev stdio,id=charmonitor -mon chardev=charmonitor -drive file=qtest.qcow2,id=d -cdrom Downloads/mini.iso + +After that the image reported: + +$ bin/qemu-bisect/bin/qemu-img check qtest.qcow2 +No errors were found on the image. +26443/327680 = 8.07% allocated, 17.10% fragmented, 0.00% compressed clusters +Image end offset: 1734148096 + +Then I prepared it for the automatic reproducer by running the following +command in it and saving that running state as snapshot foo using savevm: + +$ while true ; do dd if=/dev/zero of=t bs=1024k count=4000 ; done + +Then I ran the reproducer using this command: + +$ while true ; do (echo loadvm foo ; echo c ; sleep 10 ; echo stop ; echo savevm foo ; echo quit ) | bin/qemu-bisect/bin/qemu-system-x86_64 -machine pc-q35-3.1,accel=kvm -m 4096 -chardev stdio,id=charmonitor -mon chardev=charmonitor -drive file=qtest.qcow2,id=d -display none -S ; done + +It took nine iterations for the image to corrupt. 
After that, qemu-img check +reports: + +$ bin/qemu-bisect/bin/qemu-img check qtest.qcow2 2>&1 | sed -e s,Leaked.*,Leaked, | uniq +Leaked +ERROR cluster 163840 refcount=3 reference=4 +ERROR cluster 163841 refcount=3 reference=4 +ERROR cluster 163848 refcount=1 reference=2 +ERROR cluster 163850 refcount=1 reference=2 +ERROR cluster 163921 refcount=1 reference=2 +ERROR cluster 163957 refcount=3 reference=4 +ERROR cluster 163958 refcount=3 reference=4 +Leaked +ERROR cluster 163962 refcount=1 reference=2 +Leaked +ERROR cluster 163968 refcount=1 reference=2 +Leaked +ERROR cluster 163974 refcount=1 reference=2 +Leaked + +10 errors were found on the image. +Data may be corrupted, or further writes to the image may corrupt it. + +129130 leaked clusters were found on the image. +This means waste of disk space, but no harm to data. +253326/327680 = 77.31% allocated, 1.77% fragmented, 0.00% compressed clusters +Image end offset: 18906611712 + +> Another caller I see in the code, but didn't get run in my guest, is +> qcow2_co_pwrite_zeroes(). This is not discard, but maybe the discard +> mount option does cause a write_zeroes call (WRITE SAME in SCSI) +> sometimes? But then, your reproducer seems to use AHCI and I can't see +> a write_zeroes call in the AHCI or IDE device emulation. + +In the above test I had not knowingly configured any discard in the guest. +Neither /etc/fstab nor /proc/mounts contained the discard option. The +image also did not shrink when deleting files. Nor did it shrink when +explicitly calling fstrim / for that matter - presumably because +unmap on discard is disabled by default. + +So I'd postulate that discard does at most play an aggravating role here +but is not necessary for the problem to occur. + +> I think it might be worth a try reproducing with the +> handle_alloc_space() call commented out. If that doesn't fix/hide the +> bug, + +I commented out the call to handle_alloc_space() in +block/qcow2.c:qcow2_co_pwritev() and that certainly hid the bug. 
The +reproducer ran for a quarter of an hour without any corruption. The image +was fine after that: + +$ bin/qemu-bisect/bin/qemu-img check qtest.qcow2 +No errors were found on the image. +253376/327680 = 77.32% allocated, 2.00% fragmented, 0.00% compressed clusters +Image end offset: 16909860864 + +After commenting handle_alloc_space() back in, recompiling, reinstalling and +rerunning the reproducer, it took a single iteration to violently corrupt +the image. + +So I guess it's safe to say that the bug occurs in the +handle_alloc_space() codepath. + +This quick corruption made me think that maybe the level of +allocation has something to do with it. So I filled up all disk space +in the guest by writing zeroes to a file using dd. This yielded an +allocation level above 80%: + +No errors were found on the image. +266793/327680 = 81.42% allocated, 3.92% fragmented, 0.00% compressed clusters +Image end offset: 21343764480 + +Running the reproducer again, the image took five iterations to corrupt. +I'd call that inconclusive. 
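Editorial aside: the "ERROR cluster N refcount=X reference=Y" lines quoted in this comment are easier to compare across runs when summarized. The following is a minimal sketch, assuming only the line format shown in the pasted output (with "Leaked" lines already collapsed by the sed command used above); the function name summarize_check is made up for illustration and is not part of qemu-img.

```python
import re
from collections import Counter

# Matches lines such as: "ERROR cluster 163840 refcount=3 reference=4"
ERROR_RE = re.compile(r"ERROR cluster (\d+) refcount=(\d+) reference=(\d+)")


def summarize_check(output):
    """Return (mismatches, leaked_lines) parsed from pasted `qemu-img check` text.

    mismatches is a list of (cluster, refcount, reference) tuples;
    leaked_lines counts lines that start with "Leaked".
    """
    mismatches = []
    leaked_lines = 0
    for line in output.splitlines():
        line = line.strip()
        m = ERROR_RE.match(line)
        if m:
            cluster, refcount, reference = map(int, m.groups())
            mismatches.append((cluster, refcount, reference))
        elif line.startswith("Leaked"):
            leaked_lines += 1
    return mismatches, leaked_lines


# A few lines copied from the output in this comment.
sample = """\
Leaked
ERROR cluster 163840 refcount=3 reference=4
ERROR cluster 163848 refcount=1 reference=2
Leaked
"""

mismatches, leaked_lines = summarize_check(sample)
# How far each stored refcount is from the number of references actually found.
deltas = Counter(reference - refcount for _, refcount, reference in mismatches)
print(len(mismatches), "mismatched clusters; deltas:", dict(deltas))
```

In the output quoted above, every mismatch has a reference count exactly one higher than the stored refcount, which is the kind of pattern such a summary makes visible at a glance.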
+ +> Michael, would you like to give it a try and figure out in which code path +> qcow2_detect_metadata_preallocation() is even called in your reproducer + +After letting the VM run for about ten seconds with gdb attached a breakpoint on qcow2_detect_metadata_preallocation triggers and I get this backtrace: + +(gdb) bt +#0 0x0000555555d22bfd in qcow2_detect_metadata_preallocation (bs=0x5555567c69e0) at block/qcow2-refcount.c:3449 +#1 0x0000555555d124b8 in qcow2_co_block_status + (bs=0x5555567c69e0, want_zero=false, offset=2148532224, count=4096, pnum=0x7ffee0ae2b28, map=0x7ffee0ae28a0, file=0x7ffee0ae28a8) at block/qcow2.c:1899 +#2 0x0000555555d6124a in bdrv_co_block_status (bs=0x5555567c69e0, want_zero=false, offset=2148532224, bytes=4096, pnum=0x7ffee0ae2b28, map=0x0, file=0x0) + at block/io.c:2081 +#3 0x0000555555d6166d in bdrv_co_block_status_above + (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=2148532224, bytes=4096, pnum=0x7ffee0ae2b28, map=0x0, file=0x0) at block/io.c:2190 +#4 0x0000555555d61753 in bdrv_block_status_above_co_entry (opaque=0x7ffee0ae2a10) at block/io.c:2220 +#5 0x0000555555d6187e in bdrv_common_block_status_above + (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=2148532224, bytes=4096, pnum=0x7ffee0ae2b28, map=0x0, file=0x0) at block/io.c:2255 +#6 0x0000555555d61ae9 in bdrv_is_allocated (bs=0x5555567c69e0, offset=2148532224, bytes=4096, pnum=0x7ffee0ae2b28) at block/io.c:2285 +#7 0x0000555555d61b7b in bdrv_is_allocated_above (top=0x5555567c69e0, base=0x0, offset=2148532224, bytes=4096, pnum=0x7ffee0ae2b80) at block/io.c:2323 +#8 0x0000555555d12d48 in is_unallocated (bs=0x5555567c69e0, offset=2148532224, bytes=4096) at block/qcow2.c:2151 +#9 0x0000555555d12dbc in is_zero_cow (bs=0x5555567c69e0, m=0x555556ed0520) at block/qcow2.c:2162 +#10 0x0000555555d12e9c in handle_alloc_space (bs=0x5555567c69e0, l2meta=0x555556ed0520) at block/qcow2.c:2188 +#11 0x0000555555d13310 in qcow2_co_pwritev (bs=0x5555567c69e0, offset=2148536320, 
bytes=61440, qiov=0x7fffe83507a0, flags=0) at block/qcow2.c:2301 +#12 0x0000555555d5e6c4 in bdrv_driver_pwritev (bs=0x5555567c69e0, offset=2148536320, bytes=61440, qiov=0x7fffe83507a0, flags=0) at block/io.c:1043 +#13 0x0000555555d6013a in bdrv_aligned_pwritev + (child=0x55555675cf80, req=0x7ffee0ae2e50, offset=2148536320, bytes=61440, align=1, qiov=0x7fffe83507a0, flags=0) at block/io.c:1670 +#14 0x0000555555d60d66 in bdrv_co_pwritev (child=0x55555675cf80, offset=2148536320, bytes=61440, qiov=0x7fffe83507a0, flags=0) at block/io.c:1897 +#15 0x0000555555d47cb6 in blk_co_pwritev (blk=0x5555567c6730, offset=2148536320, bytes=61440, qiov=0x7fffe83507a0, flags=0) at block/block-backend.c:1183 +#16 0x0000555555d48499 in blk_aio_write_entry (opaque=0x7fffe8350820) at block/block-backend.c:1382 +#17 0x0000555555e3ff80 in coroutine_trampoline (i0=-402575600, i1=32767) at util/coroutine-ucontext.c:116 +#18 0x00007ffff5fc61a0 in () at /lib64/libc.so.6 +#19 0x00007ffff17c5920 in () +#20 0x0000000000000000 in () + +A savevm command does not trigger the breakpoint. + +Hope this helps in narrowing down the culprit. +-- +Michael + +> But isn't that "if" at the core of this problem? What happens if the +> detection misfires? + +The information that a block driver must give is just whether the given block is allocated by the image or whether it is taken from the backing file. Almost everything else is just a hint that can be given if the driver can be more specific, but that can be omitted. + +In the specific case, what commit 69f4750 intends to do is avoid too much effort to determine whether a block is fully zeroed on the filesystem level because the qcow2 metadata should already accurately answer the question. It still keeps the additional checks for metadata preallocation because in this case, the qcow2 metadata says that the whole image is allocated while it's created sparse on the filesystem level, so the check can actually be useful in practice. 
+ +If the detection fails (and the code is implemented correctly), we have two cases: + +1. Preallocated image detected as non-preallocated: It could happen that a fully zeroed block wouldn't be reported as "fully zeroed", but as "allocated (unknown content)". This could prevent some optimisations, but it's still a correct description of the block. + +2. Non-preallocated image detected as preallocated: We waste some cycles on finding out that the filesystem doesn't know more than the qcow2 layer. + +> Hopefully this helps at least a tiny bit... Thanks! + +Yes, that helps. With an image that is mostly sparse, preallocation detection should work perfectly. It works by comparing the number of allocated qcow2 clusters (the full 100 GB in your case) to the file size (around 10 GB). In other words, your case is one where the behaviour isn't supposed to have changed at all. + +I had a thought earlier that maybe the problem isn't with the value returned by bdrv_co_block_status(), but with the fact that bdrv_co_block_status(), and with it preallocation detection, is even running in some code paths. Your cases might support that idea. + +> To avoid any suspicion as to what that may have brought with it in +> breakage I just created a fresh image using this command: [...] + +I tried to reproduce the problem locally, on the same commit, with the steps you described, but I wasn't lucky. I tried keeping the image on my home directory (XFS), on tmpfs, and finally on a newly created ext4 filesystem on a spare LVM volume, but the image just wouldn't break even after letting the loop run for quite a while. + +> So I'd postulate that discard does at most play an aggravating role here +> but is not necessary for the problem to occur. + +That makes sense to me because you have internal snapshots. Both discard and snapshots mean that the next write to the block will trigger a cluster allocation (and with it a handle_alloc_space() call) again. 
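Editorial aside: the comparison described in this comment (allocated qcow2 clusters versus the size the file really occupies on the host) can be sketched as follows. This is a simplified illustration under stated assumptions, not QEMU's code: the function name and the plain "greater than" comparison are hypothetical, and the real qcow2_detect_metadata_preallocation() may apply a threshold. The second call uses the "Image end offset" from Michael's freshly installed image as a stand-in for the host allocation, which is also an assumption.

```python
def looks_metadata_preallocated(clusters_in_use, cluster_size, host_bytes_allocated):
    """Toy heuristic: the image looks metadata-preallocated when the qcow2
    metadata claims more allocated bytes than the host file really occupies."""
    qcow2_bytes = clusters_in_use * cluster_size
    return qcow2_bytes > host_bytes_allocated


CLUSTER_SIZE = 65536  # cluster_size reported by qemu-img info in this thread
GiB = 2**30

# Laszlo's case: metadata says the whole 100 GB is allocated, host file is ~10 GB.
print(looks_metadata_preallocated(100 * GiB // CLUSTER_SIZE, CLUSTER_SIZE, 10 * GiB))

# Michael's fresh image: 26443 clusters in use, image end offset 1734148096.
print(looks_metadata_preallocated(26443, CLUSTER_SIZE, 1734148096))
```

Under these assumptions the first image is classified as preallocated and the second is not, matching the expectation stated in the comment that neither case should have changed behaviour.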
+ +> So I guess it's safe to say that the bug occurs in the +> handle_alloc_space() codepath. + +This is an important finding. + +It's a bit odd because the only related thing handle_alloc_space() calls is bdrv_is_allocated_above(), which only cares about BDRV_BLOCK_ALLOCATED. I don't think the commit in question should make any difference as to whether this flag is set or cleared. The only possible difference should be BDRV_BLOCK_ZERO, and we don't even check that flag. + +So as the next step I would like to test my theory that the problem isn't bdrv_co_block_status() returning a different value after the commit, but that qcow2_detect_metadata_preallocation() even runs. I think the easiest way to do this would be modifying handle_alloc_space() so that it performs the checks, but skips its optimisation regardless of the is_zero_cow() return value: + + if (!is_zero_cow(bs, m) || true) { + continue; + } + +Unfortunately, as long as I can't reproduce the problem, I'll have to rely on you to actually run the tests I come up with after each step. If you'd prefer some more real-time interaction, feel free to ping me on IRC (kwolf on OFTC or Freenode). + +> I tried to reproduce the problem locally, on the same commit, with the +> steps you described, but I wasn't lucky. I tried keeping the image on my +> home directory (XFS), on tmpfs, and finally on a newly created ext4 +> filesystem on a spare LVM volume, but the image just wouldn't break even +> after letting the loop run for a quite a while. + +That's certainly an important data point. Is it possible that we're talking about some kind of miscompilation here, maybe because gcc-9.2.0 is just that tiny bit too spanking current? + +> So as the next step I would like to test my theory that the problem +> isn't bdrv_co_block_status() returning a different value after the +> commit, but that qcow2_detect_metadata_preallocation() even runs. 
I +> think the easiest way to do this would be modifying handle_alloc_space() +> so that it performs the checks, but skips its optimisation regardless of +> the is_zero_cow() return value: + +> if (!is_zero_cow(bs, m) || true) { +> continue; +> } + +I made the change and the problem went away. + +Then, extrapolating the gist of your methodology :), I went ahead and disabled only bdrv_co_pwrite_zeroes() by placing a continue in front of it but let qcow2_pre_write_overlap_check() execute, and the problem reappeared. I certainly did not expect that to happen because the function name ends in _check(), suggesting read-only access. And it's not even touched by the commit. + +This had me so rattled that I revalidated that the problem does indeed not occur with the commit before. And it does not. I left it running for about half an hour without problems. + +After some more tests I finally figured out that even with -g and no -O gcc is smart enough to optimize out (!is_zero_cow() || true) and that corruption only happens if is_zero_cow() is actually called. Corruption also does not occur if I make is_zero_cow() or is_unallocated() return 0 always. + +So my first guess was that is_unallocated() sometimes returns false positives, making is_zero_cow() report false positives that are not caught by qcow2_pre_write_overlap_check() and cause bdrv_co_pwrite_zeroes() to zero out actual data. That seemed a bit convoluted to me. 
+ +But then I realized that corruption still occurs if the rest of handle_alloc_space() is disabled like so: + +--- a/block/qcow2.c ++++ b/block/qcow2.c +@@ -2185,9 +2185,8 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta) + continue; + } + +- if (!is_zero_cow(bs, m)) { +- continue; +- } ++ is_zero_cow(bs, m); ++ continue; + + /* + * instead of writing zero COW buffers, + +So it's much more likely that is_zero_cow() has a side-effect that somehow causes corruption later on even without handle_alloc_space() ever calling bdrv_co_pwrite_zeroes(). That would also explain why qcow2_pre_write_overlap_check() does not catch those false positives overwriting metadata because there simply are none. + +Putting a breakpoint on handle_alloc_space() and single stepping into is_zero_cow() I do indeed end up in bdrv_co_block_status(): + +gdb) bt +#0 0x0000555555d610fd in bdrv_co_block_status (bs=0x5555567c69e0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0) + at block/io.c:2048 +#1 0x0000555555d6167e in bdrv_co_block_status_above + (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0) at block/io.c:2190 +#2 0x0000555555d61764 in bdrv_block_status_above_co_entry (opaque=0x7ffedffd7a10) at block/io.c:2220 +#3 0x0000555555d6188f in bdrv_common_block_status_above + (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0) at block/io.c:2255 +#4 0x0000555555d61afa in bdrv_is_allocated (bs=0x5555567c69e0, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28) at block/io.c:2285 +#5 0x0000555555d61b8c in bdrv_is_allocated_above (top=0x5555567c69e0, base=0x0, offset=5242880, bytes=12288, pnum=0x7ffedffd7b80) at block/io.c:2323 +#6 0x0000555555d12d48 in is_unallocated (bs=0x5555567c69e0, offset=5242880, bytes=12288) at block/qcow2.c:2151 +#7 0x0000555555d12dbc in is_zero_cow (bs=0x5555567c69e0, m=0x5555569d35b0) 
at block/qcow2.c:2162 +#8 0x0000555555d12e9c in handle_alloc_space (bs=0x5555567c69e0, l2meta=0x5555569d35b0) at block/qcow2.c:2188 +#9 0x0000555555d13321 in qcow2_co_pwritev (bs=0x5555567c69e0, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/qcow2.c:2302 +#10 0x0000555555d5e6d5 in bdrv_driver_pwritev (bs=0x5555567c69e0, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/io.c:1043 +#11 0x0000555555d6014b in bdrv_aligned_pwritev (child=0x55555675cf80, req=0x7ffedffd7e50, offset=5255168, bytes=4096, align=1, qiov=0x7fffe82ec310, flags=0) + at block/io.c:1670 +#12 0x0000555555d60d77 in bdrv_co_pwritev (child=0x55555675cf80, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/io.c:1897 +#13 0x0000555555d47cc7 in blk_co_pwritev (blk=0x5555567c6730, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/block-backend.c:1183 +#14 0x0000555555d484aa in blk_aio_write_entry (opaque=0x7fffe823f920) at block/block-backend.c:1382 +#15 0x0000555555e3ff91 in coroutine_trampoline (i0=-399759776, i1=32767) at util/coroutine-ucontext.c:116 +#16 0x00007ffff5fc61a0 in () at /lib64/libc.so.6 +#17 0x00007ffff17c5920 in () +#18 0x0000000000000000 in () + +At that point it had gotten too late to even attempt to wrap my brain around the whole BDRV_BLOCK_RECURSE logic. But I think the above gives a strong(er|ish) connection between the change and the corruption and how handle_alloc_space() ties into it. Let me know what else I could check to help track this down. + +Please ignore the stuff about (!is_zero_cow(bs, m) || true) being optimized out. Of course it isn't. And corruption still occurs with that way of calling only is_zero_cow(). Dunno what I did there. It seems to be even later than I thought. The rest of my testing holds true though. 
+ +In reply to <https://bugs.launchpad.net/qemu/+bug/1846427/comments/18>: + +> Is it possible that we're talking about some kind of miscompilation +> here, maybe because gcc-9.2.0 is just that tiny bit too spanking +> current? + +I'm riding the trailing edge here (gcc-4.8 in RHEL7) :) + +[...] + +> So it's much more likely that is_zero_cow() has a side-effect that somehow +> causes corruption later on even without handle_alloc_space() ever calling +> bdrv_co_pwrite_zeroes(). + +Yes, looks like it. I think we have ruled out that a changing return value is the cause of the problems because the return code was completely ignored and it still broke for you. + +Basically the only other thing I see that our commit has changed is that qcow2_detect_metadata_preallocation() runs now. I assume that if you replace its call in qcow2_co_block_status() with a fixed ret = true or ret = false, the problem will still disappear. + +Now what is problematic inside qcow2_detect_metadata_preallocation()? At the moment I see two options: + +1. qcow2_get_refcount() is the only thing that does something with the qcow2 internals, the other calls are about bs->file->bs (the raw image file), which is pretty certainly harmless. The interesting thing about the qcow2_get_refcount() call is that other code paths call it with s->lock locked, but this one is unlocked. I wonder if moving qemu_co_mutex_lock(&s->lock); in qcow2_co_block_status() to above the qcow2_detect_metadata_preallocation() call would change anything. + +2. Or the problem isn't even related to what qcow2_detect_metadata_preallocation() does, but it's a race elsewhere that is just uncovered because of the timing - preallocation detection must be pretty slow because it checks the refcount of every single cluster in the image. In that case, replacing it with something like qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100000000); should have the same effect and cause corruption, too. 
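Editorial aside: option 1 above (a lookup running without the lock that other code paths hold) can be illustrated with a toy model. The sketch below is not QEMU's cache or refcount code; OneSlotCache, reader, writer and both schedulers are hypothetical names, and Python generators stand in for coroutines. It only shows how a nominally read-only lookup can damage shared cached metadata when it interleaves with a writer at an I/O wait point.

```python
class OneSlotCache:
    """A single-slot cache of tables, standing in for a tiny metadata cache."""

    def __init__(self, disk):
        self.disk = disk      # dict: table offset -> list of refcounts
        self.offset = None    # which table the slot claims to hold
        self.data = None

    def get(self, offset):
        """Generator coroutine: yields once where real code would wait for I/O."""
        if self.offset != offset:
            self.offset = offset  # claim the slot first ...
            yield                 # ... then "wait for the read"; others may run here
            self.data = list(self.disk[offset])
        return self.data

    def flush(self):
        # Write the slot back to wherever it currently claims to belong.
        self.disk[self.offset] = list(self.data)


def reader(cache, offset):
    # Read-only lookup, like checking a refcount.
    yield from cache.get(offset)


def writer(cache, offset):
    table = yield from cache.get(offset)
    table[0] += 1
    cache.flush()


def run_interleaved(coros):
    """No lock: round-robin, so every yield is a switch point."""
    pending = list(coros)
    while pending:
        for c in list(pending):
            try:
                next(c)
            except StopIteration:
                pending.remove(c)


def run_locked(coros):
    """Holding one lock around each call amounts to running them one by one."""
    for c in coros:
        for _ in c:
            pass


def demo(scheduler):
    disk = {0: [1, 1], 1: [5, 5]}
    cache = OneSlotCache(disk)
    scheduler([writer(cache, 1), reader(cache, 0)])
    return disk


print("unlocked:", demo(run_interleaved))
print("locked:  ", demo(run_locked))
```

In the unlocked run the reader re-labels the slot while the writer is waiting, so the writer's update is flushed over table 0 and never reaches table 1. In the locked run table 1 is updated and table 0 stays intact. Whether the real bug follows this exact pattern is not established by this sketch; it only shows why "read-only" does not imply "harmless" once a cache is shared.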
+ +I finally got an image with which I can reproduce the problem. I think I may have had the wrong image size before because both tmpfs and my spare LVM volume are rather limited in size. + +Anyway, so far locking around qcow2_get_refcount() seems to do the trick. I'll try to investigate the details a bit more, but this is something that would actually feel reasonable as a fix. + +> I think I may have had the wrong image size before because both tmpfs and +> my spare LVM volume are rather limited in size. + +I also had a hard time getting my image to corrupt on tmpfs because it could not grow to its final size, it seems. Sometimes qemu ran into actual ENOSPC but most of the time lack of space on tmpfs seemed to trigger early cleanup of unused blocks and in turn prevent corruption. Only after I increased tmpfs size again and again until my machine actually started to swap would I get the spurious corruption. Both facts would seem to support your suspicion of a race condition because qcow2_detect_metadata_preallocation() would run longer the more of the image is/was allocated. + +For completeness's sake: All the changes you proposed (replacing the call to qcow2_detect_metadata_preallocation() with ret = true and ret = false, acquiring s->lock before the call and replacing the call with a sleep) prevent corruption on my system. The latter would suggest that it's not so much a race being exposed by a timing change as a race directly when accessing qcow2 internals without the lock being held. + +My understanding is that Kevin has fixed this bug in (as yet unreleased) commit 5e9785505210 ("qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()", 2019-10-25). + +The patch had been posted as a part of the following sets: + +[PATCH 0/3] qcow2: Fix image corruption bug in 4.1 +http://<email address hidden> + +[PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 +http://<email address hidden> + +Updating the ticket status accordingly. 
+ + +I tried the ArchLinux package that includes three patches applied to qemu 4.1 (see https://git.archlinux.org/svntogit/packages.git/commit/trunk/PKGBUILD?h=packages/qemu&id=e9707066408de26aa04f8d0ddebe5556aa87e662). My Windows 10 qcow2 image got corrupted again after a short time of use. Host filesystem is ext4. + +"OFLAG_COPIED data cluster: l2_entry=382c70000 refcount=1" + +The Windows installation seems to be fine after repair. At least Windows did not find anything wrong. + +Is this a fresh image or is it possible that it already had some latent corruption from a previous run with an unfixed version? If it wasn't fresh, did you run qemu-img check after upgrading QEMU and was it still clean, so we know the corruption was introduced by the new version? + +Is the problem easily reproducible or do you hit it only randomly so far? If it is reproducible, can you reproduce it on current qemu.git master or is it only with the Arch package? + +I have been dragging my feet exposing my production VMs to a patched 4.1.0 TBH. I have now taken the opportunity to upgrade from 4.0.0 to a 4.1.0 with the fix patches applied. As expected, I cannot produce any image corruption with the reproducer I've been using all along. I will now use it in production and report back. + +For the record: All my images are intact right now, do not have any corruption and have never had any corruption. + +BTW: I have had one image corrupt with identical symptoms (i.e. OFLAG COPIED and such) with an unaffected qemu 4.0.0 because of a completely different bug in the host system's Linux kernel causing panics when booting Windows 10: https://bugzilla.kernel.org/show_bug.cgi?id=205247. So identical image corruption can seemingly have different causes... + +The image was fine before upgrading qemu. I rechecked the image after the first use and it was fine. But after the larger Windows 1903 -> 1909 upgrade done in qemu 4.1.0 the image was damaged. 
I will try the git master version of qemu in the coming days and report back. + +I've done some security updates on my Debian, Windows 7 64 and 32 Bit VMs and quite intensively used a Windows 1903 VM today without any corruption. + +All my images are still fine after some heavy use with qemu-4.1.0 and fix patches applied. Just upgraded to 4.1.1 and will report back. But it certainly looks like this bug is fixed for good. + +My images are still fine after some heavy use with qemu-4.1.1 and no additional patches. I consider this bug fixed for good. Thanks for all your support on this! + +I was unable to compile the qemu-git package and I currently have no time to investigate that. But I updated to 4.1.1. I just started my Windows 10 VM with that and after a short time of use the image was corrupted again. Here is my full start parameter set. Maybe there is something wrong or I should change something? + +qemu-system-x86_64 -cpu Haswell-noTSX -M q35 -enable-kvm -smp 4,cores=4,threads=1,sockets=1 -net nic,model=virtio -net user,hostname=WindowsKVM.local -drive if=none,id=hd,file=hdd.qcow2,discard=unmap -device virtio-scsi-pci,id=scsi --enable-kvm -device scsi-hd,drive=hd -m 4096 -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd -drive if=pflash,format=raw,file=./OVMF_VARS.fd -drive file=Windows10ISO/Windows.iso,index=0,media=cdrom -drive file=virtio-win-0.1.173.iso,index=1,media=cdrom -no-quit + +My Linux VM is still fine. + +I don't see anything suspicious in that command line. My only idea for a different configuration to test would be discard=off, which would remove a few code paths that could contain a bug. + +Anyway, I think it's pretty clear now that you're hitting a different bug than Michael. Maybe it would be better to create a new report, too, and to continue there. + +Did you upgrade from 4.0 to 4.1 when you first hit the bug or was it from an earlier version? 
+ +It would be perfect if you could bisect the problem like Michael did with his, but I understand that this might not be possible soon. Alternatively, I could also debug it myself if I had a clear reproducer (that ideally doesn't involve Windows). + +The qemu 4.1.0 upgrade killed pretty much all my VMs. I had data corruption (i.e. tar was unable to extract some larger data archives for testing purposes) in all my Linux VMs and other strange errors. The Windows VM was killed after I ran "qemu-img check -r all" on the image. Afterwards Windows was damaged beyond repair and unbootable. + +So I reinstalled everything with qemu 4.0, new images and stayed on that version except for testing purposes. Last test was qemu 4.1.1. + +Sadly I currently have no time to investigate this error until March next year. + +FWIW, my VMs run with SATA and Virtio SCSI with discard=unmap and detect-zeroes=unmap (among the plethora of options from libvirtd) for maximum space savings. I have had no problems since the fix patches went in, and these options had no bearing on the bug's occurrence before that. 
+ +/usr/bin/qemu-system-x86_64 -name guest=win10,debug-threads=on -machine pc-q35-3.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off -m 4096 -smp 2,sockets=2,cores=1,threads=1 -drive file=win10.qcow2,format=qcow2,if=none,id=drive-sata0-0-2,cache=writeback,discard=unmap,detect-zeroes=unmap -device ide-hd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2,bootindex=1,write-cache=on + +/usr/bin/qemu-system-x86_64 -name guest=debian,debug-threads=on -machine pc-q35-4.0.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,hle=off,rtm=off,mpx=off -m 512 -smp 1,sockets=1,cores=1,threads=1 -device virtio-scsi-pci,id=scsi0,bus=pci.8,addr=0x0 -drive file=debian.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=writeback,discard=unmap,detect-zeroes=unmap -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,device_id=drive-scsi0-0-0-0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on + + +Commit 5e9785505210 was released in v4.2.0; closing this ticket. 
+ diff --git a/results/classifier/zero-shot/105/semantic/1848231 b/results/classifier/zero-shot/105/semantic/1848231 new file mode 100644 index 000000000..7bb6c7245 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1848231 @@ -0,0 +1,41 @@ +semantic: 0.583 +mistranslation: 0.566 +device: 0.455 +graphic: 0.370 +other: 0.323 +socket: 0.196 +network: 0.109 +instruction: 0.107 +vnc: 0.091 +boot: 0.066 +KVM: 0.062 +assembly: 0.056 + +serial/parallel character devices created for the none-machine + +The none-machine can not be started unless using "-serial null": + +qemu-system-x86_64 -machine none -nographic -monitor stdio +QEMU 3.1.1 monitor - type 'help' for more information +(qemu) qemu-system-x86_64: cannot use stdio by multiple character devices +qemu-system-x86_64: could not connect serial device to character backend 'stdio' +$ + +$ qemu-system-mips -machine none -nographic -serial null -monitor stdio +QEMU 4.1.50 monitor - type 'help' for more information +(qemu) info chardev +parallel0: filename=null +compat_monitor0: filename=stdio +serial0: filename=null +(qemu) + +You can start 'none' without "-serial null". Examples: + +qemu-system-x86_64 -machine none +qemu-system-x86_64 -machine none -monitor stdio +qemu-system-x86_64 -machine none -nographic +qemu-system-x86_64 -machine none -monitor stdio -display none + +Your command line "qemu-system-x86_64 -machine none -nographic -monitor stdio" fails because "-nographic" says "please create a serial port using stdio" but "-monitor stdio" tries to use stdio for something else. You get the same message for any machine (eg "pc"), not just "none". If what you wanted was "just don't create the graphical display" that's "-display none" -- "-nographic" is a collection of things including both 'no display' and also 'default to creating a serial device to stdio' and 'default to creating a monitor muxed with that serial'. 
+ + diff --git a/results/classifier/zero-shot/105/semantic/185 b/results/classifier/zero-shot/105/semantic/185 new file mode 100644 index 000000000..656d19e0b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/185 @@ -0,0 +1,14 @@ +semantic: 0.381 +boot: 0.300 +graphic: 0.244 +KVM: 0.154 +vnc: 0.142 +mistranslation: 0.128 +network: 0.115 +device: 0.062 +assembly: 0.013 +instruction: 0.013 +other: 0.002 +socket: 0.001 + +Coroutines: Audit use of "coroutine_fn" specifier diff --git a/results/classifier/zero-shot/105/semantic/1851095 b/results/classifier/zero-shot/105/semantic/1851095 new file mode 100644 index 000000000..2175a6140 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1851095 @@ -0,0 +1,85 @@ +semantic: 0.924 +instruction: 0.919 +graphic: 0.904 +other: 0.896 +device: 0.892 +assembly: 0.863 +socket: 0.799 +vnc: 0.799 +network: 0.764 +boot: 0.751 +mistranslation: 0.690 +KVM: 0.493 + +[feature request] awareness of instructions that are well emulated + +While qemu's scalar emulation tends to be excellent, qemu's SIMD emulation tends to be incorrect (except for arm64 from x86_64). Until these code paths are audited, which is probably a large job, it would be nice if qemu knew its emulation of this class of instructions was not very good, and thus it would give up on finding these instructions if a "careful" operation is passed. + +Here is a pull request for the zig language that runs into this problems in qemu https://github.com/ziglang/zig/pull/2945/ + +I have more code for validation if someone is working on this. + +On Sun, 3 Nov 2019 at 04:41, Shawn Landden <email address hidden> wrote: +> While qemu's scalar emulation tends to be excellent, qemu's SIMD +> emulation tends to be incorrect (except for arm64 from x86_64)--i have +> found this both for mipsel and arm32. 
Until these code paths are +> audited, which is probably a large job, it would be nice if qemu knew +> its emulation of this class of instructions was not very good, and thus +> it would give up on finding these instructions if a "careful" operation +> is passed. + +I'm not sure how this could work. If QEMU reports (via ID regs +etc) to the guest that it supports instruction class X when it +does not, that's a bug and we should fix it. If QEMU implements +an instruction but gets it wrong, that's also a bug and we should +fix it. In both cases, we'd need to have specific bug reports, +ideally with reproduce-cases. But we don't really have "known +areas where the emulation is incorrect" that we could somehow +differentiate and disable (except at a very vague level, eg +"probably better not to rely on the x86 emulation"). + +You might be able by careful selection of the cpu type to avoid +CPUs which implement vector operations. Some architectures +also allow individual CPU features to be disabled with extra +'-foo' qualifiers on the -cpu argument. + +For Arm in particular (32 or 64 bit) I believe our implementation +should be correct and am happy to look at bug reports for that. + +thanks +-- PMM + + +ok, here is a double precision exponent implementation that works on arm32 hardware, but fails in qemu with the wrong checksum. https://github.com/shawnl/zig-libmvec/blob/master/exp.zig + +You need to build zig with the above patch-set. + +I guess I am starting from a pessimistic perspective, where I have only ever seen SIMD work with arm64 emulation (which is quite new), and am sorry for that. + +Can you please provide a binary (preferably statically built or with required shared libraries attached)? 
+ +Thanks, + +Laurent + +example binary doing double-precision exponent on 16 megs + +expected output: + +checksum: f181b401cd42aa7b + +actual output: + +checksum: 4004022b0ba624fb + + +Here is the same thing compiled with optimizations on + +It appears the random number generator produces different results on 32-bit arches, while my code seems to work fine in qemu + +I can confirm bench_simple gives the same result on both qemu-arm and my aarch32 hardware. + +Can you provide a clearer repro example of what doesn't work on the mipsel platform? + +In the last two QEMU releases mips (Wave) developers went to great lengths making sure both mips SIMD and mips FP instructions (in both scalar and vector variants) are emulated properly. Some of the unit tests were published, but also many were left internal, and there are many integration tests devised and run as well. We in mips (Wave) consider these two areas well tested. Still, we'll seriously consider fixing your example if you can show experimentally that this is a mips-related bug; just provide us with a reasonably convenient repro procedure. + diff --git a/results/classifier/zero-shot/105/semantic/1856335 b/results/classifier/zero-shot/105/semantic/1856335 new file mode 100644 index 000000000..5f20608dc --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1856335 @@ -0,0 +1,1085 @@ +semantic: 0.926 +graphic: 0.883 +instruction: 0.851 +assembly: 0.836 +device: 0.825 +mistranslation: 0.805 +vnc: 0.803 +socket: 0.794 +other: 0.750 +network: 0.677 +KVM: 0.582 +boot: 0.506 + +Cache Layout wrong on many Zen Arch CPUs + +AMD CPUs have L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems to always map the cache as if it were a 4-core-per-CCX CPU, which is incorrect, and costs upwards of 30% performance (more realistically 10%) in L3 cache layout aware applications. 
+ +Example on a 4-CCX CPU (1950X /w 8 Cores and no SMT): + + <cpu mode='custom' match='exact' check='full'> + <model fallback='forbid'>EPYC-IBPB</model> + <vendor>AMD</vendor> + <topology sockets='1' cores='8' threads='1'/> + +In Windows, coreinfo reports correctly: + +****---- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64 +----**** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64 + +On a 3-CCX CPU (3960X /w 6 cores and no SMT): + + <cpu mode='custom' match='exact' check='full'> + <model fallback='forbid'>EPYC-IBPB</model> + <vendor>AMD</vendor> + <topology sockets='1' cores='6' threads='1'/> + +In Windows, coreinfo reports incorrectly: + +****-- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64 +----** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64 + + +Validated against 3.0, 3.1, 4.1 and 4.2 versions of qemu-kvm. + +With newer Qemu there is a fix (that does behave correctly) in using the dies parameter: + <qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/> + +The problem is that the dies are exposed differently from how AMD does it natively: they are exposed to Windows as sockets, which means you can never have a machine with more than two CCXs (6 cores), as Windows only supports two sockets. (Should this be reported as a separate bug?) + +Hi, + +I've since confirmed that this bug also exists (as expected) on Linux guests, as well as on Zen1 EPYC 7401 CPUs, to make sure this wasn't a problem with the detection of the newer consumer platform. + +Basically it seems (looking at the code with layman eyes) that as long as you have a topology that is divisible by 4 or 8, it will always result in the wrong topology being exposed to the guest, even when the correct option can be built (12- and 24-core CPUs). It would also be great if we could support 9-core VM CPUs, as that is a reasonable use case for VMs (3 CCXs of 3 cores for a total of 9 cores, or 18 SMT threads). 
+ +Pinging the author and committer of the TopoEXT feature / EPYC cpu model as they should probably know best how to solve this issue. + +This is the commit I am referencing: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=8f4202fb1080f86958782b1fca0bf0279f67d136 + +Damir, + We normally test Linux guests here. Can you please give me the exact qemu command line? Even just the SMP parameters (sockets, cores, threads, dies) will work. I will try to recreate it locally first. +Give me an example of what works and what does not work. + +I have recently sent a few more patches to fix another bug. Please check if this makes any difference. +https://patchwork.kernel.org/cover/11272063/ +https://lore.kernel<email address hidden>/ + +This should apply cleanly on git://github.com/ehabkost/qemu.git (branch x86-next) + +Note: I will be on vacation until the first week of Jan. Responses will be delayed. + +Same problem for Ryzen 9 3900X. There should be 4x L3 caches, but there are only 3. + +Same results with "host-passthrough" and "EPYC-IBPB". Windows doesn't recognize the correct L3 cache layout. 
+ +From coreinfo.exe: + +Logical Processor to Cache Map: +**---------------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64 +********---------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64 +--**-------------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------------- Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64 +----**------------------ Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------------ Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------------ Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64 +------**---------------- Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------------- Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64 +--------**-------------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------------- Instruction Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------------- Unified Cache 5, Level 2, 512 KB, Assoc 8, LineSize 64 +--------********-------- Unified Cache 6, Level 3, 16 MB, Assoc 16, LineSize 64 +----------**------------ Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------------ Instruction Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------------ Unified Cache 7, Level 2, 512 KB, Assoc 8, LineSize 64 +------------**---------- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---------- Instruction Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---------- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64 +--------------**-------- Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 
+--------------**-------- Instruction Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------**-------- Unified Cache 9, Level 2, 512 KB, Assoc 8, LineSize 64 +----------------**------ Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------**------ Instruction Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------**------ Unified Cache 10, Level 2, 512 KB, Assoc 8, LineSize 64 +----------------******** Unified Cache 11, Level 3, 16 MB, Assoc 16, LineSize 64 +------------------**---- Data Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64 +------------------**---- Instruction Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64 +------------------**---- Unified Cache 12, Level 2, 512 KB, Assoc 8, LineSize 64 +--------------------**-- Data Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------------**-- Instruction Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------------**-- Unified Cache 13, Level 2, 512 KB, Assoc 8, LineSize 64 +----------------------** Data Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------------** Instruction Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------------** Unified Cache 14, Level 2, 512 KB, Assoc 8, LineSize 64 + + +AMD does not use dies. For AMD dies is normally set to 1. You probably have to pass dies in some other ways. Did you try the latest qemu v 5.0? Please try it. + +Qemu expects the user to configure the topology based on their requirement. + +Try replacing <qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/> +with <qemu:arg value='cores=6,threads=1,dies=1,sockets=1'/> + +You can also use the numa configuration. There are multiple ways you can achieve your required configuration. + + +Damir, Example of how to use numa configuration. +-smp 16,maxcpus=16,cores=16,threads=1,sockets=1 -numa node,nodeid=0,cpus=0-7 -numa node,nodeid=1,cpus=8-15 + +This will help to put all the cores in correct L3 boundary. I strongly suggest to use the latest qemu release. 
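The `-numa` layout suggested in the comment above can be generated mechanically. The following is only a rough sketch: the helper name `numa_args` is hypothetical, it assumes guest vCPUs are numbered contiguously and split evenly across nodes, and it emits nothing beyond the `-smp` and `-numa node,nodeid=...,cpus=...` options already quoted in this thread.

```python
def numa_args(vcpus, nodes):
    """Build -smp/-numa options that split `vcpus` evenly over `nodes` guest NUMA nodes.

    Hypothetical helper for illustration only; it mirrors the option
    strings quoted in the thread and is not part of QEMU or libvirt.
    """
    if vcpus % nodes != 0:
        raise ValueError("vcpus must divide evenly across nodes")
    per_node = vcpus // nodes
    args = ["-smp", f"{vcpus},maxcpus={vcpus},cores={vcpus},threads=1,sockets=1"]
    for node in range(nodes):
        first = node * per_node          # first guest CPU in this node
        last = first + per_node - 1      # last guest CPU in this node
        args += ["-numa", f"node,nodeid={node},cpus={first}-{last}"]
    return args


# Reproduces the 16-vCPU, 2-node example from the comment above.
print(" ".join(numa_args(16, 2)))
```

Whether grouping guest CPUs into NUMA nodes is an acceptable way to get the L3 boundaries right is disputed later in this thread; the sketch only shows how the option strings are formed.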
+ +It could be an issue of how the kernel presents the CPU topology. + +Hardware: AMD Ryzen 3900X 12 core 24 threads (SMT) +Host: Kernel 5.6.6, QEMU 4.2 + +virsh capabilities | grep "cpu id" + <cpu id='0' socket_id='0' core_id='0' siblings='0,12'/> + <cpu id='1' socket_id='0' core_id='1' siblings='1,13'/> + <cpu id='2' socket_id='0' core_id='2' siblings='2,14'/> + <cpu id='3' socket_id='0' core_id='4' siblings='3,15'/> + <cpu id='4' socket_id='0' core_id='5' siblings='4,16'/> + <cpu id='5' socket_id='0' core_id='6' siblings='5,17'/> + <cpu id='6' socket_id='0' core_id='8' siblings='6,18'/> + <cpu id='7' socket_id='0' core_id='9' siblings='7,19'/> + <cpu id='8' socket_id='0' core_id='10' siblings='8,20'/> + <cpu id='9' socket_id='0' core_id='12' siblings='9,21'/> + <cpu id='10' socket_id='0' core_id='13' siblings='10,22'/> + <cpu id='11' socket_id='0' core_id='14' siblings='11,23'/> + <cpu id='12' socket_id='0' core_id='0' siblings='0,12'/> + <cpu id='13' socket_id='0' core_id='1' siblings='1,13'/> + <cpu id='14' socket_id='0' core_id='2' siblings='2,14'/> + <cpu id='15' socket_id='0' core_id='4' siblings='3,15'/> + <cpu id='16' socket_id='0' core_id='5' siblings='4,16'/> + <cpu id='17' socket_id='0' core_id='6' siblings='5,17'/> + <cpu id='18' socket_id='0' core_id='8' siblings='6,18'/> + <cpu id='19' socket_id='0' core_id='9' siblings='7,19'/> + <cpu id='20' socket_id='0' core_id='10' siblings='8,20'/> + <cpu id='21' socket_id='0' core_id='12' siblings='9,21'/> + <cpu id='22' socket_id='0' core_id='13' siblings='10,22'/> + <cpu id='23' socket_id='0' core_id='14' siblings='11,23'/> + +See how cpu id=3 gets core id=4, and cpu id=6 gets core id=8, etc. 
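To make the effect of the skipped core ids concrete, here is a small sketch. It assumes (this is an assumption, not something read from QEMU's code) that each CCX reserves a block of 4 core ids, so the CCX index is `core_id // 4`. The host list is copied from the `virsh capabilities` output above; the "guest" list is a hypothetical VM whose 12 cores get contiguous ids.

```python
from collections import defaultdict

# Host core_id values for CPUs 0-11, copied from the virsh capabilities output above.
HOST_CORE_IDS = [0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14]

# Assumption for this sketch: every CCX owns a block of 4 core ids,
# even when one core in the block is disabled.
CORE_IDS_PER_CCX = 4


def group_by_ccx(core_ids, ids_per_ccx=CORE_IDS_PER_CCX):
    """Map CCX index -> list of CPU numbers, using core_id // ids_per_ccx."""
    groups = defaultdict(list)
    for cpu, core_id in enumerate(core_ids):
        groups[core_id // ids_per_ccx].append(cpu)
    return dict(groups)


host = group_by_ccx(HOST_CORE_IDS)
guest = group_by_ccx(list(range(12)))  # hypothetical guest with contiguous core ids

print(host)   # 4 groups of 3 cores each
print(guest)  # 3 groups of 4 cores each
```

Under these assumptions the host falls into 4 groups of 3 cores, while contiguous ids fall into 3 groups of 4 cores, which matches the 3 L3 caches of 8 threads each that coreinfo reports in the guest.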
+ +cat /sys/devices/system/cpu/cpu2/topology/core_id +2 + +cat /sys/devices/system/cpu/cpu3/topology/core_id +4 + +However, the association of CPU IDs to L3 caches seems to be correct: + +echo "Level CPU list";for file in /sys/devices/system/cpu/cpu*/cache/index3; do echo $(cat $file/id) " " $(cat $file/shared_cpu_list); done | sort --version-sort +Level CPU list +0 0-2,12-14 +0 0-2,12-14 +0 0-2,12-14 +0 0-2,12-14 +0 0-2,12-14 +0 0-2,12-14 +1 3-5,15-17 +1 3-5,15-17 +1 3-5,15-17 +1 3-5,15-17 +1 3-5,15-17 +1 3-5,15-17 +2 6-8,18-20 +2 6-8,18-20 +2 6-8,18-20 +2 6-8,18-20 +2 6-8,18-20 +2 6-8,18-20 +3 9-11,21-23 +3 9-11,21-23 +3 9-11,21-23 +3 9-11,21-23 +3 9-11,21-23 +3 9-11,21-23 + +There are 4 L3 caches with the correct CPU lists (6 CPUs/threads each). + +Is it possible that this weird CPU ID enumeration is causing the confusion? + +Haven't had a chance to check out QEMU 5.0, but hope to do that today. + +Finally installed QEMU 5.0.0.154 - still the same. QEMU doesn't recognize the L3 caches and still lists 3 L3 caches instead of 4 with 3 cores/6 threads. 
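The sysfs listing earlier in this comment repeats each L3 entry once per CPU. A minimal sketch of how that output could be collapsed into one entry per L3 cache follows; the function names are hypothetical, and the sample lines are a subset copied from the listing above rather than read from a live system.

```python
def parse_cpu_list(text):
    """Expand a sysfs-style CPU list such as '0-2,12-14' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def unique_l3(lines):
    """Collapse '<cache id> <shared_cpu_list>' lines into {cache id: [cpus]}."""
    caches = {}
    for line in lines:
        cache_id, cpu_list = line.split()
        caches[int(cache_id)] = parse_cpu_list(cpu_list)
    return caches


# Subset of the listing above (duplicates included on purpose).
SAMPLE = ["0 0-2,12-14", "0 0-2,12-14", "1 3-5,15-17", "2 6-8,18-20", "3 9-11,21-23"]

l3 = unique_l3(SAMPLE)
for cache_id, cpus in sorted(l3.items()):
    print(cache_id, cpus)
```

For the sample this yields 4 distinct L3 caches with 6 logical CPUs each, which is the host layout the guest is expected to mirror.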
+ +Here the vm.log with the qemu command line (shortened): + +2020-05-03 18:23:38.674+0000: starting up libvirt version: 5.10.0, qemu version: 5.0.50v5.0.0-154-g2ef486e76d-dirty, kernel: 5.4.36-1-MANJARO + +-machine pc-q35-4.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off,kernel_irqchip=on,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \ +-cpu host,invtsc=on,hypervisor=on,topoext=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-synic,hv-stimer,hv-vendor-id=AuthenticAMD,hv-frequencies,hv-crash,kvm=off,host-cache-info=on,l3-cache=off \ +-m 49152 \ +-mem-prealloc \ +-mem-path /dev/hugepages/libvirt/qemu/1-win10 \ +-overcommit mem-lock=off \ +-smp 24,sockets=1,cores=12,threads=2 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=34,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=localtime,driftfix=slew \ +-global kvm-pit.lost_tick_policy=delay \ +-no-hpet \ +-no-shutdown \ +-global ICH9-LPC.disable_s3=1 \ +-global ICH9-LPC.disable_s4=1 \ +-boot menu=off,strict=on \ + + +Hi Seiger, +I am not an expert on libvirt. I mostly use qemu command line for my test. I was able to achieve the 3960X configuration with the following command line. + +# qemu-system-x86_64 -name rhel7 -m 16384 -smp 24,cores=12,threads=2,sockets=1 -hda vdisk.qcow2 -enable-kvm -net nic -net bridge,br=virbr0,helper=/usr/libexec/qemu-bridge-helper -cpu host,+topoext -nographic -numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-11 -numa node,nodeid=2,cpus=12-17 -numa node,nodeid=3,cpus=18-23 + +Basically qemu does not have all the information to build the topology for every configuration. It depends on libvirt for that information. See if this combination works for you. + +Hello Babu, + +Thanks for the reply and the QEMU command line. I will try to implement it in the XML. 
+ +So essentially what you do is to define each group of cpus and associate them with a numa node: + +-numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-11 -numa node,nodeid=2,cpus=12-17 -numa node,nodeid=3,cpus=18-23 + +Haven't tried it but that might work. Do you need QEMU 5.0 for this to work, or is 4.2 OK? + +Yes, Sieger. Please install 5.0; it should work fine. I am not sure about 4.2. + +Hello, + +I took a look today at the layouts when using a 1950X (which previously worked, and yes, admittedly, I am using Windows / coreinfo). Previously, a basic config as simple as Sockets=1,Cores=8,Threads=1 (now also Dies=1) worked, but now the topology presents as if all cores share L3 and every two cores share L1C/L1D/L2, as if they were SMT siblings. I would call this a serious regression. + +I don't think using NUMA nodes is an OK way to solve this (especially not when, at least for 4CCX CPUs, this worked flawlessly before), as that will make NUMA-aware applications start taking note of NUMA nodes and possibly do weird things (plus, it introduces more configuration where it was not needed before). 
+ +I upgraded to QEMU emulator version 5.0.50 +Using q35-5.1 (the latest) and the following libvirt configuration: + + <memory unit="KiB">50331648</memory> + <currentMemory unit="KiB">50331648</currentMemory> + <memoryBacking> + <hugepages/> + </memoryBacking> + <vcpu placement="static">24</vcpu> + <cputune> + <vcpupin vcpu="0" cpuset="0"/> + <vcpupin vcpu="1" cpuset="12"/> + <vcpupin vcpu="2" cpuset="1"/> + <vcpupin vcpu="3" cpuset="13"/> + <vcpupin vcpu="4" cpuset="2"/> + <vcpupin vcpu="5" cpuset="14"/> + <vcpupin vcpu="6" cpuset="3"/> + <vcpupin vcpu="7" cpuset="15"/> + <vcpupin vcpu="8" cpuset="4"/> + <vcpupin vcpu="9" cpuset="16"/> + <vcpupin vcpu="10" cpuset="5"/> + <vcpupin vcpu="11" cpuset="17"/> + <vcpupin vcpu="12" cpuset="6"/> + <vcpupin vcpu="13" cpuset="18"/> + <vcpupin vcpu="14" cpuset="7"/> + <vcpupin vcpu="15" cpuset="19"/> + <vcpupin vcpu="16" cpuset="8"/> + <vcpupin vcpu="17" cpuset="20"/> + <vcpupin vcpu="18" cpuset="9"/> + <vcpupin vcpu="19" cpuset="21"/> + <vcpupin vcpu="20" cpuset="10"/> + <vcpupin vcpu="21" cpuset="22"/> + <vcpupin vcpu="22" cpuset="11"/> + <vcpupin vcpu="23" cpuset="23"/> + </cputune> + <os> + <type arch="x86_64" machine="pc-q35-5.1">hvm</type> + <loader readonly="yes" type="pflash">/usr/share/OVMF/x64/OVMF_CODE.fd</loader> + <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram> + <boot dev="hd"/> + <bootmenu enable="no"/> + </os> + <features> + <acpi/> + <apic/> + <hyperv> + <relaxed state="on"/> + <vapic state="on"/> + <spinlocks state="on" retries="8191"/> + <vpindex state="on"/> + <synic state="on"/> + <stimer state="on"/> + <vendor_id state="on" value="AuthenticAMD"/> + <frequencies state="on"/> + </hyperv> + <kvm> + <hidden state="on"/> + </kvm> + <vmport state="off"/> + <ioapic driver="kvm"/> + </features> + <cpu mode="host-passthrough" check="none"> + <topology sockets="1" cores="12" threads="2"/> + <cache mode="passthrough"/> + <feature policy="require" name="invtsc"/> + <feature policy="require" 
name="hypervisor"/> + <feature policy="require" name="topoext"/> + <numa> + <cell id="0" cpus="0-2,12-14" memory="12582912" unit="KiB"/> + <cell id="1" cpus="3-5,15-17" memory="12582912" unit="KiB"/> + <cell id="2" cpus="6-8,18-20" memory="12582912" unit="KiB"/> + <cell id="3" cpus="9-11,21-23" memory="12582912" unit="KiB"/> + </numa> + </cpu> + +... + +/var/log/libvirt/qemu/win10.log: + +-machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,kernel_irqchip=on,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \ +-cpu host,invtsc=on,hypervisor=on,topoext=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-synic,hv-stimer,hv-vendor-id=AuthenticAMD,hv-frequencies,hv-crash,kvm=off,host-cache-info=on,l3-cache=off \ +-m 49152 \ +-overcommit mem-lock=off \ +-smp 24,sockets=1,cores=12,threads=2 \ +-mem-prealloc \ +-mem-path /dev/hugepages/libvirt/qemu/3-win10 \ +-numa node,nodeid=0,cpus=0-2,cpus=12-14,mem=12288 \ +-numa node,nodeid=1,cpus=3-5,cpus=15-17,mem=12288 \ +-numa node,nodeid=2,cpus=6-8,cpus=18-20,mem=12288 \ +-numa node,nodeid=3,cpus=9-11,cpus=21-23,mem=12288 \ +... + +For some reason I always get l3-cache=off. 
+ +CoreInfo.exe in Windows 10 then produces the following report (shortened): + +Logical to Physical Processor Map: +**---------------------- Physical Processor 0 (Hyperthreaded) +--*--------------------- Physical Processor 1 +---*-------------------- Physical Processor 2 +----**------------------ Physical Processor 3 (Hyperthreaded) +------**---------------- Physical Processor 4 (Hyperthreaded) +--------*--------------- Physical Processor 5 +---------*-------------- Physical Processor 6 +----------**------------ Physical Processor 7 (Hyperthreaded) +------------**---------- Physical Processor 8 (Hyperthreaded) +--------------*--------- Physical Processor 9 +---------------*-------- Physical Processor 10 +----------------**------ Physical Processor 11 (Hyperthreaded) +------------------**---- Physical Processor 12 (Hyperthreaded) +--------------------*--- Physical Processor 13 +---------------------*-- Physical Processor 14 +----------------------** Physical Processor 15 (Hyperthreaded) + +Logical Processor to Socket Map: +************************ Socket 0 + +Logical Processor to NUMA Node Map: +***---------***--------- NUMA Node 0 +---***---------***------ NUMA Node 1 +------***---------***--- NUMA Node 2 +---------***---------*** NUMA Node 3 + +Approximate Cross-NUMA Node Access Cost (relative to fastest): + 00 01 02 03 +00: 1.4 1.2 1.1 1.2 +01: 1.1 1.1 1.3 1.1 +02: 1.0 1.1 1.0 1.2 +03: 1.1 1.2 1.2 1.2 + +Logical Processor to Cache Map: +**---------------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64 +***--------------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64 +--*--------------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--*--------------------- Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--*--------------------- Unified Cache 2, Level 2, 
512 KB, Assoc 8, LineSize 64 +---*-------------------- Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +---*-------------------- Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +---*-------------------- Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64 +---***------------------ Unified Cache 4, Level 3, 16 MB, Assoc 16, LineSize 64 +----**------------------ Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------------ Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------------ Unified Cache 5, Level 2, 512 KB, Assoc 8, LineSize 64 +------**---------------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------------- Instruction Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------------- Unified Cache 6, Level 2, 512 KB, Assoc 8, LineSize 64 +------**---------------- Unified Cache 7, Level 3, 16 MB, Assoc 16, LineSize 64 +--------*--------------- Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +--------*--------------- Instruction Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +--------*--------------- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64 +--------*--------------- Unified Cache 9, Level 3, 16 MB, Assoc 16, LineSize 64 +---------*-------------- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +---------*-------------- Instruction Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +---------*-------------- Unified Cache 10, Level 2, 512 KB, Assoc 8, LineSize 64 +---------***------------ Unified Cache 11, Level 3, 16 MB, Assoc 16, LineSize 64 +----------**------------ Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------------ Instruction Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------------ Unified Cache 12, Level 2, 512 KB, Assoc 8, LineSize 64 +------------**---------- Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---------- Instruction Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---------- 
Unified Cache 13, Level 2, 512 KB, Assoc 8, LineSize 64 +------------***--------- Unified Cache 14, Level 3, 16 MB, Assoc 16, LineSize 64 +--------------*--------- Data Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------*--------- Instruction Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------*--------- Unified Cache 15, Level 2, 512 KB, Assoc 8, LineSize 64 +---------------*-------- Data Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64 +---------------*-------- Instruction Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64 +---------------*-------- Unified Cache 16, Level 2, 512 KB, Assoc 8, LineSize 64 +---------------*-------- Unified Cache 17, Level 3, 16 MB, Assoc 16, LineSize 64 +----------------**------ Data Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------**------ Instruction Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------**------ Unified Cache 18, Level 2, 512 KB, Assoc 8, LineSize 64 +----------------**------ Unified Cache 19, Level 3, 16 MB, Assoc 16, LineSize 64 +------------------**---- Data Cache 12, Level 1, 32 KB, Assoc 8, LineSize 64 +------------------**---- Instruction Cache 12, Level 1, 32 KB, Assoc 8, LineSize 64 +------------------**---- Unified Cache 20, Level 2, 512 KB, Assoc 8, LineSize 64 +------------------***--- Unified Cache 21, Level 3, 16 MB, Assoc 16, LineSize 64 +--------------------*--- Data Cache 13, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------------*--- Instruction Cache 13, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------------*--- Unified Cache 22, Level 2, 512 KB, Assoc 8, LineSize 64 +---------------------*-- Data Cache 14, Level 1, 32 KB, Assoc 8, LineSize 64 +---------------------*-- Instruction Cache 14, Level 1, 32 KB, Assoc 8, LineSize 64 +---------------------*-- Unified Cache 23, Level 2, 512 KB, Assoc 8, LineSize 64 +---------------------*** Unified Cache 24, Level 3, 16 MB, Assoc 16, LineSize 64 +----------------------** Data Cache 15, Level 1, 32 KB, Assoc 8, 
LineSize 64 +----------------------** Instruction Cache 15, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------------** Unified Cache 25, Level 2, 512 KB, Assoc 8, LineSize 64 + +Logical Processor to Group Map: +************************ Group 0 + + +The above result is even further away from the actual L3 cache configuration. + +So numatune doesn't produce the expected outcome. + + +Same problem here on 5.0 and 3900x (3 cores per CCX). And as stated before - declaring NUMA nodes is definitely not the right solution if the aim is to emulate the host CPU as closely as possible. + +The problem is that disabled cores are not taken into account. ALL Zen2 CPUs have an L3 cache group per CCX and every CCX has 4 cores. However, some cores in each CCX (1 for 6 and 12-core CPUs, 2 for 3100) are disabled for some models, but they still use their core ids (as can be seen in the virsh capabilities | grep "cpu id" output in the comments above). Looking at target/i386/cpu.c:5529, this is not taken into account. + +Maybe the cleanest way to fix this is to emulate the host topology by also skipping disabled core ids in the VM? That way, die offset will actually match the real host CPU topology... + +A workaround for Linux VMs is to disable CPUs (and set their number/pinnings accordingly, e.g. every 4th (and 3rd for 3100) core is going to be 'dummy' and disabled system-wide) by e.g. echo 0 > /sys/devices/system/cpu/cpu3/online + +No good workaround for Windows VMs exists, as far as I know - the best you can do is set affinity for specific process(es) and avoid the 'dummy' CPUs, but I am not aware of any possibility to disable specific CPUs (only limiting the overall number). + +Hi Jan, + +The problem for me now is why every config (that I can figure out) now results in SMT on/L3 across all cores, which is obviously never true on Zen unless you have fewer than 4 cores. 8 cores should always result in 2 L3 caches, and so should 16 threads /w 8+SMT. 
This worked in my initial post. + +The latest qemu has removed all the hard-coded configurations for AMD and leaves everything to the user to customize. One way to configure this is using numa nodes. This will make sure cpus under one numa node share the same L3. Then pin the correct host cpus to guest cpus using vcpupin. I would change this -numa node,nodeid=0,cpus=0-2,cpus=12-14,mem=12288 to -numa node,nodeid=0,cpus=0-2,cpus=3-5,mem=12288. Then have vcpupin map the correct host cpu to guest cpu. Check if this works for you. Can you please post lscpu output from the host for everybody's understanding? + + +No, creating artificial NUMA nodes is, simply put, never a good solution for CPUs that operate as a single NUMA node - which is the case for all Zen2 CPUs (except maybe EPYCs? not sure about those). + +You may work around the L3 issue that way, but you will hit many new bugs/problems by introducing multiple NUMA nodes, _especially_ on Windows VMs, because that OS has crappy NUMA handling and a multitude of bugs related to it - which was one of the major reasons why even Zen2 Threadrippers are now a single NUMA node (e.g. https://www.servethehome.com/wp-content/uploads/2019/11/AMD-Ryzen-Threadripper-3960X-Topology.png ). + +The host CPU architecture should be replicated as closely as possible on the VM, and for Zen2 CPUs with 4 cores per CCX, _this already works perfectly_ - there are no problems on 3300X/3700(X)/3800X/3950X/3970X/3990X. + +There is, unfortunately, no way to customize/specify the "disabled" CPU cores in QEMU, and therefore no way to emulate 1 NUMA node + L3 cache per 2/3 cores - only to pass through the cache config from the host, which is unfortunately not done correctly for CPUs with disabled cores (but again, works perfectly for CPUs with all 4 cores enabled per CCX). 
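The Linux-guest workaround described earlier in the thread (pad every guest CCX to 4 cores and take the extra ones offline) needs a list of which guest vCPUs are the padding. A minimal sketch follows, assuming 4 core slots per guest CCX and SMT siblings numbered adjacently in the guest; the function name `dummy_vcpus` and its parameters are hypothetical.

```python
def dummy_vcpus(ccx_count, real_cores_per_ccx, threads_per_core, slots_per_ccx=4):
    """Return the guest vCPU numbers that pad each CCX up to `slots_per_ccx` cores.

    Assumes guest cores are numbered CCX by CCX and that the threads of
    one core are adjacent (core N owns vCPUs N*threads .. N*threads+threads-1).
    """
    dummies = []
    for core in range(ccx_count * slots_per_ccx):
        if core % slots_per_ccx >= real_cores_per_ccx:
            first = core * threads_per_core
            dummies.extend(range(first, first + threads_per_core))
    return dummies


# 3 CCXs with 3 real cores each and 2 threads per core, padded to 4 cores per CCX.
print(dummy_vcpus(3, 3, 2))  # [6, 7, 14, 15, 22, 23]
```

Inside a Linux guest, each listed vCPU would then be taken offline through `/sys/devices/system/cpu/cpuN/online`, as in the workaround quoted above. This does not help Windows guests, which, as noted in the thread, have no equivalent way to disable individual CPUs.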
+ +lscpu: +Architecture: x86_64 +CPU op-mode(s): 32-bit, 64-bit +Byte Order: Little Endian +Address sizes: 43 bits physical, 48 bits virtual +CPU(s): 24 +On-line CPU(s) list: 0-23 +Thread(s) per core: 2 +Core(s) per socket: 12 +Socket(s): 1 +NUMA node(s): 1 +Vendor ID: AuthenticAMD +CPU family: 23 +Model: 113 +Model name: AMD Ryzen 9 3900X 12-Core Processor +Stepping: 0 +Frequency boost: enabled +CPU MHz: 2972.127 +CPU max MHz: 3800.0000 +CPU min MHz: 2200.0000 +BogoMIPS: 7602.55 +Virtualization: AMD-V +L1d cache: 384 KiB +L1i cache: 384 KiB +L2 cache: 6 MiB +L3 cache: 64 MiB +NUMA node0 CPU(s): 0-23 +Vulnerability Itlb multihit: Not affected +Vulnerability L1tf: Not affected +Vulnerability Mds: Not affected +Vulnerability Meltdown: Not affected +Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp +Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization +Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling +Vulnerability Tsx async abort: Not affected +Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonsto + p_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a mi + salignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall fsgsbase b + mi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru + wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid 
overflow_recov succor smca + + +But the important thing has already been posted here in previous comments - notice the skipped core ids belonging to the disabled cores: + +virsh capabilities | grep "cpu id": +<cpu id='0' socket_id='0' core_id='0' siblings='0,12'/> +<cpu id='1' socket_id='0' core_id='1' siblings='1,13'/> +<cpu id='2' socket_id='0' core_id='2' siblings='2,14'/> +<cpu id='3' socket_id='0' core_id='4' siblings='3,15'/> +<cpu id='4' socket_id='0' core_id='5' siblings='4,16'/> +<cpu id='5' socket_id='0' core_id='6' siblings='5,17'/> +<cpu id='6' socket_id='0' core_id='8' siblings='6,18'/> +<cpu id='7' socket_id='0' core_id='9' siblings='7,19'/> +<cpu id='8' socket_id='0' core_id='10' siblings='8,20'/> +<cpu id='9' socket_id='0' core_id='12' siblings='9,21'/> +<cpu id='10' socket_id='0' core_id='13' siblings='10,22'/> +<cpu id='11' socket_id='0' core_id='14' siblings='11,23'/> +<cpu id='12' socket_id='0' core_id='0' siblings='0,12'/> +<cpu id='13' socket_id='0' core_id='1' siblings='1,13'/> +<cpu id='14' socket_id='0' core_id='2' siblings='2,14'/> +<cpu id='15' socket_id='0' core_id='4' siblings='3,15'/> +<cpu id='16' socket_id='0' core_id='5' siblings='4,16'/> +<cpu id='17' socket_id='0' core_id='6' siblings='5,17'/> +<cpu id='18' socket_id='0' core_id='8' siblings='6,18'/> +<cpu id='19' socket_id='0' core_id='9' siblings='7,19'/> +<cpu id='20' socket_id='0' core_id='10' siblings='8,20'/> +<cpu id='21' socket_id='0' core_id='12' siblings='9,21'/> +<cpu id='22' socket_id='0' core_id='13' siblings='10,22'/> +<cpu id='23' socket_id='0' core_id='14' siblings='11,23'/> + +Damir: +Hm, must be some misconfiguration, then. My config for Linux VMs to utilize 3 out of the 4 CCXs. 
Important parts of the libvirt domain XML: + + <vcpu placement="static">24</vcpu> + <iothreads>1</iothreads> + <cputune> + <vcpupin vcpu="0" cpuset="3"/> + <vcpupin vcpu="1" cpuset="15"/> + <vcpupin vcpu="2" cpuset="4"/> + <vcpupin vcpu="3" cpuset="16"/> + <vcpupin vcpu="4" cpuset="5"/> + <vcpupin vcpu="5" cpuset="17"/> + <vcpupin vcpu="6" cpuset="0,12"/> + <vcpupin vcpu="7" cpuset="0,12"/> + <vcpupin vcpu="8" cpuset="6"/> + <vcpupin vcpu="9" cpuset="18"/> + <vcpupin vcpu="10" cpuset="7"/> + <vcpupin vcpu="11" cpuset="19"/> + <vcpupin vcpu="12" cpuset="8"/> + <vcpupin vcpu="13" cpuset="20"/> + <vcpupin vcpu="14" cpuset="0,12"/> + <vcpupin vcpu="15" cpuset="0,12"/> + <vcpupin vcpu="16" cpuset="9"/> + <vcpupin vcpu="17" cpuset="21"/> + <vcpupin vcpu="18" cpuset="10"/> + <vcpupin vcpu="19" cpuset="22"/> + <vcpupin vcpu="20" cpuset="11"/> + <vcpupin vcpu="21" cpuset="23"/> + <vcpupin vcpu="22" cpuset="0,12"/> + <vcpupin vcpu="23" cpuset="0,12"/> + <emulatorpin cpuset="1,13"/> + <iothreadpin iothread="1" cpuset="2,14"/> + </cputune> + <os> + <type arch="x86_64" machine="pc-q35-5.0">hvm</type> + <loader readonly="yes" type="pflash">/usr/share/ovmf/x64/OVMF_CODE.fd</loader> + <nvram>/var/lib/libvirt/qemu/nvram/ccxtest-clone_VARS.fd</nvram> + </os> +. +. +. + <qemu:commandline> + <qemu:arg value="-cpu"/> + <qemu:arg value="host,topoext=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,host-cache-info=on,-amd-stibp"/> + </qemu:commandline> + +The CPUs with cpuset="0,12" are disabled once booted. The host-cache-info=on is the part that makes sure that the cache config is passed to the VM (but unfortunately does not take disabled cores into account, which results in incorrect config). The qemu:commandline is added because I need to add -amd-stibp, otherwise I wouldn't be able to boot. This overrides most parts in the <cpu> XML part. + +"The CPUs with cpuset="0,12" are disabled once booted. 
The host-cache-info=on is the part that makes sure that the cache config is passed to the VM (but unfortunately does not take disabled cores into account, which results in incorrect config). The qemu:commandline is added because I need to add -amd-stibp, otherwise I wouldn't be able to boot. This overrides most parts in the <cpu> XML part." + +Is there an XML equivalent for host-cache-info=on? + +Will that work with model EPYC-IBPB as well? + +Sieger, I am not an expert on XML, so I don't know. Qemu probably cannot handle disabled cores. I am still trying to learn more about this problem. + +With regard to Jan's comment earlier and the virsh capabilities listing the cores and siblings, also note the following lines from virsh capabilities for a 3900X CPU: + + <cache> + <bank id='0' level='3' type='both' size='16' unit='MiB' cpus='0-2,12-14'/> + <bank id='1' level='3' type='both' size='16' unit='MiB' cpus='3-5,15-17'/> + <bank id='2' level='3' type='both' size='16' unit='MiB' cpus='6-8,18-20'/> + <bank id='3' level='3' type='both' size='16' unit='MiB' cpus='9-11,21-23'/> + </cache> + +virsh capabilities is perfectly able to identify the L3 cache structure and associate the right cpus. It would be ideal to just use the above output inside the libvirt domain configuration to "manually" define the L3 cache, or something to that effect on the qemu command line. + +Users could then decide to pin only part of the cpus, usually a multiple of 6 (in the case of the 3900X) to align with the CCX. + +I'm now on kernel 5.6.11 and QEMU v5.0.0.r533.gdebe78ce14-1 (from Arch Linux AUR qemu-git), running q35-5.1. I will try the host-passthrough with host-cache-info=on option Jan posted. Question - is host-cache-info=on the same as <cache mode="passthrough"/> under <cpu mode=host-passthrough...? 
+ +<cache mode="passthrough"/> + +adds "host-cache-info=on,l3-cache=off" + +to the qemu -cpu args + +I believe l3-cache=off is useless with host-cache-info=on + +So <cache mode="passthrough"/> should do what you want. + +Thanks Jan. I had some new hardware/software issues combined with the QEMU 5.0 issues that had my Windows VM crash after some minutes. + +I totally overlooked the following: + <vcpupin vcpu="6" cpuset="0,12"/> + <vcpupin vcpu="7" cpuset="0,12"/> + +So I guess you posted in answer to this: https://www.reddit.com/r/VFIO/comments/erwzrg/think_i_found_a_workaround_to_get_l3_cache_shared/ + +As it's late, I'll try tomorrow. Sorry for all the confusion but I had a real tough time with this Ryzen build. + +Jan, I tried your suggestion but it didn't make a difference. Here is my current setup: + +h/w: AMD Ryzen 9 3900X +kernel: 5.4 +QEMU: 5.0.0-6 +Chipset selection: Q35-5.0 + +Configuration: host-passthrough, cache enabled + +Using CoreInfo.exe inside Windows, the problem is this: + +Logical Processor to Cache Map: +**---------------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64 +********---------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64 + +The last line above should be as follows: + +******------------------ Unified Cache 0, Level 3, 16 MB, Assoc 16, LineSize 64 + +The cache is supposed to be associated with 3 cores with 2 threads each in group 0. Yet it shows 8 (2x4) vcpus inside a cache that is associated with the next group. + +In total, I always get 3 L3 caches instead of 4 L3 caches for my 12 cores / 24 threads. Also see my next post. 
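The numbers reported above (8 vCPUs per L3 and 3 L3 caches in total for 24 threads) are what one would get if the L3 sharing width were rounded up from 6 to the next power of two while vCPU IDs stay sequential. The following is only a sketch of that arithmetic under this assumption; the function names are hypothetical and it does not reproduce QEMU's actual code.

```python
def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def l3_groups(logical_cpus, threads_sharing_l3):
    """Group sequentially numbered logical CPUs by L3 cache.

    Assumption: the sharing width is rounded up to a power of two,
    which matches the Coreinfo output quoted above.
    """
    width = next_pow2(threads_sharing_l3)
    groups = {}
    for cpu in range(logical_cpus):
        groups.setdefault(cpu // width, []).append(cpu)
    return list(groups.values())


if __name__ == "__main__":
    # 3900X: 24 threads, 3 cores x 2 threads = 6 threads per real L3
    for group in l3_groups(24, 6):
        print(group)
```

With 24 logical CPUs and 6 threads per real L3, this yields three groups of eight instead of the four groups of six that the host reports.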
+ + +This is the CPU cache layout as shown by lscpu -a -e + +CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ + 0 0 0 0 0:0:0:0 yes 3800.0000 2200.0000 + 1 0 0 1 1:1:1:0 yes 3800.0000 2200.0000 + 2 0 0 2 2:2:2:0 yes 3800.0000 2200.0000 + 3 0 0 3 3:3:3:1 yes 3800.0000 2200.0000 + 4 0 0 4 4:4:4:1 yes 3800.0000 2200.0000 + 5 0 0 5 5:5:5:1 yes 3800.0000 2200.0000 + 6 0 0 6 6:6:6:2 yes 3800.0000 2200.0000 + 7 0 0 7 7:7:7:2 yes 3800.0000 2200.0000 + 8 0 0 8 8:8:8:2 yes 3800.0000 2200.0000 + 9 0 0 9 9:9:9:3 yes 3800.0000 2200.0000 + 10 0 0 10 10:10:10:3 yes 3800.0000 2200.0000 + 11 0 0 11 11:11:11:3 yes 3800.0000 2200.0000 + 12 0 0 0 0:0:0:0 yes 3800.0000 2200.0000 + 13 0 0 1 1:1:1:0 yes 3800.0000 2200.0000 + 14 0 0 2 2:2:2:0 yes 3800.0000 2200.0000 + 15 0 0 3 3:3:3:1 yes 3800.0000 2200.0000 + 16 0 0 4 4:4:4:1 yes 3800.0000 2200.0000 + 17 0 0 5 5:5:5:1 yes 3800.0000 2200.0000 + 18 0 0 6 6:6:6:2 yes 3800.0000 2200.0000 + 19 0 0 7 7:7:7:2 yes 3800.0000 2200.0000 + 20 0 0 8 8:8:8:2 yes 3800.0000 2200.0000 + 21 0 0 9 9:9:9:3 yes 3800.0000 2200.0000 + 22 0 0 10 10:10:10:3 yes 3800.0000 2200.0000 + 23 0 0 11 11:11:11:3 yes 3800.0000 2200.0000 + +I was trying to allocate cache using the cachetune feature in libvirt, but it turns out to be either misleading or much too complicated to be usable. 
Here is what I tried: + + <vcpu placement="static">24</vcpu> + <cputune> + <vcpupin vcpu="0" cpuset="0"/> + <vcpupin vcpu="1" cpuset="12"/> + <vcpupin vcpu="2" cpuset="1"/> + <vcpupin vcpu="3" cpuset="13"/> + <vcpupin vcpu="4" cpuset="2"/> + <vcpupin vcpu="5" cpuset="14"/> + <vcpupin vcpu="6" cpuset="3"/> + <vcpupin vcpu="7" cpuset="15"/> + <vcpupin vcpu="8" cpuset="4"/> + <vcpupin vcpu="9" cpuset="16"/> + <vcpupin vcpu="10" cpuset="5"/> + <vcpupin vcpu="11" cpuset="17"/> + <vcpupin vcpu="12" cpuset="6"/> + <vcpupin vcpu="13" cpuset="18"/> + <vcpupin vcpu="14" cpuset="7"/> + <vcpupin vcpu="15" cpuset="19"/> + <vcpupin vcpu="16" cpuset="8"/> + <vcpupin vcpu="17" cpuset="20"/> + <vcpupin vcpu="18" cpuset="9"/> + <vcpupin vcpu="19" cpuset="21"/> + <vcpupin vcpu="20" cpuset="10"/> + <vcpupin vcpu="21" cpuset="22"/> + <vcpupin vcpu="22" cpuset="11"/> + <vcpupin vcpu="23" cpuset="23"/> + <cachetune vcpus="0-2,12-14"> + <cache id="0" level="3" type="both" size="16" unit="MiB"/> + <monitor level="3" vcpus="0-2,12-14"/> + </cachetune> + <cachetune vcpus="3-5,15-17"> + <cache id="1" level="3" type="both" size="16" unit="MiB"/> + <monitor level="3" vcpus="3-5,15-17"/> + </cachetune> + <cachetune vcpus="6-8,18-20"> + <cache id="2" level="3" type="both" size="16" unit="MiB"/> + <monitor level="3" vcpus="6-8,18-20"/> + </cachetune> + <cachetune vcpus="9-11,21-23"> + <cache id="3" level="3" type="both" size="16" unit="MiB"/> + <monitor level="3" vcpus="9-11,21-23"/> + </cachetune> + </cputune> + +Unfortunately it gives the following error when I try to start the VM: + +Error starting domain: internal error: Missing or inconsistent resctrl info for memory bandwidth allocation + +I have resctrl mounted like this: + +mount -t resctrl resctrl /sys/fs/resctrl + +This error leads to the following description of how to allocate memory bandwidth: https://software.intel.com/content/www/us/en/develop/articles/use-intel-resource-director-technology-to-allocate-memory-bandwidth.html + +I 
think this is over the top and perhaps I'm trying the wrong approach. All I can say is that every suggestion I've seen and tried so far has led me to one conclusion: QEMU does NOT support the L3 cache layout of the new ZEN 2 arch CPUs such as the Ryzen 9 3900X. + +h-sieger, +that is a misunderstanding, read my comment carefully again: +"A workaround for Linux VMs is to disable CPUs (and setting their number/pinnings accordingly, e.g. every 4th (and 3rd for 3100) core is going to be 'dummy' and disabled system-wide) by e.g. echo 0 > /sys/devices/system/cpu/cpu3/online + +No good workaround for Windows VMs exists, as far as I know - the best you can do is setting affinity to specific process(es) and avoid the 'dummy' CPUs, but I am not aware of any possibility to disable specific CPUs (only limiting the overall number)." + +I do NOT have a fix - only a very ugly workaround for Linux guests only - I cannot fix the cache layout, but on Linux, I can get around that by adding dummy CPUs that I then disable in the guest during startup, so they are not used - effectively making sure that only the correct 6 vCPUs / 3 cores are used. On Windows, you cannot do that, AFAIK. + +Thanks for clarifying, Jan. + +In the meantime I tried a number of so-called solutions published on Reddit and other places, none of which seems to work. + +So if I understand it correctly, there is currently no solution to the incorrect l3 cache layout for Zen architecture CPUs. At best a workaround for Linux guests. + +I hope somebody is looking into that. + +Thanks for clarifying, Jan. + +In the meantime I tried a number of so-called solutions published on Reddit and other places, none of which seems to work. + +So if I understand it correctly, there is currently no solution to the incorrect l3 cache layout for Zen architecture CPUs. At best a workaround for Linux guests. + +I hope somebody is looking into that. 
+ +The problem is caused by the fact that with Ryzen CPUs with disabled cores, the APIC IDs are not sequential on the host - in order for the cache topology to be configured properly, there is a 'hole' in the APIC ID and core ID numbering (I have added the full output of cpuid for my 3900X). Unfortunately, adding holes to the numbering is the only way to achieve what is needed for 3 cores per CCX, as the CPUID Fn8000_001D_EAX NumSharingCache parameter rounds to powers of two (for a Ryzen 3100 with 2 cores per CCX, lowering NumSharingCache should also work, correctly setting the L3 cache cores with their IDs still being sequential). + +A small hack in x86_apicid_from_topo_ids() in include/hw/i386/topology.h can introduce a correct numbering (at least if you do not have epyc set as your cpu; with epyc, the _epyc variants of the functions are used instead). But fixing this properly will probably require some thought - maybe introduce the ability to assign APIC IDs directly somehow? Or the ability to specify the 'holes' somehow in the -smp param, or maybe -cpu host,topoext=on should do this automatically? I don't know... + +e.g. 
For 3 core per CCX CPUs, to fix this, at include/hw/i386/topology.h:220 change: + +(topo_ids->core_id << apicid_core_offset(topo_info)) | + +to + +((topo_ids->core_id + (topo_ids->core_id / 3)) << apicid_core_offset(topo_info)) | + + +The cache topology is now correct (-cpu host,topoext=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,host-cache-info=on -smp 18,sockets=1,dies=1,cores=9,threads=2), even in Windows: + +Logical Processor to Cache Map: +**---------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64 +******------------ Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64 +--**-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------- Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64 +----**------------ Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------ Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------ Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64 +------**---------- Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------- Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64 +------******------ Unified Cache 5, Level 3, 16 MB, Assoc 16, LineSize 64 +--------**-------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------- Instruction Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------- Unified Cache 6, Level 2, 512 KB, Assoc 8, LineSize 64 +----------**------ Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------ Instruction Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------ Unified Cache 7, Level 2, 512 KB, Assoc 8, LineSize 64 +------------**---- Data Cache 6, Level 1, 32 KB, 
Assoc 8, LineSize 64 +------------**---- Instruction Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64 +------------****** Unified Cache 9, Level 3, 16 MB, Assoc 16, LineSize 64 + + + +@Jan: this coreinfo output looks good. + +I finally managed to get the core /cache alignment right, I believe: + + <vcpu placement="static" current="24">32</vcpu> + <vcpus> + <vcpu id="0" enabled="yes" hotpluggable="no"/> + <vcpu id="1" enabled="yes" hotpluggable="yes"/> + <vcpu id="2" enabled="yes" hotpluggable="yes"/> + <vcpu id="3" enabled="yes" hotpluggable="yes"/> + <vcpu id="4" enabled="yes" hotpluggable="yes"/> + <vcpu id="5" enabled="yes" hotpluggable="yes"/> + <vcpu id="6" enabled="no" hotpluggable="yes"/> + <vcpu id="7" enabled="no" hotpluggable="yes"/> + <vcpu id="8" enabled="yes" hotpluggable="yes"/> + <vcpu id="9" enabled="yes" hotpluggable="yes"/> + <vcpu id="10" enabled="yes" hotpluggable="yes"/> + <vcpu id="11" enabled="yes" hotpluggable="yes"/> + <vcpu id="12" enabled="yes" hotpluggable="yes"/> + <vcpu id="13" enabled="yes" hotpluggable="yes"/> + <vcpu id="14" enabled="no" hotpluggable="yes"/> + <vcpu id="15" enabled="no" hotpluggable="yes"/> + <vcpu id="16" enabled="yes" hotpluggable="yes"/> + <vcpu id="17" enabled="yes" hotpluggable="yes"/> + <vcpu id="18" enabled="yes" hotpluggable="yes"/> + <vcpu id="19" enabled="yes" hotpluggable="yes"/> + <vcpu id="20" enabled="yes" hotpluggable="yes"/> + <vcpu id="21" enabled="yes" hotpluggable="yes"/> + <vcpu id="22" enabled="no" hotpluggable="yes"/> + <vcpu id="23" enabled="no" hotpluggable="yes"/> + <vcpu id="24" enabled="yes" hotpluggable="yes"/> + <vcpu id="25" enabled="yes" hotpluggable="yes"/> + <vcpu id="26" enabled="yes" hotpluggable="yes"/> + <vcpu id="27" enabled="yes" hotpluggable="yes"/> + <vcpu id="28" enabled="yes" hotpluggable="yes"/> + <vcpu id="29" enabled="yes" hotpluggable="yes"/> + <vcpu id="30" enabled="no" hotpluggable="yes"/> + <vcpu 
id="31" enabled="no" hotpluggable="yes"/> + </vcpus> + <cputune> + <vcpupin vcpu="0" cpuset="0"/> + <vcpupin vcpu="1" cpuset="12"/> + <vcpupin vcpu="2" cpuset="1"/> + <vcpupin vcpu="3" cpuset="13"/> + <vcpupin vcpu="4" cpuset="2"/> + <vcpupin vcpu="5" cpuset="14"/> + <vcpupin vcpu="8" cpuset="3"/> + <vcpupin vcpu="9" cpuset="15"/> + <vcpupin vcpu="10" cpuset="4"/> + <vcpupin vcpu="11" cpuset="16"/> + <vcpupin vcpu="12" cpuset="5"/> + <vcpupin vcpu="13" cpuset="17"/> + <vcpupin vcpu="16" cpuset="6"/> + <vcpupin vcpu="17" cpuset="18"/> + <vcpupin vcpu="18" cpuset="7"/> + <vcpupin vcpu="19" cpuset="19"/> + <vcpupin vcpu="20" cpuset="8"/> + <vcpupin vcpu="21" cpuset="20"/> + <vcpupin vcpu="24" cpuset="9"/> + <vcpupin vcpu="25" cpuset="21"/> + <vcpupin vcpu="26" cpuset="10"/> + <vcpupin vcpu="27" cpuset="22"/> + <vcpupin vcpu="28" cpuset="11"/> + <vcpupin vcpu="29" cpuset="23"/> + </cputune> + +... + <cpu mode="host-passthrough" check="none"> + <topology sockets="1" dies="1" cores="16" threads="2"/> + <cache mode="passthrough"/> + + +The Windows Coreinfo output is this: + +Logical to Physical Processor Map: +**---------------- Physical Processor 0 (Hyperthreaded) +--**-------------- Physical Processor 1 (Hyperthreaded) +----**------------ Physical Processor 2 (Hyperthreaded) +------**---------- Physical Processor 3 (Hyperthreaded) +--------**-------- Physical Processor 4 (Hyperthreaded) +----------**------ Physical Processor 5 (Hyperthreaded) +------------**---- Physical Processor 6 (Hyperthreaded) +--------------**-- Physical Processor 7 (Hyperthreaded) +----------------** Physical Processor 8 (Hyperthreaded) + +Logical Processor to Socket Map: +****************** Socket 0 + +Logical Processor to NUMA Node Map: +****************** NUMA Node 0 + +No NUMA nodes. 
+ +Logical Processor to Cache Map: +**---------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64 +**---------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64 +******------------ Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64 +--**-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------- Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64 +--**-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64 +----**------------ Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------ Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64 +----**------------ Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64 +------**---------- Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------- Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64 +------**---------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64 +------******------ Unified Cache 5, Level 3, 16 MB, Assoc 16, LineSize 64 +--------**-------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------- Instruction Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64 +--------**-------- Unified Cache 6, Level 2, 512 KB, Assoc 8, LineSize 64 +----------**------ Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------ Instruction Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64 +----------**------ Unified Cache 7, Level 2, 512 KB, Assoc 8, LineSize 64 +------------**---- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---- Instruction Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64 +------------**---- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64 +------------****** Unified Cache 9, Level 3, 16 MB, Assoc 16, LineSize 64 +--------------**-- Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------**-- Instruction Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64 +--------------**-- 
Unified Cache 10, Level 2, 512 KB, Assoc 8, LineSize 64 +----------------** Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------** Instruction Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64 +----------------** Unified Cache 11, Level 2, 512 KB, Assoc 8, LineSize 64 + +Logical Processor to Group Map: +****************** Group 0 + + +Haven't been able to test if it performs as expected. Need to do that. + +Of course it would be great if QEMU was patched to recognize correct CCX alignment as I'm not sure if and what will be the penalty of this weird setup. + +Yep, I read the Reddit thread, had no idea this was possible. + +Still, both solutions are ugly workarounds and it would be nice to fix this properly. But at least I don't have to patch and compile QEMU on my own anymore. + +h-sieger, +Your XML gave me very significant performance gains. +Is there any way to do this with more than 24 assigned cores? + + +@sanjaybmd + +I'm glad to read that it worked for you. In fact, since I posted the XML I didn't have the time to do benchmarking, now my motherboard is dead and I have to wait for repair/replacement. + +Do you have any data to quantify the performance gain? + +As to the number of cores, you will notice that my 3900X has only 12 physical cores, that is 24 threads. Yet I assigned 32 vcpus in total. 8 of them are disabled. This is to align the vcpus to the actual CCX topology of 3 cores per CCX. + +QEMU thinks the cores per CCX should be a multiple of 2, e.g. 2, 4, etc. cores. So I assign 4 cores = 8 vcpus, and disable 2 vcpus to simulate the actual topology. + +If your CPU has more cores, you could scale it up. Be aware that the 3950X should not have this issue as it has 4 cores per CCX, if I remember correctly. + +Note: I took this idea from a Reddit post (see link somewhere above). 
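The layout described above (4 core slots per CCX, the last one disabled, each slot with 2 threads) follows a regular pattern, so the `<vcpus>` and `<vcpupin>` entries can be generated rather than typed by hand. This is a minimal sketch, not part of libvirt: `ccx_layout` and its parameters are hypothetical names, and `sibling_offset=12` assumes the 3900X host numbering shown by lscpu earlier in the thread (thread siblings are N and N+12). Check your own host numbering before using the output.

```python
def ccx_layout(ccx_count, cores_per_ccx=3, slots_per_ccx=4,
               threads=2, sibling_offset=12):
    """Return (vcpu_lines, vcpupin_lines) for the dummy-vCPU workaround.

    Each CCX gets `slots_per_ccx` guest core slots; only the first
    `cores_per_ccx` are enabled and pinned, the rest stay disabled.
    sibling_offset is an assumption about host CPU numbering.
    """
    vcpus, pins = [], []
    for ccx in range(ccx_count):
        for slot in range(slots_per_ccx):
            for thread in range(threads):
                vcpu_id = (ccx * slots_per_ccx + slot) * threads + thread
                enabled = "yes" if slot < cores_per_ccx else "no"
                # vcpu 0 cannot be hot-unplugged, as in the posted XML
                hotplug = "no" if vcpu_id == 0 else "yes"
                vcpus.append(
                    f'<vcpu id="{vcpu_id}" enabled="{enabled}" '
                    f'hotpluggable="{hotplug}"/>'
                )
                if enabled == "yes":
                    host_cpu = (ccx * cores_per_ccx + slot
                                + thread * sibling_offset)
                    pins.append(
                        f'<vcpupin vcpu="{vcpu_id}" cpuset="{host_cpu}"/>'
                    )
    return vcpus, pins


if __name__ == "__main__":
    vcpu_lines, pin_lines = ccx_layout(4)  # 3900X: 4 CCX x 3 cores
    print("\n".join(vcpu_lines))
    print("\n".join(pin_lines))
```

For 4 CCXs this produces 32 vCPU entries, 24 of them enabled, with the same pinning as the XML posted above.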
+ +h-sieger, +I did some testing with geekbench 5: + +baseline multicore score = 12733 +https://browser.geekbench.com/v5/cpu/3069626 + +score with <cache mode="passthrough"/> option = 12775 +https://browser.geekbench.com/v5/cpu/3069415 + +best score with your xml above = 16960 +https://browser.geekbench.com/v5/cpu/3066003 + +I'm running a 3960x and it is 3 cores per CCX so your xml above works well. I'm just now learning about all this so I'm still trying to figure out how to modify your xml to assign more cores. Anyway, I'm getting better performance out of my Windows 10 VM now assigning 24 vcpu as opposed to the 32 that I was assigning before! +By the way, I tried to email you directly because I'm not sure this is appropriate discussion for this bug report but I could not create an account on your website (captcha was malfunctioning). Hope you can fix that soon. + + +Sanjay, + +You can just increase the number of vcpus, such as: + +<vcpu placement="static" current="48">64</vcpu> + +then continue to define the vcpus: + + <vcpu id="32" enabled="yes" hotpluggable="yes"/> + <vcpu id="33" enabled="yes" hotpluggable="yes"/> + <vcpu id="34" enabled="yes" hotpluggable="yes"/> + <vcpu id="35" enabled="yes" hotpluggable="yes"/> + <vcpu id="36" enabled="yes" hotpluggable="yes"/> + <vcpu id="37" enabled="yes" hotpluggable="yes"/> + <vcpu id="38" enabled="no" hotpluggable="yes"/> + <vcpu id="39" enabled="no" hotpluggable="yes"/> + <vcpu id="40" enabled="yes" hotpluggable="yes"/> + <vcpu id="41" enabled="yes" hotpluggable="yes"/> + <vcpu id="42" enabled="yes" hotpluggable="yes"/> + <vcpu id="43" enabled="yes" hotpluggable="yes"/> + <vcpu id="44" enabled="yes" hotpluggable="yes"/> + <vcpu id="45" enabled="yes" hotpluggable="yes"/> + <vcpu id="46" enabled="no" hotpluggable="yes"/> + <vcpu id="47" enabled="no" hotpluggable="yes"/> + <vcpu id="48" enabled="yes" hotpluggable="yes"/> + <vcpu id="49" enabled="yes" hotpluggable="yes"/> + <vcpu id="50" enabled="yes" hotpluggable="yes"/> 
+ <vcpu id="51" enabled="yes" hotpluggable="yes"/> + <vcpu id="52" enabled="yes" hotpluggable="yes"/> + <vcpu id="53" enabled="yes" hotpluggable="yes"/> + <vcpu id="54" enabled="no" hotpluggable="yes"/> + <vcpu id="55" enabled="no" hotpluggable="yes"/> + <vcpu id="56" enabled="yes" hotpluggable="yes"/> + <vcpu id="57" enabled="yes" hotpluggable="yes"/> + <vcpu id="58" enabled="yes" hotpluggable="yes"/> + <vcpu id="59" enabled="yes" hotpluggable="yes"/> + <vcpu id="60" enabled="yes" hotpluggable="yes"/> + <vcpu id="61" enabled="yes" hotpluggable="yes"/> + <vcpu id="62" enabled="no" hotpluggable="yes"/> + <vcpu id="63" enabled="no" hotpluggable="yes"/> + +(6x enabled=yes, then 2x enabled=no.) + +You will get more vcpu ids than you have threads, but since you disable 16 out of 64, you will have 48 active. + +vcpupin should continue as follows: + + <vcpupin vcpu="32" cpuset="24"/> + <vcpupin vcpu="33" cpuset="36"/> + <vcpupin vcpu="34" cpuset="25"/> + <vcpupin vcpu="35" cpuset="37"/> + <vcpupin vcpu="36" cpuset="26"/> + <vcpupin vcpu="37" cpuset="38"/> + <vcpupin vcpu="40" cpuset="27"/> + <vcpupin vcpu="41" cpuset="39"/> + <vcpupin vcpu="42" cpuset="28"/> + <vcpupin vcpu="43" cpuset="40"/> + <vcpupin vcpu="44" cpuset="29"/> + <vcpupin vcpu="45" cpuset="41"/> + <vcpupin vcpu="48" cpuset="30"/> + <vcpupin vcpu="49" cpuset="42"/> + <vcpupin vcpu="50" cpuset="31"/> + <vcpupin vcpu="51" cpuset="43"/> + <vcpupin vcpu="52" cpuset="32"/> + <vcpupin vcpu="53" cpuset="44"/> + <vcpupin vcpu="56" cpuset="33"/> + <vcpupin vcpu="57" cpuset="45"/> + <vcpupin vcpu="58" cpuset="34"/> + <vcpupin vcpu="59" cpuset="46"/> + <vcpupin vcpu="60" cpuset="35"/> + <vcpupin vcpu="61" cpuset="47"/> + +This is if you pin all vcpus to the VM, which may not be the best thing to do. The maximum number of vcpus you can pin on a Threadripper 3960X are 48. + +The QEMU project is currently considering to move its bug tracking to +another system. 
For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1859359 b/results/classifier/zero-shot/105/semantic/1859359 new file mode 100644 index 000000000..2dd9ccba3 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1859359 @@ -0,0 +1,107 @@ +semantic: 0.887 +mistranslation: 0.870 +device: 0.863 +assembly: 0.861 +other: 0.861 +instruction: 0.859 +graphic: 0.856 +vnc: 0.809 +network: 0.805 +socket: 0.794 +KVM: 0.778 +boot: 0.769 + +xHCI and event ring handling + +I believe that the Event Ring handling in QEMU is not correct. For example, an Event Ring may have multiple segments. However, the code in xhci_write_event() (https://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd-xhci.c;hb=HEAD#l645), starting with line 668, seems to only support a single segment. + +Also, QEMU is sending a spurious interrupt after the Guest writes to the ERDP register due to the fact that the address written does not match the current index. This is because the index is incremented after sending the event. The xHCI specification states in section 5.5.2.3.3 "When software finishes processing an Event TRB, it will write the address of that Event TRB to the ERDP." + +Since xhci_write_event() has already incremented the index pointer (intr->er_ep_idx), the check at line 3098 (https://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd-xhci.c;hb=HEAD#l3090) no longer is valid. + +I have not studied QEMU's code enough yet to offer a patch. 
However, this should be a simple fix. + +intr->er_ep_idx++; +if (intr->er_ep_idx >= intr->er_table[intr->er_segment].er_size) { + intr->er_ep_idx = 0; + intr->er_segment++; + if (intr->er_segment >= intr->er_table_size) { + intr->er_segment = 0; + intr->er_pcs = !intr->er_pcs; + } +} + +Be sure to incorporate this new segment member into the above code (possibly as shown), change the lines at 665 to use the new segment member of the structure, and of course update the initialization portion of the event ring. + +As for the spurious interrupt at line 3101, a new member will need to be added to the code to keep track of the last inserted ED (TRB) in the Event Ring, and the check should then be made against this new member, not the newly incremented one. + +I have sent an email to the author listed at the top of the file as well; I am not sure if this is proper bug reporting etiquette or not. + +Thank you. + +I failed to note above that the HCSPARAMS2 register does indeed limit the count of segments in the Event Ring. I guess as long as you never change this value from one (1) you will be okay. + +However, the spurious interrupt stuff still stands as a bug. + +Thank you, +Ben + +Please note that the current code reports zero (0) + +https://git.qemu.org/?p=qemu.git;a=blob;f=hw/usb/hcd-xhci.c#l2737 + +Bits 7:4 are this limit and the current code has these bits as zero. + + +My apologies. I forgot that it was 2^ERSTMAX. I really need to get some sleep :-) + +qemu behavior matches linux guest driver expectations on erdp writes, I don't think we have a bug there. + +And, yes, qemu doesn't support multiple segments and correctly says so in the capabilities registers. + +The xHCI specification states that after processing the Event TRB, software is to write the address of that TRB to the xHC_INTERRUPTER_DEQUEUE. 
QEMU currently checks the value written and compares it to its own current pointer, which has by then been incremented to the next available TRB and therefore does not match. When it finds that the two do not match, it sends another interrupt. + +On receiving this interrupt, software will see it as a mismatch in cycle bits and simply write the address of the last processed Event TRB to the xHC_INTERRUPTER_DEQUEUE and return again. QEMU will then check the address again, find a mismatch again, and fire the interrupt again. This causes an infinite loop and will halt the USB. + +I do believe this to be in error. + +However, it is up to the majority, which seems to be a Linux based majority, so if it works on Linux and it isn't broken, why fix it. + +As for the multiple segments in the Event Ring, this was more of a request than a bug report. Sorry for the misrepresentation on that part. + +Thank you, +Ben + +The part you're missing is in section 4.9.4. + +> System software shall advance the Event Ring Dequeue Pointer by writing the address of the last +processed Event TRB to the Event Ring Dequeue Pointer (ERDP) register. Note, the “last processed +Event TRB” includes the case where software detects a Cycle bit mismatch when evaluating an Event +TRB and the ring is empty. + +So the bug is in your code, because it's supposed to check the next TRB, find the cycle bit mismatch, and *that* qualifies as "processing" it, and then it should write *that* address to the ERDP, which is going to equal the actual last valid TRB's address + 1 (modulo wraparound), which is exactly what qemu expects, and what Linux does too. + +Hi Hector, guys, + +I think I have found my/the error: + +xHCI, version 1.0, Page 136: +"Note: The detection of a Cycle bit mismatch in an Event TRB processed + by software indicates the location of the xHC Event Ring Enqueue Pointer + and that the Event Ring is empty. 
Software shall write the ERDP with the + address of this TRB to indicate that it has processed all Events in the + ring." + +It does not state to advance the Consumer's internal Dequeue pointer. +Just the register is mentioned. + +This is my error. I thought that it implied to advance the Consumer's +internal Dequeue Pointer as well. (Implied being the big word here) + +Sorry for the bother. It was my error. It took me a bit of (re)reading +and a little more work to find that it was/is indeed my error. + +Sorry and thank you for your time, +Ben + + diff --git a/results/classifier/zero-shot/105/semantic/186 b/results/classifier/zero-shot/105/semantic/186 new file mode 100644 index 000000000..effe78790 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/186 @@ -0,0 +1,14 @@ +semantic: 0.235 +device: 0.204 +boot: 0.200 +KVM: 0.159 +socket: 0.151 +vnc: 0.111 +mistranslation: 0.087 +network: 0.083 +other: 0.051 +instruction: 0.034 +graphic: 0.024 +assembly: 0.010 + +Audit consistent option usage in documentation diff --git a/results/classifier/zero-shot/105/semantic/1860575 b/results/classifier/zero-shot/105/semantic/1860575 new file mode 100644 index 000000000..b22a5da56 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1860575 @@ -0,0 +1,73 @@ +semantic: 0.642 +device: 0.496 +other: 0.396 +instruction: 0.396 +mistranslation: 0.382 +network: 0.334 +socket: 0.306 +vnc: 0.244 +boot: 0.230 +assembly: 0.192 +KVM: 0.175 +graphic: 0.105 + +qemu64 CPU model is incorrect + +At the moment the "qemu64" CPU is defined as follows: + +``` + .vendor = CPUID_VENDOR_AMD, + .family = 6, + .model = 6, + .stepping = 3, +``` + +According to Wikipedia [1] this means the CPU is defined as part of the +K7 family while the AMD64 ISA was only introduced with the K8 series! + +This causes some software such as LLVM to notice the problem (32-bit cpu +with 64-bit capability reported in the cpuid flag) and produce various +error messages. 
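For readers unfamiliar with how the family/model/stepping triple appears to a guest, the sketch below decodes a CPUID leaf 1 EAX signature using the AMD convention (extended family and extended model are only added when the base family field is 0xF). It is an illustration only: `decode_cpuid_signature` is a hypothetical helper, not QEMU code, and the signature values are computed from the family/model/stepping numbers quoted above rather than read from a running guest.

```python
def decode_cpuid_signature(eax):
    """Decode CPUID leaf 1 EAX into (family, model, stepping).

    Uses the AMD rule: extended fields only count when the
    base family field equals 0xF.
    """
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return family, model, stepping


if __name__ == "__main__":
    # family 6, model 6, stepping 3 (the current qemu64 definition)
    print(decode_cpuid_signature(0x00000663))
    # the same model/stepping with family 15, as proposed in the report
    print(decode_cpuid_signature(0x00000F63))
```

Software that maps family 6 on an AMD vendor string to K7-era parts would see the first signature as a 32-bit-only CPU, which is the inconsistency the report describes.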
+ +The simple solution would be to upgrade this definition to use the Sledgehammer +family (15) instead. + +[1] https://en.wikipedia.org/wiki/List_of_AMD_CPU_microarchitectures + +Your analysis of the problem with family makes sense & we do have mechanism to fix this in QEMU while keeping back compat for existing deployments. + +I'm curious as to the actual errors LLVM reports ? + +FWIW, even though qemu64 is the default CPU, practically everyone would be better off choosing one of the other CPU models explicitly to better suit their desired use case. There is some guidance here https://qemu.weilnetz.de/doc/qemu-doc.html#cpu_005fmodels + + +The error message is a rather cryptic "LLVM ERROR: 64-bit code requested on a subtarget +that doesn't support it!" as it knows Athlon CPUs don't support the AMD64 ISA. + +I will relay the tip to the people managing the VMs, I guess this problem went unnoticed +for so long because there are not many `qemu64` users. + +I'm available to test a patch whenever it becomes available, I didn't directly send one +because I was afraid of breaking the backward compatibility and some (many?) VMs. + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/191 + + diff --git a/results/classifier/zero-shot/105/semantic/1865252 b/results/classifier/zero-shot/105/semantic/1865252 new file mode 100644 index 000000000..1b43f0abf --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1865252 @@ -0,0 +1,49 @@ +semantic: 0.951 +mistranslation: 0.897 +other: 0.876 +boot: 0.867 +device: 0.846 +network: 0.823 +vnc: 0.816 +socket: 0.809 +graphic: 0.787 +instruction: 0.756 +KVM: 0.712 +assembly: 0.663 + +QEMU Windows Portable Version (with HAXM accelerator and QEMU GUI) + +Please consider providing a QEMU Windows portable [1] [2] [3] version on official qemu.org. + +Reasons: + +* This would improve usability, i.e. the out-of-the-box experience of lay (non-technical) users. +* Linux distributions could add the QEMU Windows portable to their installer / live ISO images (and the DVD's autorun.inf). Users who are still running on the Windows platform would have an easy path to try out a Linux distribution by running it inside QEMU. I've seen that in many distributions some years ago. I was running Windows: just open the DVD drive in Windows Explorer, double click, and QEMU (shipped with the ISO) booted the ISO. + +Ideally the QEMU Windows portable version would be bundled with: + +* the [QEMU HAXM accelerator] by default. Related ticket: [5] +* a QEMU GUI by default. Related ticket: [6] + + +[1] When I say "Windows Portable" I mean "USB portable". [4] + +[2] A compressed archive (zip or so) which after extraction can be executed without further installation / setup required. As far as I know [https://portableapps.com portableapps.com] is the most popular project of that kind. + +[3] QEMU might already be portable or mostly portable. 
See: + +* https://portableapps.com/search/node/QEMU +* https://www.google.com/search?hl=en&q=site%3Aportableapps.com%20QEMU%20portable +* https://www.portablefreeware.com/?id=640 +* https://willhaley.com/blog/simple-portable-linux-qemu-vm-usb/ + +But I am not sure the above projects are still maintained. It would certainly be better if the official qemu.org provided a QEMU Windows portable version. + +[4] Or more generally "can be run on any external storage medium on any Windows [10] computer". + +[5] https://bugs.launchpad.net/qemu/+bug/1864955 + +[6] https://bugs.launchpad.net/qemu/+bug/1865248 + +QEMU, like most open source projects, relies on contributors who have motivation, skills and available time to work on implementing particular features. They naturally tend to focus on features that result in the greatest benefit to their own use cases. I'm sorry, but as far as I know there is currently nobody working on such a topic, and opening a ticket like this won't make it happen unless some new contributor steps up to do the job. Thus I'm closing this ticket now. Feel free to re-open if you know someone who could contribute this feature. + diff --git a/results/classifier/zero-shot/105/semantic/1867519 b/results/classifier/zero-shot/105/semantic/1867519 new file mode 100644 index 000000000..6a4777b51 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1867519 @@ -0,0 +1,263 @@ +semantic: 0.842 +socket: 0.830 +instruction: 0.827 +other: 0.809 +assembly: 0.801 +device: 0.791 +vnc: 0.787 +graphic: 0.776 +network: 0.773 +mistranslation: 0.766 +KVM: 0.721 +boot: 0.662 + +qemu 4.2 segfaults on VF detach + +After updating Ubuntu 20.04 to the Beta version, we get the following error and the virtual machine gets stuck when detaching PCI devices using the virsh command: + +Error: +error: Failed to detach device from /tmp/vf_interface_attached.xml +error: internal error: End of file from qemu monitor + +Steps to reproduce: + 1. 
create a VM over Ubuntu 20.04 (5.4.0-14-generic) + 2. attach a PCI device to this VM (Mellanox VF for example) + 3. try detaching the PCI device using the virsh command: + a. create a pci interface xml file: + + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address type='pci' domain='0x0000' bus='0x11' slot='0x00' function='0x2' /> + </source> + </hostdev> + + b. #virsh detach-device <VM-Domain-name> <pci interface xml file> + + + +- Ubuntu release: + Description: Ubuntu Focal Fossa (development branch) + Release: 20.04 + +- Package ver: + libvirt0: + Installed: 6.0.0-0ubuntu3 + Candidate: 6.0.0-0ubuntu5 + Version table: + 6.0.0-0ubuntu5 500 + 500 http://il.archive.ubuntu.com/ubuntu focal/main amd64 Packages + *** 6.0.0-0ubuntu3 100 + 100 /var/lib/dpkg/status + +- What you expected to happen: + PCI device detached without any errors. + +- What happened instead: + getting the errors above and the VM gets stuck + +additional info: +after downgrading the libvirt0 package and all the dependent packages to 5.4, the previous version, the issue seems to disappear + +Hi Mohammad, +I'll try to recreate your issue, but while that goes on I would already like to ask you to capture and report the following while you try to detach the device: + +1. host dmesg -w +2. journalctl -f +3. /var/log/libvirt/qemu/<guestname>.log + +Please report all these in case there is something that helps to pinpoint the root cause. + +Could you also please try more devices to find out which kind this issue is restricted to? +1. try other VFs on the same device +2. try VFs on a different device (if you have any) +3. try non-VF but full PCI passthrough + +Does the issue occur on your system for all of these? + +For the time being I only found a system with a full device to pass through, not a VF. +That worked fine for me, the guest gets the dev and can load its drivers. 
+[ 297.340525] mlx5_core 0000:00:08.0: Port module event: module 0, Cable unplugged +[ 297.361111] mlx5_core 0000:00:08.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0) +[ 297.572313] mlx5_core 0000:00:08.0 ens8: renamed from eth0 + +But since that was "only" PCI-passthrough and not yet VFs I'm looking forward to your answers on your case. + +One more thing you could track and report is the guest's "dmesg -w" while trying to attach to see if there is anything appearing there or aborting much earlier. + +I got VFs enabled now, attach works fine as well. + +But I can confirm that detach breaks it. + +XML used for the device: + <interface type='hostdev' managed='yes'> + <driver name='vfio'/> + <mac address='52:54:00:c3:0e:32'/> + <source> + <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x2'/> + </source> + </interface> + +$ virsh detach-device focal-pttest VF-pass-through.xml +error: Failed to detach device from VF-pass-through.xml +error: internal error: End of file from qemu monitor + +Logs: +1. Guest dmesg (doesn't exist as it dies immediately). + +2. qemu log: +2020-03-18 11:02:18.221+0000: shutting down, reason=crashed + +This one is interesting, Host dmesg: +[ 5819.223023] CPU 0/KVM[2763]: segfault at 0 ip 000055d37b4b245d sp 00007f2f5fffe188 error 6 in qemu-system-x86_64[55d37b008000+529000] +[ 5819.223030] Code: 08 48 89 50 10 48 89 37 48 89 7e 10 c3 f3 0f 1e fa 48 8b 47 08 48 8b 57 10 48 85 c0 74 0c 48 89 50 10 48 8b 57 10 48 8b 47 08 <48> 89 02 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e + +Afterwards I see the device come back to the host, but the segfault is the reason it died. + +I re-ran the above, full PCI passthrough still attaches/detaches fine. + +VFs attach fine +VFs break on detach + +I've thrown qemu into GDB and this is the backtrace +Thread 4 "CPU 0/KVM" received signal SIGSEGV, Segmentation fault. 
+[Switching to Thread 0x7f82f0e31700 (LWP 3998)] +0x000055d2f322d45d in notifier_remove (notifier=notifier@entry=0x55d2f40c5078) at ./util/notify.c:31 +31 QLIST_REMOVE(notifier, node); +(gdb) bt +#0 0x000055d2f322d45d in notifier_remove (notifier=notifier@entry=0x55d2f40c5078) at ./util/notify.c:31 +#1 0x000055d2f2df8df9 in kvm_irqchip_remove_change_notifier (n=n@entry=0x55d2f40c5078) at ./accel/kvm/kvm-all.c:1409 +#2 0x000055d2f2e56989 in vfio_exitfn (pdev=<optimized out>) at ./hw/vfio/pci.c:3079 +#3 0x000055d2f3025c1b in pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>) at ./hw/pci/pci.c:1131 +#4 0x000055d2f2f8c6e2 in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x0) at ./hw/core/qdev.c:932 +#5 0x000055d2f312449b in property_set_bool (obj=0x55d2f40c4430, v=<optimized out>, name=<optimized out>, opaque=0x55d2f4083ee0, errp=0x0) at ./qom/object.c:2078 +#6 0x000055d2f3128c84 in object_property_set_qobject (obj=obj@entry=0x55d2f40c4430, value=value@entry=0x7f82dc2f7130, name=name@entry=0x55d2f330d85d "realized", errp=errp@entry=0x0) + at ./qom/qom-qobject.c:26 +#7 0x000055d2f31264ba in object_property_set_bool (obj=0x55d2f40c4430, value=<optimized out>, name=0x55d2f330d85d "realized", errp=0x0) at ./qom/object.c:1336 +#8 0x000055d2f2f56bca in acpi_pcihp_device_unplug_cb (hotplug_dev=<optimized out>, s=<optimized out>, dev=0x55d2f40c4430, errp=<optimized out>) at ./hw/acpi/pcihp.c:269 +#9 0x000055d2f2f56253 in acpi_pcihp_eject_slot (s=<optimized out>, bsel=<optimized out>, slots=slots@entry=256) at ./hw/acpi/pcihp.c:170 +#10 0x000055d2f2f56383 in pci_write (size=<optimized out>, data=256, addr=8, opaque=<optimized out>) at ./hw/acpi/pcihp.c:341 +#11 pci_write (opaque=<optimized out>, addr=<optimized out>, data=256, size=<optimized out>) at ./hw/acpi/pcihp.c:332 +#12 0x000055d2f2de9cfb in memory_region_write_accessor (mr=mr@entry=0x55d2f4780970, addr=8, value=value@entry=0x7f82f0e304f8, size=size@entry=4, shift=<optimized out>, + 
mask=mask@entry=4294967295, attrs=...) at ./memory.c:483 +#13 0x000055d2f2de79ee in access_with_adjusted_size (addr=addr@entry=8, value=value@entry=0x7f82f0e304f8, size=size@entry=4, access_size_min=<optimized out>, + access_size_max=<optimized out>, access_fn=access_fn@entry=0x55d2f2de9bd0 <memory_region_write_accessor>, mr=0x55d2f4780970, attrs=...) at ./memory.c:544 +#14 0x000055d2f2debfc3 in memory_region_dispatch_write (mr=mr@entry=0x55d2f4780970, addr=8, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ./memory.c:1475 +#15 0x000055d2f2d96a30 in flatview_write_continue (fv=fv@entry=0x7f82dc14bbc0, addr=addr@entry=44552, attrs=..., buf=buf@entry=0x7f82f17e9000 "", len=len@entry=4, addr1=<optimized out>, + l=<optimized out>, mr=0x55d2f4780970) at ./include/qemu/host-utils.h:164 +#16 0x000055d2f2d96c46 in flatview_write (fv=0x7f82dc14bbc0, addr=44552, attrs=..., buf=0x7f82f17e9000 "", len=4) at ./exec.c:3169 +#17 0x000055d2f2d9b01f in address_space_write (as=as@entry=0x55d2f3956960 <address_space_io>, addr=addr@entry=44552, attrs=..., buf=<optimized out>, len=len@entry=4) at ./exec.c:3259 +#18 0x000055d2f2d9b09e in address_space_rw (as=as@entry=0x55d2f3956960 <address_space_io>, addr=addr@entry=44552, attrs=..., attrs@entry=..., buf=<optimized out>, len=len@entry=4, + is_write=is_write@entry=true) at ./exec.c:3269 +#19 0x000055d2f2dfc94f in kvm_handle_io (count=1, size=4, direction=<optimized out>, data=<optimized out>, attrs=..., port=44552) at ./accel/kvm/kvm-all.c:2104 +#20 kvm_cpu_exec (cpu=cpu@entry=0x55d2f3dc9090) at ./accel/kvm/kvm-all.c:2350 +#21 0x000055d2f2dde53e in qemu_kvm_cpu_thread_fn (arg=0x55d2f3dc9090) at ./cpus.c:1318 +#22 qemu_kvm_cpu_thread_fn (arg=arg@entry=0x55d2f3dc9090) at ./cpus.c:1290 +#23 0x000055d2f321fe13 in qemu_thread_start (args=<optimized out>) at ./util/qemu-thread-posix.c:519 +#24 0x00007f82f4290609 in start_thread (arg=<optimized out>) at pthread_create.c:477 +#25 0x00007f82f41b7153 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95 + +I changed the bug task to Qemu (Ubuntu) as this isn't a libvirt error. + +I also added an upstream qemu task in case this is a known issue for the developers there. Someone might be able to point us at a known discussion/fix. + +The backtrace I added in the last comment should help to identify known cases. + +Hi Christian, + +Yes, that is exactly what we see in our tests, +so are the logs that you asked for in comment#1 still needed? +Also, if you fix it, can you please provide us a link to a package or even a workaround until the issue is resolved, since this issue blocks our QA from testing ASAP over Focal. + +At the breaking function we have: + +29 void notifier_remove(Notifier *notifier) +30 { +31 QLIST_REMOVE(notifier, node); +32 } + + + +(gdb) p notifier +$1 = (Notifier *) 0x55d2f40c5078 +(gdb) p *notifier +$2 = {notify = 0x0, node = {le_next = 0x0, le_prev = 0x0}} + +And since QLIST_REMOVE is defined as: +140 #define QLIST_REMOVE(elm, field) do { \ +141 if ((elm)->field.le_next != NULL) \ +142 (elm)->field.le_next->field.le_prev = \ +143 (elm)->field.le_prev; \ +144 *(elm)->field.le_prev = (elm)->field.le_next; \ +145 } while (/*CONSTCOND*/0) + +(gdb) p (notifier)->node.le_next +$5 = (struct Notifier *) 0x0 +(gdb) p &(notifier->node) +$11 = (struct {...} *) 0x55d2f40c5080 + +There actually is a != NULL check; it might have changed on the fly. +I need to look at it more thoroughly, but it should be enough to recognize a known issue. + +Might be https://git.qemu.org/?p=qemu.git;a=commit;h=0446f8121723b134ca1d1ed0b73e96d4a0a8689d + +This would also match the backtrace path. + +That commit you mention is confirmed to solve a bug reported against Fedora with almost the same stack trace you see here. + +Hi Christian, + +Yes, +it seems that the patch you mentioned fixes the issue; I rebuilt qemu with the patch and it works fine now. +Thank you guys. 
+ + +Before a release I regularly pull in fixes that were posted for qemu-stable. +This is one of them, I'll again do such a build and retest this issue with it. + +I identified and backported (only one needed modification) 33 patches. +But as usual there might be some context needed on top - I have built that overnight in [1] + +Testing that on my reproducer + +Attach-host: +[84652.671123] vfio-pci 0000:08:00.2: enabling device (0000 -> 0002) + +Attach-guest: +[ 45.199920] pci 0000:00:08.0: [15b3:1016] type 00 class 0x020000 +[ 45.200374] pci 0000:00:08.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit pref] +[ 45.201358] pci 0000:00:08.0: enabling Extended Tags +[ 45.202726] pci 0000:00:08.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0000:00:08.0 (capable of 63.008 Gb/s with 8 GT/s x8 link) +[ 45.208316] pci 0000:00:08.0: BAR 0: assigned [mem 0x100000000-0x1000fffff 64bit pref] +[ 45.256566] mlx5_core 0000:00:08.0: enabling device (0000 -> 0002) +[ 45.262103] mlx5_core 0000:00:08.0: firmware version: 14.27.1016 +[ 45.544010] mlx5_core 0000:00:08.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0) +[ 45.710123] mlx5_core 0000:00:08.0 ens8: renamed from eth0 +[ 60.992547] random: crng init done +[ 60.992552] random: 3 urandom warning(s) missed due to ratelimiting + +Detach-host: +[84926.767411] mlx5_core 0000:08:00.2: enabling device (0000 -> 0002) +[84926.767514] mlx5_core 0000:08:00.2: firmware version: 14.27.1016 +[84927.036146] mlx5_core 0000:08:00.2: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0) +[84927.208523] mlx5_core 0000:08:00.2 ens1v1: renamed from eth0 + +Detach-guest: +<nothing> + + +So yes, these changes fix the issue here (and a bunch of others). +I'll open up an MP to get these changes into Ubuntu 20.04. 
+ +[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3981 + +This bug was fixed in the package qemu - 1:4.2-3ubuntu3 + +--------------- +qemu (1:4.2-3ubuntu3) focal; urgency=medium + + * d/p/stable/lp-1867519-*: Stabilize qemu 4.2 with upstream + patches @qemu-stable (LP: #1867519) + + -- Christian Ehrhardt <email address hidden> Wed, 18 Mar 2020 13:57:57 +0100 + diff --git a/results/classifier/zero-shot/105/semantic/1868055 b/results/classifier/zero-shot/105/semantic/1868055 new file mode 100644 index 000000000..26c480b35 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1868055 @@ -0,0 +1,149 @@ +semantic: 0.689 +instruction: 0.664 +other: 0.595 +mistranslation: 0.563 +device: 0.562 +graphic: 0.530 +assembly: 0.518 +network: 0.507 +vnc: 0.487 +boot: 0.476 +KVM: 0.426 +socket: 0.415 + +cannot run golang app with docker, version 17.09.1-ce, disabling core 0 and qemu-arm, version 2.7. + +Hello! +I found that sometimes a simple Go application does not work. +I am using docker + qemu-arm + go (for armv7l). + +The version info is below. 
+ +root@VDBS1535:~# docker -v +Docker version 17.09.1-ce, build 19e2cf6 + +bash-3.2# qemu-arm --version +qemu-arm version 2.7.0, Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers + +$ go version +go version go1.12.6 linux/arm +$ go env +GOARCH="arm" +GOBIN="" +GOCACHE="/home/quickbuild/.cache/go-build" +GOEXE="" +GOFLAGS="" +GOHOSTARCH="arm" +GOHOSTOS="linux" +GOOS="linux" +GOPATH="/home/quickbuild/go" +GOPROXY="" +GORACE="" +GOROOT="/usr/lib/golang" +GOTMPDIR="" +GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_arm" +GCCGO="gccgo" +GOARM="7" +CC="gcc" +CXX="g++" +CGO_ENABLED="1" +GOMOD="" +CGO_CFLAGS="-g -O2" +CGO_CPPFLAGS="" +CGO_CXXFLAGS="-g -O2" +CGO_FFLAGS="-g -O2" +CGO_LDFLAGS="-g -O2" +PKG_CONFIG="pkg-config" +GOGCCFLAGS="-fPIC -marm -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build242285369=/tmp/go-build -gno-record-gcc-switches" + +This issue occurs only when I disable core 0 using the command below. +Please check the "--cpuset-cpus=1-55" option. + +sudo docker run --privileged -d -i -t --cpuset-cpus=1-55 --mount type=bind,source="/home/dw83kim/mnt",destination="/mnt" --network host --name="ubuntu_core1" ubuntu:xenial-20200212 + + +This is what I have tested in the environment above. + +package main +func main(){ + for i:=0; i<1000; i++ { + println("Hello world") + } +} + +This is one of the error logs I have faced; it happens sometimes, not always. + +bash-3.2# go run test.go +fatal error: schedule: holding locks +panic during panic +SIGILL: illegal instruction +PC=0xc9ec4c m=3 sigcode=2 + +goroutine 122 [runnable]: +qemu: uncaught target signal 11 (Segmentation fault) - core dumped +Segmentation fault (core dumped) +bash-3.2# + +Please check it. +Thanks in advance. + +This is a known and fixed bug in QEMU. Please try a more recent version than 2.7 (e.g. 4.2, which is the most recent release). + + +Could you retest with the latest version (4.2.0) of QEMU? 
+ +LP:1696773 is the old bug that I think is probably the cause here, though 2.7 is old enough it has a bunch of other linux-user race condition bugs that we've since fixed. + + +Hello! Peter and Laurent, +Thanks for your kind & rapid reply. + +It took a long time to merge the patch Peter mentioned. +After applying the patch the problem is gone but I found a new issue. + +When I ran the test for the first time after creating a new docker container, it took much longer. + + +bash-3.2# time go run test.go +Hello world + +real 5m3.516s +user 5m48.696s +sys 13m32.600s +bash-3.2# time go run test.go +Hello world + +real 0m1.784s +user 0m2.339s +sys 0m1.742s +bash-3.2# time go run test.go +Hello world + +real 0m1.881s +user 0m2.302s +sys 0m1.926s +bash-3.2# pwd + +I believe that 5 min for just printing "Hello world" is not your expectation. + +Is this also a known issue? +Please check it. + +Thanks. + + + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting older bugs to "Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/1868527 b/results/classifier/zero-shot/105/semantic/1868527 new file mode 100644 index 000000000..8c884ad90 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1868527 @@ -0,0 +1,44 @@ +semantic: 0.608 +graphic: 0.596 +device: 0.527 +instruction: 0.510 +other: 0.411 +mistranslation: 0.403 +assembly: 0.296 +vnc: 0.262 +socket: 0.207 +network: 0.174 +boot: 0.152 +KVM: 0.086 + +alignment may overlap the TLB flags + +Hi, +In QEMU-4.2.0, or git-9b26a610936deaf436af9b7e39e4b7f0a35e4409, alignment may overlap the TLB flags. +For example, the alignment: MO_ALIGN_32, + MO_ALIGN_32 = 5 << MO_ASHIFT, +and the TLB flag: TLB_DISCARD_WRITE +#define TLB_DISCARD_WRITE (1 << (TARGET_PAGE_BITS_MIN - 6)) + +then, in the function "get_alignment_bits", the assert may fail: + +#if defined(CONFIG_SOFTMMU) + /* The requested alignment cannot overlap the TLB flags. */ + tcg_debug_assert((TLB_FLAGS_MASK & ((1 << a) - 1)) == 0); +#endif + +However, the alignment of MO_ALIGN_32 is not used for now, so the assert cannot be triggered in the current version. Anyway it seems like a potential conflict. + +That is of course completely dependent on the target page size. So, yes, a target with a very small page size cannot use large alignments. The assert makes sure of that. + +Is this comment simply by inspection, or did you have an actual bug to report? + +This is only by inspection so far. +For ARM SMMU simulation, TARGET_PAGE_BITS_MIN is 10. All low bits of the TLB virtual address are used up by TLB flags and alignment flags. It's a little crowded. +/* + * ARMv7 and later CPUs have 4K pages minimum, but ARMv5 and v6 + * have to support 1K tiny pages. 
+ */ +# define TARGET_PAGE_BITS_VARY +# define TARGET_PAGE_BITS_MIN 10 + diff --git a/results/classifier/zero-shot/105/semantic/1870098 b/results/classifier/zero-shot/105/semantic/1870098 new file mode 100644 index 000000000..b75253ccd --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1870098 @@ -0,0 +1,54 @@ +semantic: 0.892 +mistranslation: 0.854 +other: 0.788 +device: 0.743 +graphic: 0.719 +vnc: 0.629 +socket: 0.626 +network: 0.596 +instruction: 0.581 +boot: 0.541 +KVM: 0.466 +assembly: 0.291 + +[block/vpc] dynamic disk header: off-by-one error for "num_bat_entries" + +In current qemu versions (observed in 5.0.0-rc1 as well as 2833ad487cfff7dc33703e4731b75facde1c561e), disk headers for dynamic VPCs are written with an incorrect "block allocation table entries" value. + +https://www.microsoft.com/en-us/download/details.aspx?id=23850 (the corresponding spec) states that: + +"Max Table Entries +This field holds the maximum entries present in the BAT. This should be equal to the number of blocks in the disk (that is, the disk size divided by the block size)." + +Inside the qemu code, the value is "disk size divided by the block size *plus one*". + +Calculating "num_bat_entries" as "total_sectors/(block_size / 512)" *should* fix the issue. + +Is there any actual bug resulting from this that you're observing? As I read the spec, having a longer BAT is merely unconventional, not strictly wrong. So if another application fails to deal with such images, it's probably a bug in that application. + +Of course, I can't see a reason for making the BAT longer than necessary either. We do, however, need to round up if the disk size is not a multiple of the image block size. So I think what it really should be is: + +num_bat_entries = DIV_ROUND_UP(total_sectors, block_size / 512) + +If you agree, please let me know if I should submit a patch or if you would like to do that yourself. 
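As a rough illustration of the two formulas discussed in this report, the sketch below compares "disk size divided by block size, plus one" with the proposed DIV_ROUND_UP variant. It is a standalone Python model, not the QEMU code: the function names are hypothetical, and the 2 MiB block size and the 1 GiB disk are assumptions chosen only to make the arithmetic visible.

```python
# Sketch only: compares the two BAT-size formulas discussed in the report.
# Function names, the 2 MiB block size and the disk sizes are illustrative
# assumptions, not values taken from the QEMU source.

SECTOR_SIZE = 512


def bat_entries_plus_one(total_sectors, block_size):
    """Disk size divided by block size, plus one (the behaviour reported)."""
    sectors_per_block = block_size // SECTOR_SIZE
    return total_sectors // sectors_per_block + 1


def bat_entries_round_up(total_sectors, block_size):
    """DIV_ROUND_UP(total_sectors, block_size / 512), as proposed in the reply."""
    sectors_per_block = block_size // SECTOR_SIZE
    return (total_sectors + sectors_per_block - 1) // sectors_per_block


BLOCK_SIZE = 2 * 1024 * 1024          # assumed block size for illustration
aligned = (1024 ** 3) // SECTOR_SIZE  # 1 GiB disk, an exact multiple of the block size
unaligned = aligned + 1               # one extra sector, no longer a multiple

# Aligned disk: the plus-one formula yields one entry more than needed.
print(bat_entries_plus_one(aligned, BLOCK_SIZE), bat_entries_round_up(aligned, BLOCK_SIZE))
# Unaligned disk: both formulas agree, because rounding up adds the extra block.
print(bat_entries_plus_one(unaligned, BLOCK_SIZE), bat_entries_round_up(unaligned, BLOCK_SIZE))
```

Under these assumptions the two formulas differ only when the disk size is an exact multiple of the block size, which is the exactly-aligned case the reporter describes for Azure uploads.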
(See https://wiki.qemu.org/Contribute/SubmitAPatch) + +Ah, sorry, I failed to mention this: Due to this bug, qemu currently cannot create VHDs that are suitable for upload to Azure (because Azure expects disks that are aligned exactly to 1MB). + +If it would not be too much trouble for you to submit the patch, I would appreciate that a lot. I've never submitted a patch to qemu and the contribution doc reads somewhat complex, so I'm a bit concerned about dragging a very small patch out longer than strictly necessary. + +Thanks a lot! + +As I don't have your email address, I could not CC you on the patch email. Can you please verify if the following patch on the mailing list fixes your problem? + +https://lists.gnu.org/archive/html/qemu-block/2020-04/msg00086.html + +Thanks a lot for looking into it! + +Yes, we were able to verify that this patch does fix the problem. + +Many thanks! + +Fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=3f6de653b946 + + diff --git a/results/classifier/zero-shot/105/semantic/1872847 b/results/classifier/zero-shot/105/semantic/1872847 new file mode 100644 index 000000000..17906a153 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1872847 @@ -0,0 +1,62 @@ +semantic: 0.680 +graphic: 0.635 +other: 0.569 +device: 0.485 +instruction: 0.427 +vnc: 0.322 +network: 0.283 +socket: 0.263 +boot: 0.256 +mistranslation: 0.256 +KVM: 0.098 +assembly: 0.076 + +qemu-alpha linux-user breaks python3.6 + +Running on Gentoo Linux in a chroot environment: +# python3 -c 'import selectors; selectors.DefaultSelector()' +Traceback (most recent call last): + File "<string>", line 1, in <module> + File "/usr/lib/python3.7/selectors.py", line 349, in __init__ + self._selector = self._selector_cls() +OSError: [Errno 22] Invalid argument + +However, on real hardware, with the same binaries there is no exception. 
+ +This impacts the whole python3-based Gentoo ebuild system (package management), and renders building in a chroot environment with linux-user alpha emulation more or less useless. + +The systems used: +# qemu-alpha --version +qemu-alpha version 4.2.0 +Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers +# uname -a +Linux blackbird 5.4.28-gentoo-blackbird-06 #2 SMP Sat Apr 4 13:13:10 CEST 2020 x86_64 AMD Ryzen 5 3600 6-Core Processor AuthenticAMD GNU/Linux +(chroot)# python3 --version +Python 3.7.7 + +Hi, + +do you know if it works with a previous version of qemu? +Do you know if it works with qemu built from the git repo? + +I know that it has been broken for as long as I have used it for alpha emulation, since 2017-2018. However it worked with python2.7 before. But python 2.7 reached end of life support, and I HAVE TO use 3.6 or 3.7, so this has become a pain now. I will try the git version, but have no high hopes... + +Tried the git version of qemu-alpha as well, and I can confirm it gives the same error. + +For additional information, none of these has this bug (neither in 4.2.0 nor in git): +- qemu-mips64 +- qemu-arm +- qemu-aarch64 + + +Related Gentoo bug: https://bugs.gentoo.org/717548 + +Possible fix proposed at https://lists.nongnu.org/archive/html/qemu-devel/2020-04/msg02545.html + +Tested the proposed patch from Sergei Trofimovich, and it solves the problem while not breaking the other archs I use (mips64,arm,aarch64 also tested). + +Thank you! 
+ +386d38656889 ("linux-user/syscall.c: add target-to-host mapping for epoll_create1()") + + diff --git a/results/classifier/zero-shot/105/semantic/1875702 b/results/classifier/zero-shot/105/semantic/1875702 new file mode 100644 index 000000000..b9de893a5 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1875702 @@ -0,0 +1,44 @@ +semantic: 0.439 +device: 0.425 +socket: 0.400 +network: 0.387 +other: 0.338 +instruction: 0.258 +vnc: 0.162 +mistranslation: 0.134 +boot: 0.131 +graphic: 0.088 +KVM: 0.065 +assembly: 0.047 + +madvise reports success, but doesn't implement WIPEONFORK. + +The implementation of madvise (linux-user/syscall.c:11331, tag v5.0.0-rc4) always returns zero (i.e. success). However, an application requesting (at least) MADV_WIPEONFORK may need to know whether the call was actually successful. If not (because the kernel doesn't support WIPEONFORK) then it will need to take other measures to provide fork-safety (such as drawing entropy from the kernel in every case). But, if the application believes that WIPEONFORK is supported (because madvise returned zero), but it actually isn't (as in qemu), then it may forego those protections on the assumption that WIPEONFORK will provide fork-safety. + +Roughly, the comment in qemu that says "This is a hint, so ignoring and returning success is ok." is no longer accurate in the presence of MADV_WIPEONFORK. + +(This is not purely academic: BoringSSL is planning on acting in this way. We found the qemu behaviour in pre-release testing and are planning on making an madvise call with advice=-1 first to test whether unknown advice values actually produce EINVAL.) + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting older bugs to "Incomplete" now. 
+ +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +Still relevant. See also bug #1926521 -- MADV_DONTNEED is another madvise() value that can't be ignored as "just a hint". + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/343 + + diff --git a/results/classifier/zero-shot/105/semantic/1877688 b/results/classifier/zero-shot/105/semantic/1877688 new file mode 100644 index 000000000..ad16bc00c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1877688 @@ -0,0 +1,52 @@ +semantic: 0.732 +device: 0.642 +instruction: 0.630 +other: 0.586 +KVM: 0.553 +network: 0.531 +graphic: 0.530 +vnc: 0.526 +socket: 0.526 +mistranslation: 0.516 +boot: 0.404 +assembly: 0.172 + +9p virtfs device reports error when opening certain files + +Reading certain files on a 9p mounted FS produces this error message: + +qemu-system-x86_64: VirtFS reply type 117 needs 12 bytes, buffer has 12, less than minimum + +After this error message is generated, any further access to the 9p FS hangs whatever tries to access it. The Arch Linux guest system is otherwise usable. This happens with QEMU 5.0.0 and guest kernel version 5.6.11, hosted on an Arch Linux distro. 
I use the following command to launch QEMU: + +exec qemu-system-x86_64 -enable-kvm -display gtk -vga virtio -cpu host -m 4G -netdev tap,ifname=vmtap0,id=vn0,script=no,downscript=no -device virtio-net-pci,netdev=vn0 -kernel kernel.img -drive file=file.img,format=raw,if=virtio -virtfs local,path=mnt,mount_tag=host0,security_model=passthrough,id=host0 -append "console=ttyS0 root=/dev/vda rw" + +There's nothing relevant in the guest kernel logs as far as I'm aware, with loglevel set to 7. + +Here's a C program to trigger this behavior. I don't think it matters what the contents of "file" are or what its size is. + +Looks like it was introduced by this change: +https://patchwork.kernel.org/patch/11319993/ + +More specifically, this exact hunk: + +- if (buf_size < size) { ++ if (buf_size < P9_IOHDRSZ) { + + + +The following patch should fix this bug for the kvm backend (not for the XEN backend yet). + +Please let me know if it fixes this bug for you. + +Thanks, it works. + +Fix is now committed on master as SHA-1 cf45183b718f02b1369e18c795dc51bc1821245d, which actually just reverted the mentioned commit that was leading to this broken behavior: +https://github.com/qemu/qemu/commit/cf45183b718f02b1369e18c795dc51bc1821245d + +The original Xen transport bug that motivated that change has now been fixed differently, by handling that Xen issue solely on the Xen transport driver side: +https://github.com/qemu/qemu/commit/a4c4d462729466c4756bac8a0a8d77eb63b21ef7 + + +Fixed in the QEMU 5.1 release. 
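The effect of the hunk quoted in this report can be sketched with a small standalone model. This is not the QEMU code: the function names are hypothetical, and the value 24 for P9_IOHDRSZ is an assumption that the report itself does not state. The 12-byte figures are the ones from the quoted error message.

```python
# Sketch only: models the two size checks from the quoted hunk.
# P9_IOHDRSZ = 24 is an assumption (the report does not give its value);
# the 12-byte figures come from the error message quoted in the report.

P9_IOHDRSZ = 24  # assumed value of the header-size constant


def rejected_by_original_check(buf_size, size):
    """Original check: reject only if the buffer is smaller than the reply needs."""
    return buf_size < size


def rejected_by_changed_check(buf_size, size):
    """Changed check: reject whenever the buffer is smaller than P9_IOHDRSZ,
    regardless of how many bytes the reply actually needs."""
    return buf_size < P9_IOHDRSZ


needed, available = 12, 12  # "needs 12 bytes, buffer has 12"
print(rejected_by_original_check(available, needed))
print(rejected_by_changed_check(available, needed))
```

If the assumption about the constant holds, a reply that needs 12 bytes and has exactly 12 available passes the original comparison but is refused by the changed one, which would be consistent with the seemingly contradictory error message and with the fix being a plain revert.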
+ diff --git a/results/classifier/zero-shot/105/semantic/1878348 b/results/classifier/zero-shot/105/semantic/1878348 new file mode 100644 index 000000000..e0445c55f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1878348 @@ -0,0 +1,108 @@ +semantic: 0.846 +assembly: 0.840 +instruction: 0.823 +graphic: 0.813 +device: 0.794 +other: 0.767 +mistranslation: 0.737 +KVM: 0.686 +network: 0.684 +socket: 0.656 +boot: 0.641 +vnc: 0.628 + +--static build fails in v5.0 (since 5010cec2bc87dafab39b3913c8ca91f88df9c540) + +Hi, + +Since commit 5010cec2bc87dafab39b3913c8ca91f88df9c540, building qemu fails when configured with --static (eg ../configure --target-list=x86_64-softmmu,x86_64-linux-user --enable-debug --static). + +On ubuntu 16.04, it fails to find -lffi and -lselinux. + +After I apt-get install libffi-dev libselinux1-dev, the build still fails: +../backends/dbus-vmstate.o: In function `_nocheck__trace_dbus_vmstate_pre_save': +/home/christophe.lyon/src/qemu/build-static/backends/trace.h:29: undefined reference to `_TRACE_DBUS_VMSTATE_PRE_SAVE_DSTATE' +../backends/dbus-vmstate.o: In function `_nocheck__trace_dbus_vmstate_post_load': +/home/christophe.lyon/src/qemu/build-static/backends/trace.h:52: undefined reference to `_TRACE_DBUS_VMSTATE_POST_LOAD_DSTATE' +../backends/dbus-vmstate.o: In function `_nocheck__trace_dbus_vmstate_loading': +/home/christophe.lyon/src/qemu/build-static/backends/trace.h:75: undefined reference to `_TRACE_DBUS_VMSTATE_LOADING_DSTATE' +../backends/dbus-vmstate.o: In function `_nocheck__trace_dbus_vmstate_saving': +/home/christophe.lyon/src/qemu/build-static/backends/trace.h:98: undefined reference to `_TRACE_DBUS_VMSTATE_SAVING_DSTATE' +collect2: error: ld returned 1 exit status + +I'm not able to reproduce your problem. + +Are you able to reproduce the problem if you cleanup your build directory (make distclean)? 
+ +Right, after a make distclean + configure, I managed to complete the build after installing libffi-dev libselinux1-dev. + +However, I think there's a bug in configure: it should either complain when these packages are missing, or disable the module that needs them. + +Without libffi-dev and libselinux1-dev, the build fails with: + LINK x86_64-softmmu/qemu-system-x86_64 +/usr/bin/ld: cannot find -lselinux +/usr/bin/ld: cannot find -lffi + + +Semi-officially, QEMU only aims to support static linking with usermode emulators, not system mode emulators. I'm not sure we make that clear anywhere in the docs, or configure script. We should probably print a warning from configure if using --static in combination with system emulators, that this is an untested scenario and users are responsible for figuring out any problems they hit such as missing libraries at link time. + +In particular it is a known limitation that the configure checks for pre-requisite libraries only validate existence of the shared libraries, and make no attempt to look for the static variant, and it was decided not to fix that. + + +I think it's largely that many distros ship pkg-config files which are just broken for the static linking case -- so configure tests "does pkg-config say this will work for static linking", and pkg-config says "yes, that will work", and then it doesn't. If you care about trying to get this to be more reliable you'd want to investigate all of these and file bugs upstream with your distro and get them fixed... + + +OK I wasn't aware that static linking was not supported by system emulators, thanks for the heads-up. I've updated our build scripts not to use static link, so you can close this PR unless you want to keep track that configure needs improvements. + +Thanks. 
+ + +For the record, previous attempt to fix: +https://<email address hidden>/msg624142.html +and identical conclusion: +https://<email address hidden>/msg624164.html + + +Maybe --static should be ignored for system emulators and accepted for user-mode emulators? +That would enable to have a single build, otherwise if we want both, we'd need to configure & build QEMU twice. + + +Some people want the system emulation to be statically linked, which is why we don't refuse to do it entirely; and static vs not changes a bunch of stuff like CFLAGS which we assume to be common across the whole build. So if you want some statically linked binaries and some not statically linked, then yes, you should configure and build twice. (Use separate build directories, one for each config.) + + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting older bugs to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" within the next 60 days (otherwise it will get +closed as "Expired"). We will then eventually migrate the ticket auto- +matically to the new system. + +Thank you and sorry for the inconvenience. 
+ + +FWIW, a configure line that works for me for static system-emulation builds on Ubuntu 18.04 with QEMU 6.0 is: + +'../../configure' '--target-list=arm-softmmu' '--enable-debug' '--static' '--disable-tools' '--disable-sdl' '--disable-gtk' '--disable-vnc' '--disable-virtfs' '--disable-attr' '--disable-libiscsi' '--disable-libnfs' '--disable-libusb' '--disable-opengl' '--disable-numa' '--disable-usb-redir' '--disable-bzip2' '--audio-drv-list=' '--disable-guest-agent' '--disable-vte' '--disable-mpath' '--disable-libudev' '--disable-vhost-user' '--disable-curl' + + +I have re-tested at commit d45a5270d075ea589f0b0ddcf963a5fea1f500ac, and the build succeeded, so it looks like the problem has been fixed. + + diff --git a/results/classifier/zero-shot/105/semantic/1879672 b/results/classifier/zero-shot/105/semantic/1879672 new file mode 100644 index 000000000..62d0929f9 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1879672 @@ -0,0 +1,664 @@ +semantic: 0.902 +assembly: 0.899 +other: 0.892 +graphic: 0.888 +device: 0.871 +KVM: 0.865 +vnc: 0.863 +mistranslation: 0.853 +network: 0.846 +instruction: 0.834 +boot: 0.808 +socket: 0.796 + +QEMU installer with WHPX support + +People often ask the community to add WHPX support to the QEMU installer for Windows, +but it is impossible due to the license limitations of the WHPX SDK. + +The WinHvEmulation.h and WinHvPlatform.h header files needed are "All +rights reserved". + +However these headers only contain struct definitions and integer constants, +no functional code in macros or inline functions. See: +https://<email address hidden>/msg645815.html +It is questionable whether the headers alone can be considered copyrightable material. + +Has anyone raised an RFE with the mingw64 project to provide these headers / APIs ? That's what provides the interfaces we usually rely on for Windows builds, and they're likely familiar with what they can & can't do from a legal POV. 
I don't see this as something QEMU needs to solve itself. + ++launchpad ticket + +On 9/19/19 1:26 PM, Philippe Mathieu-Daudé wrote: +> On 9/19/19 1:18 PM, Stefan Weil wrote: +>> Am 19.09.2019 um 12:59 schrieb Philippe Mathieu-Daudé: +>>> Add a job to cross-build QEMU with WHPX enabled. +>>> +>>> Use the Win10SDK headers from the Android Project, as commented +>>> in https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg03842.html +>>> +>>> Based-on: <email address hidden> +>>> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg03844.html +>>> +>>> Philippe Mathieu-Daudé (2): +>>> tests/docker: Add fedora-win10sdk-cross image +>>> .shippable.yml: Build WHPX enabled binaries +>>> +>>> .shippable.yml | 2 ++ +>>> tests/docker/Makefile.include | 1 + +>>> .../dockerfiles/fedora-win10sdk-cross.docker | 21 +++++++++++++++++++ +>>> 3 files changed, 24 insertions(+) +>>> create mode 100644 tests/docker/dockerfiles/fedora-win10sdk-cross.docker +>>> +>> +>> Please note that the required header files are part of the Win10SDK +>> which is not published under a free license, so I am afraid that they +>> cannot be used with QEMU code to produce free binaries. +> +> Yes :S +> +>> I have addressed that some time ago, and Justin Terry is still looking +>> for a solution on the Microsoft side. +> +> Oh this is a good news, thanks for caring about this issue, +> and thanks Justin for looking for a solution! +> +> Trying to understand how WHPX is used, I noticed there are much many +> Windows QEMU users than I thought, and it would be nice if we can have +> some upstream CI testing to not break the various projects using it. +> +> Regards, +> +> Phil. +> + + + ++launchpad ticket + +On 9/20/19 6:53 PM, Justin Terry (VM) wrote: +> Hey Phil, +> +> I have contacted our legal department for guidance on this specific use case and will update you when I hear back. Thank you for your patience. 
+> +> Justin Terry +> +>> -----Original Message----- +>> From: Philippe Mathieu-Daudé <email address hidden> +>> Sent: Friday, September 20, 2019 8:18 AM +>> To: <email address hidden>; Justin Terry (VM) <email address hidden> +>> Cc: Daniel P . Berrangé <email address hidden>; Fam Zheng +>> <email address hidden>; Thomas Huth <email address hidden>; Paolo Bonzini +>> <email address hidden>; Alex Bennée <email address hidden>; Richard +>> Henderson <email address hidden>; Eduardo Habkost <email address hidden>; +>> Stefan Weil <email address hidden> +>> Subject: Re: [PATCH v2 0/3] testing: Build WHPX enabled binaries +>> +>> On 9/20/19 1:33 PM, Philippe Mathieu-Daudé wrote: +>>> Add a job to cross-build QEMU with WHPX enabled. +>>> +>>> Since the WHPX is currently broken, include the patch required to have +>>> successful Shippable build. +>>> +>>> I previously included the WHPX headers shared by the Android project, +>>> and Daniel asked me to check the EULA. While trying to manually +>>> install the Windows SDK, I noticed the installer fetches archives +>>> directly, kindly asking where they are stored via the /fwlink API. +>>> Do the same, fetch the required archives and extract them. No need to +>>> accept EULA... +>>> +>>> Docker build the image first, then build QEMU in a instance of this +>>> image. The image is internal to Shippable, the instances are not +>>> reachable and are thrown once the build is finished. What we collect +>>> from Shippable is the console output of QEMU build process, and if the +>>> build process succeed or failed. So far we do not redistribute the +>>> image or built binaries. +>>> +>>> Philippe Mathieu-Daudé (3): +>>> target/i386: Fix broken build with WHPX enabled +>>> tests/docker: Add fedora-win10sdk-cross image +>>> .shippable.yml: Build WHPX enabled binaries +>> +>> FWIW here is the result of this series: +>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapp. 
+>> shippable.com%2Fgithub%2Fphilmd%2Fqemu%2Fruns%2F516%2F11%2Fcon +>> sole&data=02%7C01%7Cjuterry%40microsoft.com%7C733a566f3233427 +>> 8ae6f08d73dddb39f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6 +>> 37045894733463150&sdata=55URgDII5r74QMUpLOD%2FWT5%2B5jbzyv +>> nfCSdv%2FNaWDAw%3D&reserved=0 +>> Duration 17 minutes (1076 seconds) +>> +>> 4m49s building the qemu:fedora-win10sdk-cross docker image, 11m10s +>> building WHPX QEMU. + + + ++launchpad ticket + +On 11/7/19 11:52 PM, Sunil Muthuswamy wrote: +>>> You will need the Windows 10 SDK for RS5 (build 17763) or above to +>>> to be able to compile this patch because of the definition of the +>>> XCR0 register. +>>> +>>> Changes since v1: +>>> - Added a sign-off line in the patch. +>> +>> +>> I am not very happy with the current situation which suggests using non +>> free header files from the Microsoft Windows SDK, thus making it +>> impossible to produce QEMU executables for Windows with WHPX support +>> without having legal complications. +>> +>> Could you please add the required headers with a suitable license to the +>> QEMU source code? That would clarify the license issue and make builds +>> with WHPX much easier because those files would not have to be extracted +>> from a very large SDK installation. +>> +>> It would also be acceptable if Microsoft could update the license +>> comments in those files and use a QEMU compatible license. +>> +> I agree in principle that there should be an easier way to consume the Windows +> SDK headers without having to play around with the licenses. I also agree that +> that will make life lot easier for many developers. I am reaching out +> internally here to see what can be done about this, but, that might take some +> time. Meanwhile, is it possible to make some progress on this patch? 
+> +>> Kind regards +>> Stefan Weil +>> +>> +> + + + ++Mike Battista & lanchpad ticket + +On 2/24/20 8:43 PM, Sunil Muthuswamy wrote: +>> -----Original Message----- +>> From: Stefan Weil <email address hidden> +>> Sent: Thursday, February 20, 2020 11:54 PM +>> To: Justin Terry (SF) <email address hidden>; Philippe Mathieu-Daudé <email address hidden>; Sunil Muthuswamy +>> <email address hidden>; Eduardo Habkost <email address hidden>; Paolo Bonzini <email address hidden>; Richard Henderson +>> <email address hidden> +>> Cc: <email address hidden> +>> Subject: Re: [EXTERNAL] Re: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor Platform +>> +>> Am 19.02.20 um 16:50 schrieb Justin Terry (SF): +>> +>> +>> Hello Justin, hello Sunil, +>> +>> just a reminder: we still have the problem with the proprietary license +>> for the required Microsoft header files. +>> +>> Can you estimate when this will be solved? +>> +> +> Thanks for the reminder, Stefan. Yes, agreed this problem still exists. We followed up with +> the SDK team and the legal team end of last year. I will nudge them again for an update +> here. +> +>> Regards, +>> Stefan +>> +> + + + +Hi Sunil, + +On 5/19/20 11:59 PM, Sunil Muthuswamy wrote: +>> -----Original Message----- +>> From: Stefan Weil <email address hidden> +>> Sent: Thursday, February 20, 2020 11:54 PM +>> To: Justin Terry (SF) <email address hidden>; Philippe Mathieu-Daudé <email address hidden>; Sunil Muthuswamy +>> <email address hidden>; Eduardo Habkost <email address hidden>; Paolo Bonzini <email address hidden>; Richard +>> Henderson <email address hidden> +>> Cc: <email address hidden> +>> Subject: Re: [EXTERNAL] Re: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor Platform +>> +>> Am 19.02.20 um 16:50 schrieb Justin Terry (SF): +>> +>> Hello Justin, hello Sunil, +>> +>> just a reminder: we still have the problem with the proprietary license +>> for the required Microsoft header files. 
+>> +>> Can you estimate when this will be solved? +>> +>> Regards, +>> Stefan +>> +> +> Adding Mike Battista, who is on the SDK team and can help provide some clarity around the questions about SDK licensing. +> + +To ease communication and track the changes over time regarding this +problem, I opened a ticket on Launchpad: +https://bugs.launchpad.net/qemu/+bug/1879672 + +Last time (Sept 2019) Justin Terry contacted Microsoft legal department +for guidance but no update since. +This is unfortunate as we can not let the community use this feature, +neither can we keep testing WHPX to avoid code bitrot. + +Can you meanwhile provide Azure CI builds using WHPX enabled? + +Regards, + +Phil. + + + +> Has anyone raised an RFE with the mingw64 project to provide these headers / APIs? + +I had asked a long time ago on IRC (#mingw-w64 IRC channel on irc.oftc.net), but got no answer. + +Regards, +Stefan + +Hi Justin, Sunil, + +On 5/20/20 12:26 PM, Philippe Mathieu-Daudé wrote: +> +launchpad ticket +> +> On 9/20/19 6:53 PM, Justin Terry (VM) wrote: +>> Hey Phil, +>> +>> I have contacted our legal department for guidance on this specific +>> use case and will update you when I hear back. Thank you for your +>> patience. + +I recently understood legal changes can be very complex, thus it is +implicit it can take years before getting updates. + +Since the project is still actively developed, maybe you could provide +a Azure CI job to build a WHPX binary. We don't need to have access to +the binary, just to the exit status (success/fail) and build logs. + +Do you think it is doable? + +Thanks, + +Phil. + +>> +>> Justin Terry +>> +>>> -----Original Message----- +>>> From: Philippe Mathieu-Daudé <email address hidden> +>>> Sent: Friday, September 20, 2019 8:18 AM +>>> To: <email address hidden>; Justin Terry (VM) <email address hidden> +>>> Cc: Daniel P . 
Berrangé <email address hidden>; Fam Zheng +>>> <email address hidden>; Thomas Huth <email address hidden>; Paolo Bonzini +>>> <email address hidden>; Alex Bennée <email address hidden>; Richard +>>> Henderson <email address hidden>; Eduardo Habkost <email address hidden>; +>>> Stefan Weil <email address hidden> +>>> Subject: Re: [PATCH v2 0/3] testing: Build WHPX enabled binaries +>>> +>>> On 9/20/19 1:33 PM, Philippe Mathieu-Daudé wrote: +>>>> Add a job to cross-build QEMU with WHPX enabled. +>>>> +>>>> Since the WHPX is currently broken, include the patch required to have +>>>> successful Shippable build. +>>>> +>>>> I previously included the WHPX headers shared by the Android project, +>>>> and Daniel asked me to check the EULA. While trying to manually +>>>> install the Windows SDK, I noticed the installer fetches archives +>>>> directly, kindly asking where they are stored via the /fwlink API. +>>>> Do the same, fetch the required archives and extract them. No need to +>>>> accept EULA... +>>>> +>>>> Docker build the image first, then build QEMU in a instance of this +>>>> image. The image is internal to Shippable, the instances are not +>>>> reachable and are thrown once the build is finished. What we collect +>>>> from Shippable is the console output of QEMU build process, and if the +>>>> build process succeed or failed. So far we do not redistribute the +>>>> image or built binaries. +>>>> +>>>> Philippe Mathieu-Daudé (3): +>>>> target/i386: Fix broken build with WHPX enabled +>>>> tests/docker: Add fedora-win10sdk-cross image +>>>> .shippable.yml: Build WHPX enabled binaries +>>> +>>> FWIW here is the result of this series: +>>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapp. 
+>>> shippable.com%2Fgithub%2Fphilmd%2Fqemu%2Fruns%2F516%2F11%2Fcon +>>> sole&data=02%7C01%7Cjuterry%40microsoft.com%7C733a566f3233427 +>>> 8ae6f08d73dddb39f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6 +>>> 37045894733463150&sdata=55URgDII5r74QMUpLOD%2FWT5%2B5jbzyv +>>> nfCSdv%2FNaWDAw%3D&reserved=0 +>>> Duration 17 minutes (1076 seconds) +>>> +>>> 4m49s building the qemu:fedora-win10sdk-cross docker image, 11m10s +>>> building WHPX QEMU. +> + + + +Hi Sunil, + +On 8/1/20 1:31 AM, Sunil Muthuswamy wrote: +>> Hi Justin, Sunil, +> +> Justin has moved to a different team is no longer working with WHPX. Moving him +> to bcc. + +OK. Does that mean you are the new responsible of updating the ticket +regarding the WHPX headers and their license? + +> +>> +>> On 5/20/20 12:26 PM, Philippe Mathieu-Daudé wrote: +>>> +launchpad ticket +>>> +>>> On 9/20/19 6:53 PM, Justin Terry (VM) wrote: +>>>> Hey Phil, +>>>> +>>>> I have contacted our legal department for guidance on this specific +>>>> use case and will update you when I hear back. Thank you for your +>>>> patience. +>> +>> I recently understood legal changes can be very complex, thus it is +>> implicit it can take years before getting updates. +>> +>> Since the project is still actively developed, maybe you could provide +>> a Azure CI job to build a WHPX binary. We don't need to have access to +>> the binary, just to the exit status (success/fail) and build logs. +>> +>> Do you think it is doable? +>> +>> Thanks, +>> +>> Phil. +>> +> The ask generally sounds reasonable. But, can you help me understand the full +> scope of the ask. Few questions: +> 1. Stefan has a CI pipeline to build WHPX. + +Great! I didn't know Stefan already did it :) +Can you share the URL please, so we can integrate it with mainstream CI? + +> What's the benefit of having another CI +> job, that doesn't export the binary, but, just the status? + +As usual, we do not want to circumvent the license. 
IANAL but IIUC we +can not force a CI job to accept the EULA when installing it, even to +test it. So the best we can do is check if the build succeeded (exit +status). + +> 2. Which branch is the CI pipeline expected to build? + +'master', to be sure no regressions are introduced. + +> 3. Is the expectation also that it will build WHPX patches that are submitted to the +> WHPX branch? + +You describe a "downstream CI" testing, which is out of scope of the +community public CI. + +Regards, + +Phil. + + + +On 8/4/20 8:55 AM, Stefan Weil wrote: +> Am 04.08.20 um 08:43 schrieb Thomas Huth: +> +>> On 03/08/2020 22.25, Stefan Weil wrote: +>>> We can add a CI pipeline on Microsoft infrastructure by using a GitHub +>>> action. +>> Sorry for being ignorant, but how does that solve the legal questions +>> just because it is running on GitHub instead of a different CI? +>> +>> Thomas +>> +> +> Sorry, I though that would be clear by looking at the included shell script. +> +> The build does not use the Microsoft SDK. It gets the required header +> files from Mingw-w64. They added them in git master. + +Oh, so we can do that with GitLab too now, we don't need to rely on the +GitHub 'Actions' CI in particular, right? + +> +> See +> https://github.com/stweil/qemu/blob/master/.github/workflows/build.sh#L50 +> for code details. +> +> It's still shameful that MS is forcing developers to waste time +> rewriting API headers, just because the MS legal departments are not +> able to understand the needs of Open Source development. + +There has be a big switch from Microsoft toward Open Source, I attended +some of there talk at the Open Source Summit in 2018. Maybe we simply +haven't contacted the right persons to make the changes...? 
+ +> +> Stefan +> +> +> + + + +On 8/4/20 9:42 AM, Stefan Weil wrote: +> Am 04.08.20 um 09:23 schrieb Philippe Mathieu-Daudé: +> +>> On 8/4/20 8:55 AM, Stefan Weil wrote: +>>> Am 04.08.20 um 08:43 schrieb Thomas Huth: +>>> +>>>> On 03/08/2020 22.25, Stefan Weil wrote: +>>>>> We can add a CI pipeline on Microsoft infrastructure by using a GitHub +>>>>> action. +>>>> Sorry for being ignorant, but how does that solve the legal questions +>>>> just because it is running on GitHub instead of a different CI? +>>>> +>>>> Thomas +>>>> +>>> Sorry, I though that would be clear by looking at the included shell script. +>>> +>>> The build does not use the Microsoft SDK. It gets the required header +>>> files from Mingw-w64. They added them in git master. +>> Oh, so we can do that with GitLab too now, we don't need to rely on the +>> GitHub 'Actions' CI in particular, right? +> +> +> That's right. The build script was written for Ubuntu, so depending on +> the distribution used for GitLab CI it will need some modifications. If +> GitLab already has a recent Mingw-w64, it might be sufficient to fix the +> case of the header file names. Mingw-w64 uses winhvplatform.h while QEMU +> expects WinHvPlatform.h and so on. I used symbolic links to add the +> camel case filenames. +> +> +>>> See +>>> https://github.com/stweil/qemu/blob/master/.github/workflows/build.sh#L50 +>>> for code details. +>>> +>>> It's still shameful that MS is forcing developers to waste time +>>> rewriting API headers, just because the MS legal departments are not +>>> able to understand the needs of Open Source development. +>> There has be a big switch from Microsoft toward Open Source, I attended +>> some of there talk at the Open Source Summit in 2018. Maybe we simply +>> haven't contacted the right persons to make the changes...? +> +> +> Maybe, but it is difficult to find the right person in a large company +> like MS, and legal departments are often somehow special. 
+ +Sunil seems quite active with the WHPX development, and the section is +listed as "Supported [by Microsoft]" in MAINTAINERS. I'm confident we +have someone else able to help us find the right contacts in the +company :) + +> +> And yes, they learned that Open Source can help them for their business, +> too. +> +> Stefan +> +> +> + + + +On Tue, Aug 04, 2020 at 10:10:31AM +0200, Thomas Huth wrote: +> On 04/08/2020 09.42, Stefan Weil wrote: +> > Am 04.08.20 um 09:23 schrieb Philippe Mathieu-Daudé: +> > +> >> On 8/4/20 8:55 AM, Stefan Weil wrote: +> >>> Am 04.08.20 um 08:43 schrieb Thomas Huth: +> >>> +> >>>> On 03/08/2020 22.25, Stefan Weil wrote: +> >>>>> We can add a CI pipeline on Microsoft infrastructure by using a GitHub +> >>>>> action. +> >>>> Sorry for being ignorant, but how does that solve the legal questions +> >>>> just because it is running on GitHub instead of a different CI? +> >>>> +> >>>> Thomas +> >>>> +> >>> Sorry, I though that would be clear by looking at the included shell script. +> >>> +> >>> The build does not use the Microsoft SDK. It gets the required header +> >>> files from Mingw-w64. They added them in git master. +> +> Great, thanks for the clarification! +> +> >> Oh, so we can do that with GitLab too now, we don't need to rely on the +> >> GitHub 'Actions' CI in particular, right? +> > +> > That's right. The build script was written for Ubuntu, so depending on +> > the distribution used for GitLab CI it will need some modifications. If +> > GitLab already has a recent Mingw-w64, it might be sufficient to fix the +> > case of the header file names. Mingw-w64 uses winhvplatform.h while QEMU +> > expects WinHvPlatform.h and so on. I used symbolic links to add the +> > camel case filenames.
+> +> I'm currently working on a patch series for our gitlab-CI that uses our +> containers to do all possible kinds of cross-compiler builds (basically the +> ones that we are doing on shippable.com so far), including the 32-bit +> and 64-bit MinGW cross-compilation jobs. I can have a look whether I can +> integrate these headers there! + +Fedora rawhide carries mingw64 v7.0.0, which was released in Nov 2019. + +The WHPX headers were added to mingw64 git a week later, so they're +not available in any distro yet. + +The mingw64 release schedule looks "sporadic" so maybe we can just +request a new release to make WHPX stuff available. It'll thus be +available for our CI in rawhide/sid shortly thereafter, which will +be the best solution to let us do this in GitLab. + +We certainly don't want to add yet another separate CI system just +for WHPX. + +Regards, +Daniel +-- +|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| +|: https://libvirt.org -o- https://fstop138.berrange.com :| +|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| + + + +On 8/18/20 11:20 PM, Sunil Muthuswamy wrote: +>>>> It's still shameful that MS is forcing developers to waste time +>>>> rewriting API headers, just because the MS legal departments are not +>>>> able to understand the needs of Open Source development. +>>> There has be a big switch from Microsoft toward Open Source, I attended +>>> some of there talk at the Open Source Summit in 2018. Maybe we simply +>>> haven't contacted the right persons to make the changes...? +>> +>> +>> Maybe, but it is difficult to find the right person in a large company +>> like MS, and legal departments are often somehow special. +>> +>> And yes, they learned that Open Source can help them for their business, +>> too.
+>> +>> Stefan +> +> Mike Battista is the program manager owner of the SDK license and should be +> able to take/respond to any feedback about the SDK licensing for open source +> projects (I have added him here). He has also been added to previous threads +> about the licensing and is also included in this conversation: +> https://bugs.launchpad.net/qemu/+bug/1879672 + +Hi Mike, thanks for helping us with this issue! + +And thanks a lot Sunil for bringing Mike here :) + +> +> - Sunil +> +> + + + +Removing 'Opinion' and moving back to 'New'; as 'Opinion' is essentially the same as "WONTFIX" but allows discussion to continue. I believe you want a Feature Request tag instead. + +If there is still work for us to do, let's move this to Confirmed/Triaged and add the feature request tag. + +Thanks! + +Hi John, + +On 11/4/20 9:01 PM, John Snow wrote: +> Removing 'Opinion' and moving back to 'New'; as 'Opinion' is essentially +> the same as "WONTFIX" but allows discussion to continue. I believe you +> want a Feature Request tag instead. +> +> If there is still work for us to do, let's move this to +> Confirmed/Triaged and add the feature request tag. + +It seems Launchpad didn't Cc the interested parties. + +Our contact is Mike Battista (see [*]) but so far he has never +made any comment regarding this issue. + +Cc'ing Sunil who is the active developer (of WHPX in QEMU) +at Microsoft, and Stefan, who does the Open Source packaging. + +Regards, + +Phil. + +[*] https://<email address hidden>/msg731227.html + + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'invalid' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/233 + + +Meanwhile QEMU builds for CI and also my unofficial QEMU installers for Windows use free WHPX headers instead of the copyrighted MS ones, so this issue is fixed.
+ +But looking at the latest pipeline: +https://gitlab.com/qemu-project/qemu/-/pipelines/310113928 +in particular the cross-win64-system job: +https://gitlab.com/qemu-project/qemu/-/jobs/1296341064 + +WHPX isn't built anymore: + + Targets and accelerators + KVM support: NO + HAX support: YES + HVF support: NO + WHPX support: NO + NVMM support: NO + Xen support: NO + TCG support: YES + TCG backend: native (x86_64) + +We likely lost the coverage with commit 93cc0506f6c +("tests/docker: Use Fedora containers for MinGW cross-builds in the gitlab-CI") + +Should we open a new issue? + diff --git a/results/classifier/zero-shot/105/semantic/1879955 b/results/classifier/zero-shot/105/semantic/1879955 new file mode 100644 index 000000000..93bffb547 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1879955 @@ -0,0 +1,75 @@ +semantic: 0.757 +graphic: 0.756 +other: 0.734 +mistranslation: 0.680 +socket: 0.658 +instruction: 0.631 +vnc: 0.585 +device: 0.577 +network: 0.561 +KVM: 0.539 +assembly: 0.475 +boot: 0.434 + +target/i386/seg_helper.c: 16-bit TSS struct format wrong? + +In target/i386/seg_helper.c:switch_tss_ra() we have the following code to load registers from a 16-bit TSS struct: + + /* 16 bit */ + new_cr3 = 0; + new_eip = cpu_lduw_kernel_ra(env, tss_base + 0x0e, retaddr); + new_eflags = cpu_lduw_kernel_ra(env, tss_base + 0x10, retaddr); + for (i = 0; i < 8; i++) { + new_regs[i] = cpu_lduw_kernel_ra(env, tss_base + (0x12 + i * 2), + retaddr) | 0xffff0000; + } + for (i = 0; i < 4; i++) { + new_segs[i] = cpu_lduw_kernel_ra(env, tss_base + (0x22 + i * 4), + retaddr); + } + new_ldt = cpu_lduw_kernel_ra(env, tss_base + 0x2a, retaddr); + +This doesn't match up with the structure described here: https://www.sandpile.org/x86/tss.htm -- which has only 2-byte slots for the segment registers. It also makes the 3rd segreg use the same offset as the LDTR, which is very suspicious. I suspect that this should use "(0x22 + i * 2)". 
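To make the offset arithmetic concrete, here is a minimal sketch in C, assuming the 16-bit TSS layout from the sandpile.org page cited above, where every field (including each segment selector) is 2 bytes wide. The names `TSS16_*`, `tss16_seg_offset`, `tss16_first_ldt_collision` and `tss16_self_check` are hypothetical and are not part of QEMU; the sketch only checks the arithmetic and says nothing about guest behaviour.

```c
#include <assert.h>

/* Byte offsets inside a 16-bit TSS, taken from the quoted QEMU code.
 * Assumption: every field is 2 bytes wide, as in the cited layout. */
enum {
    TSS16_IP    = 0x0e,
    TSS16_FLAGS = 0x10,
    TSS16_REGS  = 0x12, /* 8 general registers, 2 bytes each: 0x12..0x20 */
    TSS16_SEGS  = 0x22, /* first of the 4 segment selector slots */
    TSS16_LDT   = 0x2a  /* LDT selector */
};

/* Offset of segment slot i (0..3) for a given slot stride in bytes. */
static unsigned tss16_seg_offset(unsigned i, unsigned stride)
{
    return TSS16_SEGS + i * stride;
}

/* Index of the first segment slot that lands on the LDT selector,
 * or -1 if no slot collides with it. */
static int tss16_first_ldt_collision(unsigned stride)
{
    for (unsigned i = 0; i < 4; i++) {
        if (tss16_seg_offset(i, stride) == TSS16_LDT) {
            return (int)i;
        }
    }
    return -1;
}

/* Hypothetical self-check of the two strides discussed in the report. */
static void tss16_self_check(void)
{
    /* Stride 4 (the quoted code): the third slot lands on the LDT selector. */
    assert(tss16_first_ldt_collision(4) == 2);
    /* Stride 2 (the suggested fix): slots end at 0x28, just below 0x2a. */
    assert(tss16_first_ldt_collision(2) == -1);
}
```

With a stride of 4, `tss16_first_ldt_collision` reports slot 2, the third segment register, which is the overlap with the LDT selector described above. With a stride of 2 the four slots occupy 0x22 to 0x28 and no collision is reported.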
+ +The code later in the same function that stores the segment registers to the struct has the same bug. + +Found by code inspection; I don't have a test case to check this. As a non-x86-expert I'm just going to file a bug report in case somebody else feels like confirming the issue and sending a patch. + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/382 + + diff --git a/results/classifier/zero-shot/105/semantic/1883268 b/results/classifier/zero-shot/105/semantic/1883268 new file mode 100644 index 000000000..c498de4bd --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1883268 @@ -0,0 +1,112 @@ +semantic: 0.890 +graphic: 0.883 +assembly: 0.880 +other: 0.871 +device: 0.868 +mistranslation: 0.865 +instruction: 0.852 +socket: 0.841 +boot: 0.810 +network: 0.808 +vnc: 0.787 +KVM: 0.694 + +random errors on aarch64 when executing __aarch64_cas8_acq_rel + +Hello, + +Since I upgraded to qemu-5.0 when executing the GCC testsuite, +I've noticed random failures of g++.dg/ext/sync-4.C. + +I'm attaching the source of the testcase, the binary executable and the qemu traces (huge, 111MB!) starting at main (with qemu-aarch64 -cpu cortex-a57 -R 0 -d in_asm,int,exec,cpu,unimp,guest_errors,nochain) + +The traces were generated by a CI build, I built the executable manually but I expect it to be the same as the one executed by CI. + +It seems the problem occurs in f13, which leads to a call to abort() + +The preprocessed versions of f13/t13 are as follows: +static bool f13 (void *p) __attribute__ ((noinline)); +static bool f13 (void *p) +{ + return (__sync_bool_compare_and_swap((ditype*)p, 1, 2)); +} +static void t13 () +{ + try { + f13(0); + } + catch (...) { + return; + } + abort(); +} + + +When looking at the execution traces at address 0x00400c9c, main calls f13, which in turn calls __aarch64_cas8_acq_rel (at 0x00401084) +__aarch64_cas8_acq_rel returns to f13 (address 0x0040113c), then f13 returns to main (0x0040108c) which then calls abort (0x00400ca0) + +I'm not quite sure what's wrong :-( + +I've not noticed such random problems with native aarch64 hardware. + + + + + + + +FWIW, I cannot reproduce the problem with x86_64 host, +but I can reproduce it on a 32-bit i686 host.
+ +There's nothing wrong with the atomic operation, which +makes sense since it's against a NULL pointer. The +problem that I see is in the unwinding -- the catch +never happens and std::terminate gets called. + +There must be some sort of 32-bit TCG error though, +because the same binary works on x86_64 host. + +The most confusing thing about this test case is that +12 previous throws work correctly, but the 13th fails. + +Hi Richard, + +Thanks for taking a look and confirming that you managed to reproduce the problem. +I forgot to mention that I'm using x86_64 hosts, not i686. I hope there are not two unrelated issues... + + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. 
+ + +Opened ticket on gitlab: https://gitlab.com/qemu-project/qemu/-/issues/333 + + +Thanks for moving the ticket to gitlab! ... so I'm closing this on Launchpad now. + diff --git a/results/classifier/zero-shot/105/semantic/1883400 b/results/classifier/zero-shot/105/semantic/1883400 new file mode 100644 index 000000000..e8340f4da --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1883400 @@ -0,0 +1,60 @@ +semantic: 0.834 +graphic: 0.820 +device: 0.788 +other: 0.746 +mistranslation: 0.709 +network: 0.685 +socket: 0.684 +vnc: 0.675 +assembly: 0.575 +instruction: 0.534 +KVM: 0.533 +boot: 0.498 + +Windows 10 extremely slow and unresponsive + +Hi, + +Fedora 32, x64 +qemu-5.0.0-2.fc32.x86_64 + +https://www.microsoft.com/en-us/software-download/windows10ISO +Win10_2004_English_x64.iso + +Windows 10 is excruciatingly slow since upgrading to 5.0.0-2.fc32. Disabling your repo and downgrading to 2:4.2.0-7.fc32 and corrects the issue (the package in the Fedora repo). + +You can duplicate this off of the Windows 10 ISO (see above) and do not even have to install Windows 10 itself. + +Please fix, + +Many thanks, +-T + +On Sun, Jun 14, 2020 at 01:30:07AM -0000, Toddandmargo-n wrote: +> Public bug reported: +> +> Hi, +> +> Fedora 32, x64 +> qemu-5.0.0-2.fc32.x86_64 +> +> https://www.microsoft.com/en-us/software-download/windows10ISO +> Win10_2004_English_x64.iso +> +> Windows 10 is excruciatingly slow since upgrading to 5.0.0-2.fc32. +> Disabling your repo and downgrading to 2:4.2.0-7.fc32 and corrects the +> issue (the package in the Fedora repo). +> +> You can duplicate this off of the Windows 10 ISO (see above) and do not +> even have to install Windows 10 itself. + +Could this be a duplicate of +https://bugs.launchpad.net/qemu/+bug/1877716? + +Stefan + + +1877716 sounds exactly like what I experienced. 
+ +ok, closing this as a duplicate + diff --git a/results/classifier/zero-shot/105/semantic/1884095 b/results/classifier/zero-shot/105/semantic/1884095 new file mode 100644 index 000000000..45d9ff755 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1884095 @@ -0,0 +1,70 @@ +semantic: 0.923 +graphic: 0.906 +assembly: 0.902 +device: 0.897 +instruction: 0.889 +other: 0.889 +mistranslation: 0.886 +vnc: 0.855 +boot: 0.815 +KVM: 0.811 +socket: 0.780 +network: 0.694 + +QEMU not sufficiently focused on qEMUlation, with resulting holes in TCG emulation coverage + +It seems that QEMU has stopped emphasizing the EMU part of the name, and is too much focused on virtualization. + +My interest is at running legacy operating systems, and as such, they must run on foreign CPU platforms. m68 on intel, intel on ARM, etc. +Time doesn't stand still, and reliance on KVM and similar x86-on-x86 tricks, which allow the delegation of certain CPU features to the host CPU is going to not work going forward. + +If the rumored transition of Apple to ARM is going to take place, people will want to e.g. emulate for testing or legacy purposes a variety of operating systems, incl. NeXTSTEP, Windows, earlier versions of MacOS on ARM Macs. + +Testing that scenario, i.e. 
macOS on an ARM board with the lowest possible CPU capable of running modern macOS, results in these problems (and of course utter failure achieving the goal): + +qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.fma [bit 12] +qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.avx [bit 28] +qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5] +qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8] +qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.0DH:EAX.xsavec [bit 1] + +And this is emulating a lowly Penryn CPU with the required CPU flags for macOS: +-cpu Penryn,vendor=GenuineIntel,+sse3,+sse4.2,+aes,+xsave,+avx,+xsaveopt,+xsavec,+xgetbv1,+avx2,+bmi2,+smep,+bmi1,+fma,+movbe,+invtsc + +Attempting to emulate a more feature laden intel CPU results in even more issues. + +I would propose that no CPU should be considered supported unless it can be fully handled by TCG on a non-native host. KVM, native-on-native etc. are nice to have, but peripheral to qEMUlation when it boils down to it. At the very least, there should be a CLEAR distinction which CPUs require KVM to be used, and which can be fully emulated. It should not require wasting an afternoon to figure out that an emulation attempt is futile because TCG lacks essential functionality. + +You made this point already in comments in LP:1818075 (and got some responses there). This isn't a bug report, it's just a suggestion about what the project ought to prioritize. If you'd like to have that kind of discussion you can probably do better just starting a qemu-devel thread. + + +Oh, and cut-and-pasting the same long comment into multiple bug reports is not a good idea, so please don't do that. + + +The comments with the other reports were just in support of getting them fixed, and providing a reason as to why that matters. 
Someone looking at those reports may not read this one, and as the issues are symptoms of the same larger issue, this report was filed as an overarching report, as AVX is just one aspect. Depending on the CPU model picked, an entire slew of error messages are generated. + +Fact is, an emulator that claims it emulates a CPU has a bug, if that CPU cannot be properly emulated. Hence this report. + +For the emulator not to have to be considered buggy, +EITHER +the CPU type has to be delisted as supported +OR +the missing instructions must be implemented. + +But it’s not proper to say QEMU can emulate an x86_64 Penryn system, when trying to do so fails miserably because of instructions unimplemented in TCG. + +At the very least the documentation and online help would have to distinguish between KVM-only CPU types and TCG CPU types. + +Downloading the QEMU 5 sources and compiling them on an ARM64 platform results in + +qemu-system-x86_64 -cpu help + +listing all sorts of CPUs as „available“ even though these have significant gaps in the covered instruction set. If that’s not a bug, I don’t know what is. + +How you go about fixing it, is a different matter. You could remove the CPUs, mark them as incompletely implemented, or add support for the missing features. +Maybe it might even be possible to interest intel to contribute code from their SDE project to TCG + +BTW: just because I bracket a report with why I think a matter is worth fixing, shouldn’t make it „invalid“. + +The instructions aren’t implemented, yet the CPUs are listed as available, which is a bug in my book, as functionality is advertised that is unavailable. 
+ diff --git a/results/classifier/zero-shot/105/semantic/1888964 b/results/classifier/zero-shot/105/semantic/1888964 new file mode 100644 index 000000000..c06e83ad8 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1888964 @@ -0,0 +1,85 @@ +semantic: 0.875 +graphic: 0.867 +other: 0.846 +mistranslation: 0.778 +instruction: 0.739 +assembly: 0.727 +boot: 0.724 +device: 0.640 +network: 0.486 +socket: 0.457 +KVM: 0.380 +vnc: 0.341 + +Segfault using GTK display with dmabuf (iGVT-g) on Wayland + +When using... + a) Intel virtualized graphics (iGVT-g) with dmabuf output + b) QEMU's GTK display with GL output enabled (-display gtk,gl=on) + c) A Wayland compositor (Sway in my case) +a segfault occurs at some point on boot (I guess as soon as the guest starts using the virtual graphics card?) + +The origin is the function dpy_gl_scanout_dmabuf in ui/console.c, where it calls + con->gl->ops->dpy_gl_scanout_dmabuf(con->gl, dmabuf); +However, the ops field (struct DisplayChangeListenerOps) does not have dpy_gl_scanout_dmabuf set because it is set to dcl_gl_area_ops which does not have dpy_gl_scanout_dmabuf set. +Only dcl_egl_ops has dpy_gl_scanout_dmabuf set. +Currently, the GTK display uses EGL on X11 displays, but GtkGLArea on Wayland. This can be observed in early_gtk_display_init() in ui/gtk.c, where it says (simplified code): + +if (opts->has_gl && opts->gl != DISPLAYGL_MODE_OFF) { + if (GDK_IS_WAYLAND_DISPLAY(gdk_display_get_default())) { + gtk_use_gl_area = true; + gtk_gl_area_init(); + } else { + DisplayGLMode mode = opts->has_gl ? opts->gl : DISPLAYGL_MODE_ON; + gtk_egl_init(mode); + } +} + +To reproduce the findings above, add this assertion to dpy_gl_scanout_dmabuf: + assert(con->gl->ops->dpy_gl_scanout_dmabuf); +This will make the segfault turn into an assertion failure. + +A workaround is to force QEMU to use GDK's X11 backend (using GDK_BACKEND=x11). 
+ +Note: This might be a duplicate of 1775011, however the information provided in that bug report is not sufficient to make the assertion. + +QEMU version: b0ce3f021e0157e9a5ab836cb162c48caac132e1 (from Git master branch) +OS: Arch Linux, Kernel Version 5.17.0-1 + +Relevant flags of the QEMU invocation: +qemu-system-x86_64 \ + -vga none \ + -device vfio-pci-nohotplug,sysfsdev="$GVT_DEV",romfile="${ROMFILE}",display=on,x-igd-opregion=on,ramfb=on \ + -display gtk,gl=on + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" within the next 60 days (otherwise it will get +closed as "Expired"). We will then eventually migrate the ticket auto- +matically to the new system (but you won't be the reporter of the bug +in the new system and thus won't get notified on changes anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/1896 b/results/classifier/zero-shot/105/semantic/1896 new file mode 100644 index 000000000..28b3286b9 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1896 @@ -0,0 +1,67 @@ +semantic: 0.893 +instruction: 0.855 +other: 0.855 +mistranslation: 0.826 +graphic: 0.807 +device: 0.789 +network: 0.783 +socket: 0.749 +vnc: 0.539 +assembly: 0.459 +boot: 0.380 +KVM: 0.363 + +Use `qemu_exit()` function instead of `exit()` +Additional information: +I just saw the similar refactoring for the GDB part of QEMU and thought it might be useful in more general case too: https://lore.kernel.org/qemu-devel/20230907112640.292104-1-chigot@adacore.com/T/#m540552946cfa960b34c4d76d2302324f5de8627f + +``` +$ rg "exit\(0" -t c -l +gdbstub/gdbstub.c +qemu-edid.c +subprojects/libvhost-user/libvhost-user.c +semihosting/arm-compat-semi.c +softmmu/async-teardown.c +softmmu/device_tree.c +softmmu/vl.c +softmmu/runstate.c +os-posix.c +dtc/util.c +dtc/dtc.c +dtc/tests/dumptrees.c +qemu-keymap.c +qemu-io.c +contrib/ivshmem-server/main.c +contrib/rdmacm-mux/main.c +tests/qtest/vhost-user-blk-test.c +tests/qtest/fuzz/fuzz.c +tests/qtest/fuzz/generic_fuzz.c +tests/unit/test-seccomp.c +tests/unit/test-rcu-list.c +tests/unit/rcutorture.c +tests/bench/qht-bench.c +tests/bench/atomic64-bench.c +tests/bench/atomic_add-bench.c +tests/unit/test-iov.c +tests/tcg/multiarch/linux/linux-test.c +tests/tcg/aarch64/mte-3.c +tests/tcg/aarch64/pauth-2.c +tests/tcg/aarch64/mte-5.c +tests/tcg/aarch64/mte-6.c +tests/tcg/aarch64/mte-2.c +tests/tcg/cris/libc/check_glibc_kernelversion.c +tests/tcg/cris/libc/check_lz.c +tests/tcg/s390x/signals-s390x.c +tests/tcg/i386/hello-i386.c +tests/tcg/cris/bare/sys.c +tests/tcg/ppc64/mtfsf.c +qemu-nbd.c +net/net.c +hw/nvram/eeprom93xx.c +hw/arm/allwinner-r40.c +hw/rdma/rdma_backend.c +hw/watchdog/watchdog.c +trace/control.c +hw/pci/pci.c +hw/misc/sifive_test.c +``` diff --git 
a/results/classifier/zero-shot/105/semantic/1898215 b/results/classifier/zero-shot/105/semantic/1898215 new file mode 100644 index 000000000..2e1baeb1a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1898215 @@ -0,0 +1,98 @@ +semantic: 0.924 +graphic: 0.915 +other: 0.895 +assembly: 0.893 +instruction: 0.880 +device: 0.849 +mistranslation: 0.837 +boot: 0.805 +network: 0.804 +socket: 0.783 +vnc: 0.760 +KVM: 0.747 + +[git][archlinux]Build process is busted in spice-display.c + +Linux distribution: Archlinux. Crash log added is based on a build from scratch. + +Gcc version: 10.2.0 + +Configure options used: + +configure \ + --prefix=/usr \ + --sysconfdir=/etc \ + --localstatedir=/var \ + --libexecdir=/usr/lib/qemu \ + --extra-ldflags="$LDFLAGS" \ + --smbd=/usr/bin/smbd \ + --enable-modules \ + --enable-sdl \ + --disable-werror \ + --enable-slirp=system \ + --enable-xfsctl \ + --audio-drv-list="pa alsa sdl" + +Crash log: + +../ui/spice-display.c: In function 'interface_client_monitors_config': +../ui/spice-display.c:682:25: error: 'VD_AGENT_CONFIG_MONITORS_FLAG_PHYSICAL_SIZE' undeclared (first use in this function); did you mean 'VD_AGENT_CONFIG_MONITORS_FLAG_USE_POS'? 
+ 682 | if (mc->flags & VD_AGENT_CONFIG_MONITORS_FLAG_PHYSICAL_SIZE) { + | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + | VD_AGENT_CONFIG_MONITORS_FLAG_USE_POS +../ui/spice-display.c:682:25: note: each undeclared identifier is reported only once for each function it appears in +../ui/spice-display.c:683:13: error: unknown type name 'VDAgentMonitorMM' + 683 | VDAgentMonitorMM *mm = (void *)&mc->monitors[mc->num_of_monitors]; + | ^~~~~~~~~~~~~~~~ +../ui/spice-display.c:684:37: error: request for member 'width' in something not a structure or union + 684 | info.width_mm = mm[head].width; + | ^ +../ui/spice-display.c:685:38: error: request for member 'height' in something not a structure or union + 685 | info.height_mm = mm[head].height; + | ^ +make: *** [Makefile.ninja:2031: libcommon.fa.p/ui_spice-display.c.o] Error 1 +make: *** Waiting for unfinished jobs.... + +Full build log with make V=1. + +This is a bug in the spice-server meson build system: +https://gitlab.freedesktop.org/spice/spice/-/commit/37fd91a51f52cdc1b55d3ce41e6ce6db348b986c + +Most likely they will end up bumping the version to 0.15, so we may want to update the condition in qemu. + +Already reported to Arch: + +https://bugs.archlinux.org/task/68061 + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). 
Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +Fix released + diff --git a/results/classifier/zero-shot/105/semantic/1900 b/results/classifier/zero-shot/105/semantic/1900 new file mode 100644 index 000000000..01ef5700c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1900 @@ -0,0 +1,14 @@ +semantic: 0.717 +device: 0.702 +vnc: 0.678 +network: 0.671 +mistranslation: 0.663 +instruction: 0.625 +socket: 0.579 +graphic: 0.518 +KVM: 0.358 +assembly: 0.358 +boot: 0.314 +other: 0.131 + +8.1.0-r1: segfault at get_zones_wp() at ../block/file-posix.c:1337 diff --git a/results/classifier/zero-shot/105/semantic/1905562 b/results/classifier/zero-shot/105/semantic/1905562 new file mode 100644 index 000000000..767c6168f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1905562 @@ -0,0 +1,80 @@ +semantic: 0.939 +graphic: 0.918 +other: 0.907 +mistranslation: 0.905 +instruction: 0.859 +socket: 0.816 +assembly: 0.806 +device: 0.802 +vnc: 0.784 +KVM: 0.768 +network: 0.760 +boot: 0.673 + +Guest seems suspended after host freed memory for it using oom-killer + +Host: qemu 5.1.0, linux 5.5.13 +Guest: Windows 7 64-bit + +This guest ran a memory intensive process, and triggered oom-killer on host. Luckily, it killed chromium. My understanding is this should mean qemu should have continued running unharmed. But, the spice connection shows the host system clock is stuck at the exact time oom-killer was triggered. 
The host is completely unresponsive. + +I can telnet to the qemu monitor. "info status" shows "running". But, multiple times running "info registers -a" and saving the output to text files shows the registers are 100% unchanged, so it's not really running. + +On the host, top shows around 4% CPU usage by qemu. strace shows about 1,000 times a second, these 6 lines repeat: + +0.000698 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c10) = 0 <0.000010> +0.000034 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c60) = 0 <0.000009> +0.000031 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c20) = 0 <0.000007> +0.000028 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c70) = 0 <0.000007> +0.000030 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=11, events =POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN}, {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, {fd=41, events=POLLI N}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN}, {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16, {tv_sec=0, tv_nsec=0}, NULL, 8) = 0 (Timeout) <0.000009> +0.000043 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=11, events =POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN}, {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, {fd=41, events=POLLI N}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN}, {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16, {tv_sec=0, tv_nsec=769662}, NULL, 8) = 0 (Tim eout) <0.000788> + +In the monitor, "info irq" shows IRQ 0 is increasing about 1,000 times a second. IRQ 0 seems to be for the system clock, and 1,000 times a second seems to be the frequency a windows 7 guest might have the clock at. + +Those fd's are for: (9) [eventfd]; [signalfd], type=STREAM, 4 x the spice socket file, and "TCP localhost:ftnmtp->localhost:36566 (ESTABLISHED)". 
+ +Because the guest's registers aren't changing, it seems to me like monitor thinks the VM is running, but it's actually effectively in a paused state. I think all the strace activity shown above must be generated by the host. Perhaps it's repeatedly trying to contact the guest to inject a new clock, and communicate with it on the various eventfd's, spice socket, etc. So, I'm thinking the strace doesn't give any information about the real reason why the VM is acting as if it's paused. + +I've checked "info block", and there's nothing showing that a device is paused, or that there's any issues with them. (Can't remember what term can be there, but a paused/blocked/etc block device I think caused a VM to act like this for me in the past.) + + +Is there something I can provide to help fix the bug here? + +Is there something I can do, to try to get the VM running again? (I sadly have unsaved work in it.) + + + +Am I correct to expect the VM to continue successfully, after oom-killer successfully freed up memory? This journactl does show a calltrace which includes "vmx_vmexit", and I'm not sure what that function is for but looks a little worrisome. + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. 
+ +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1905979 b/results/classifier/zero-shot/105/semantic/1905979 new file mode 100644 index 000000000..015fa7070 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1905979 @@ -0,0 +1,68 @@ +semantic: 0.910 +other: 0.869 +mistranslation: 0.868 +device: 0.867 +instruction: 0.864 +graphic: 0.854 +assembly: 0.849 +network: 0.834 +vnc: 0.795 +socket: 0.778 +KVM: 0.775 +boot: 0.765 + +Check if F_OFD_SETLK is supported may give wrong result + +In util/osdep.c there is a function qemu_probe_lock_ops() to check if file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file description locks (non-POSIX)") are supported. + +This test is done by trying a lock operation on the file /dev/null. + +This test can get a wrong result. + +The result is (probably) if the operating system *in general* supports these locks. However, it does not guarantee that the file system where the lock is really wanted (for instance, in caller raw_check_lock_bytes() in block/file-posix.c) does support these locks. + +(In theory it could even be that /dev/null, being a device special file, does not support the lock type while a plain file would.) + +This is in particular relevant for disk images which are stored on a shared file system (my particular use case is the Quobyte file system, which appears not to support these locks). 
+ +The code as mentioned above is present in the master branch (I checked commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640) + +This is rather serious, since it causes VMs to crash: + +Unexpected error in raw_check_lock_bytes() at /build/qemu-PKI6mj/qemu-4.2/block/file-posix.c:796: +Failed to get "write" lock +2020-11-23 11:32:27.810+0000: shutting down, reason=crashed + +when openstack attempts to create a snapshot. + +In this thread, it is pointed out that support for OFD is provided by the generic VFS layer in the kernel, so there should never be a situation where one filesystem supports OFD and another does not support OFD: + + https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05264.html + +Can you say what filesystem you are using that exhibits the lack of OFD support, and what kernel version + +Interesting. Thanks for the link. + +The file system we are using is the Quobyte file system (2.24.1) (https://www.quobyte.com/), which works via FUSE. +We've had problems with OFD locks with this file system in the past, so my first thought, seeing the error in comment #1, was that those would be to blame. + +But if the OFD locks are not really handled by the file system, I'm not sure how that explains the OFD lock issues we had in the past. I don't suppose this changed in the last year or so. Just now I made a little test program (basically copying qemu_lock_fd_test() and qemu_probe_lock_ops() from qemu) to double-check, and indeed right now it seems that the OFD locks *are* working on the Quobyte file system. Or at least qemu_lock_fd_test() doesn't return an error. + +So now I'm back to square one on diagnosing the observed error. It occurred in an installation of Openstack Ussuri installed on Ubuntu 18.04 Bionic using the Ubuntu Cloud Archive for packaging. The Cloud Archive has backports of the latest Qemu to earlier Ubuntu versions. 
The exact qemu version was http://ubuntu-cloud.archive.canonical.com/ubuntu/pool/main/q/qemu/qemu_4.2-3ubuntu6.7~cloud0_amd64.deb . + +Annoyingly I have not been able to locate the git repo from which the Ubuntu Cloud Archive creates its packages (containing the patches and build changes for backports); all I can find is version 4.2-3ubuntu6.7 (without ~cloud0) which is for Ubuntu 20.04 Focal. + +For now we're working around it by downgrading Qemu to the normal Bionic version (2.11+dfsg-1ubuntu7.33) + +You wouldn't happen to know where the Ubuntu Cloud Archive stores exact files it creates its packages from? (I have already asked on stackoverflow without success so far: https://stackoverflow.com/questions/65146846/from-which-git-repos-does-the-ubuntu-cloud-archive-compile-its-packages) + + + +Look in the same directory as that .deb link above - the files ending in orig.tar.gz (upstream source) and files ending in debian.tar.xz (downstream modifications) + +The kernel version is Linux hostname 4.15.0-124-generic #127-Ubuntu SMP Fri Nov 6 10:54:43 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux + +That is indeed the source and patches, but I wanted to follow their git repo for easier maintenance. Surely they must have one. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1906156 b/results/classifier/zero-shot/105/semantic/1906156 new file mode 100644 index 000000000..20e1d890a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1906156 @@ -0,0 +1,68 @@ +semantic: 0.830 +device: 0.730 +instruction: 0.693 +boot: 0.689 +other: 0.671 +graphic: 0.635 +mistranslation: 0.615 +socket: 0.607 +assembly: 0.533 +vnc: 0.531 +KVM: 0.521 +network: 0.423 + +Host OS Reboot Required, for Guest kext to Load (Fully) + +Hi, + +Finding this one a bit odd, but I am loading a driver (kext) in a macOS guest ... and it works, on the first VM (domain) startup after a full / clean host OS boot (or reboot). 
However, if I even reboot the guest OS, then the driver load fails => can be "corrected" by a full host OS reboot (which seems very extreme). + +Is this a known issue, and/or is there a workaround? + +FYI, running, +QEMU emulator version 5.0.0 (Debian 1:5.0-5ubuntu9.1) +Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers + +This is for a macOS guest, on a Linux host. + +Thanks! + +Hi! Seems like you're using the QEMU from your distro, so should this be a bug report against Ubuntu's QEMU instead? Otherwise, can you please try again with the latest upstream version of QEMU (currently an RC of v5.2)? You certainly also need to provide more information, e.g. what kind of error message do you see, how often did you try (maybe it's just an intermittent problem and it sometimes also works without rebooting the host), etc. + +Sure, will do (upstream version). Is there a preferred way to do it? Meaning ... build locally, or install from some repository? + +Thanks! + +The QEMU project only provides the source tarballs, so building locally is certainly the preferred way to test. + +That makes sense, no issue at all. So I cloned from git (v5.2.0-rc3), built, installed. All good so far :-). But then I tried to modify the "emulator" in virt-manager, point to this build => I get the error, + +Error changing VM configuration: internal error: Failed to start QEMU binary /usr/local/bin/qemu-system-x86_64 for probing: libvirt: error : cannot execute binary /usr/local/bin/qemu-system-x86_64: Permission denied + +Thoughts? I have run into this before (without finding a fix sadly) - thinking it's apparmor related somehow? + +Thanks! + +Sorry for the delay on updating this - but pulling my hair out (and I'm short enough of that already ... LOL). I can't get Ubuntu to let me run the custom qemu executable. Really is looking like apparmor. Fighting with that, but struggling to have it let me run it :-(. + +Thanks. + +My apologies, but I'm somewhat stuck here :-(. 
Trying to run the latest (upstream) version of QEMU, but no luck getting it to execute. I even tried setting security_driver = "none", as captured here, +https://gitlab.com/apparmor/apparmor/-/wikis/Libvirt + +But no luck. Open to any suggestions. + +Thanks! + +OK, found my issue! :-). Still a bit odd, but virt-manager complains about the custom QEMU executable => but virsh still works. So I did get the VM running, with, +QEMU emulator version 5.1.93 (v5.2.0-rc3) +Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers + +But it still performed the same. I also checked the xml file (VM definition), and made sure to change the machine to the most current version (pc-q35-5.2), but also no improvement. + +Other things to try? + +Thanks! + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1907952 b/results/classifier/zero-shot/105/semantic/1907952 new file mode 100644 index 000000000..a25d5bf6b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1907952 @@ -0,0 +1,195 @@ +semantic: 0.877 +instruction: 0.867 +graphic: 0.852 +device: 0.847 +other: 0.846 +socket: 0.831 +boot: 0.831 +assembly: 0.828 +network: 0.828 +vnc: 0.754 +mistranslation: 0.750 +KVM: 0.664 + +qemu-system-aarch64: with "-display gtk" arrow keys are received as just ^[ on ttyAMA0 + +I originally observed this on Debian packaged qemu 5.2 at +https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976808 + +Today I checked out the latest git source at +Sun, 13 Dec 2020 19:21:09 +0900 +and configured the source as follows: + +./configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib/qemu \ + --localstatedir=/var --disable-blobs --disable-strip --localstatedir=/var \ + --libdir=/usr/lib/aarch64-linux-gnu \ + --firmwarepath=/usr/share/qemu:/usr/share/seabios:/usr/lib/ipxe/qemu \ + --target-list=aarch64-softmmu,arm-softmmu --disable-werror \ + --disable-user --enable-gtk --enable-vnc +then executed "make" on 
an ARM64 (not an x86_64) host, +running the latest Debian testing. + +I did the following commands on an arm64 host with the Debian Installer Alpha 3 at +https://cdimage.debian.org/cdimage/bullseye_di_alpha3/arm64/iso-cd/debian-bullseye-DI-alpha3-arm64-netinst.iso + +#!/bin/sh + +ARCH=arm64 +IMAGE=`pwd`/qemu-disk-${ARCH}.qcow2 +CDROM=`pwd`/debian-bullseye-DI-alpha3-${ARCH}-netinst.iso +rm -f $IMAGE +qemu-img create -f qcow2 -o compat=1.1 -o lazy_refcounts=on -o preallocation=off $IMAGE 20G +cd /var/tmp +cp /usr/share/AAVMF/AAVMF_VARS.fd . +$HOME/qemu-git/qemu/build/qemu-system-aarch64 \ + -display gtk -enable-kvm -machine virt -cpu host -m 3072 -smp 2\ + -net nic,model=virtio -net user -object rng-random,filename=/dev/urandom,id=rng0 \ + -device virtio-rng-pci,rng=rng0,id=rng-device0 \ + -drive if=virtio,file=${IMAGE},index=0,format=qcow2,discard=unmap,detect-zeroes=unmap,media=disk \ + -drive if=virtio,file=${CDROM},index=1,format=raw,readonly=on,media=cdrom \ + -drive if=pflash,format=raw,unit=0,file=/usr/share/AAVMF/AAVMF_CODE.fd,readonly=on \ + -drive if=pflash,format=raw,unit=1,file=`pwd`/AAVMF_VARS.fd + +Then 4 arrow keys on the physical keyboard are received as just "^[". + +This symptom was not observed on qemu-system-x86_64. +This symptom was not observed with virt-manager on my arm64 host, neither. +This seems unique to -display gtk of qemu-system-aarch64. + +An easier way to reproduce the symptom was provided by Alper Nebi Yasak at +https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976808#88 + +qemu-system-aarch64 \ + -display gtk -enable-kvm -machine virt -cpu host -m 1G -smp 2 \ + -kernel /boot/vmlinuz -initrd /boot/initrd.img \ + -append "break console=ttyAMA0" + +Then, run cat on the initramfs shell and see arrow keys result in ^[ . +For x86_64, it's console=ttyS0 and ^[[A etc. + + +This should be fixed already in head-of-git, by commit 8eb13bbbac08aa077e ; this will be in QEMU 6.0. 
+ + +This bug was fixed in the package qemu - 1:6.0+dfsg-1~ubuntu3 + +--------------- +qemu (1:6.0+dfsg-1~ubuntu3) impish; urgency=medium + + * d/p/u/lp-1935617-target-ppc-Fix-load-endianness-for-lxvwsx-lxvdsx.patch: + fix TCG emulation for ppc64 (LP: #1935617) + +qemu (1:6.0+dfsg-1~ubuntu2) impish; urgency=medium + + * d/control: remove fuse2 trial-build (LP 1934510) + +qemu (1:6.0+dfsg-1~ubuntu1) impish; urgency=medium + + * Merge with Debian experimental, Among many other things this fixes LP Bugs: + (LP: #1907952) broken arrow keys in -display gtk on aarch64 + - qemu-kvm to systemd unit + - d/qemu-kvm-init: script for QEMU KVM preparation modules, ksm, + hugepages and architecture specifics + - d/qemu-system-common.qemu-kvm.service: systemd unit to call + qemu-kvm-init + - d/qemu-system-common.install: install helper script + - d/qemu-system-common.qemu-kvm.default: defaults for + /etc/default/qemu-kvm + - d/rules: call dh_installinit and dh_installsystemd for qemu-kvm + - Distribution specific machine type + (LP: 1304107 1621042 1776189 1761372 1761372 1776189) + - d/p/ubuntu/define-ubuntu-machine-types.patch: define distro machine + types containing release versioned machine attributes + - d/qemu-system-x86.NEWS Info on fixed machine type defintions + for host-phys-bits=true + - Add an info about -hpb machine type in debian/qemu-system-x86.NEWS + - ubuntu-q35 alias added to auto-select the most recent q35 ubuntu type + - Enable nesting by default + - d/p/ubuntu/enable-svm-by-default.patch: Enable nested svm by default + in qemu64 on amd + [ No more strictly needed, but required for backward compatibility ] + - improved dependencies + - Make qemu-system-common depend on qemu-block-extra + - Make qemu-utils depend on qemu-block-extra + - Let qemu-utils recommend sharutils + - tolerate ipxe size change on migrations to >=18.04 (LP: 1713490) + - d/p/ubuntu/pre-bionic-256k-ipxe-efi-roms.patch: old machine types + reference 256k path + - d/control-in: depend on 
ipxe-qemu-256k-compat-efi-roms to be able to + handle incoming migrations from former releases. + - d/control-in: Disable capstone disassembler library support (universe) + - d/qemu-system-x86.README.Debian: add info about updated nesting changes + - d/control*, d/rules: disable xen by default, but provide universe + package qemu-system-x86-xen as alternative + [includes compat links changes of 5.0-5ubuntu4] + - Fix upgrade module handling (LP 1905377) + --enable-module-upgrades for qemu-xen which doesn't exist in Debian + * Dropped Changes [in 6.0]: + - d/p/ubuntu/lp-1907789-build-no-pie-is-no-functional-liker-flag.patch: fix + ld usage of -no-pie (LP 1907789) + - d/p/u/lp-1916230-hw-s390x-fix-build-for-virtio-9p-ccw.patch: fix + virtio-9p-ccw being missing (LP 1916230) + - d/p/u/lp-1916705-disas-Fix-build-with-glib2.0-2.67.3.patch: Fix FTFBS due + to glib2.0 >=2.67.3 (LP 1916705) + - d/p/u/lp-1921754*: add EPYC-Rome-v2 as v1 missed IBRS and thereby fails + on some HW/Guest combinations e.g. 
Windows 10 on Threadripper chips + (LP 1921754) + - d/p/u/lp-1921880*: add EPYC-Milan features and named cpu type support + (LP 1921880) + - d/p/u/lp-1922010-linux-user-s390x-Use-the-guest-pointer-for-the-sigre*: + fix go in qemu-s390x-static (LP 1922010) + * Dropped Changes [in Debian]: + - Allow qemu to load old modules post upgrade (LP 1847361) + - Drop d/qemu-block-extra.*.in, d/qemu-system-gui.*.in + - d/rules: Drop generating package version into maintainer scripts + * Dropped Changes [No more needed >21.04]: + - d/qemu-system-gui.prerm: add no-op prerm to overcome upgrade issues on + the bad old prerm (LP 1906245 1905377) + * Added Changes + - Disable fuse export (universe dependency) + - d/p/ubuntu/enable-svm-by-default.patch: update to match v6.0 + - d/p/ubuntu/define-ubuntu-machine-types.patch: add ubuntu machine types + for v6.0 + - d/p/ubuntu/lp-1929926-*: avoid segfaults by uretprobes (LP: #1929926) + - Ease the use of module retention on upgrades (LP: #1913421) + - d/run-qemu.mount, d/rules: provide run-qemu.mount in qemu-block-extra + - d/rules: only save modules if /run/qemu isn't noexec + - d/rules: clear all (current and former) modules on purge + - debian/qemu-block-extra.postinst: enable mount unit on install/upgrade + - d/control: qemu 6.0 broke libvirt <7.2 add a breaks to avoid partial + upgrade issues (LP: #1932264) + - Enable SDL as secondary UI backend (LP: #1256185) + - d/control: add build dependency libsdl2-dev + - d/control: enable sdl graphics on build + - d/qemu-system-gui.install: add ui-sdl.so + - d/control: add runtime dependency to libgl1 + - d/rules: qemu-system-x86-xen builds modules as well now (follows the + other packages) + +qemu (1:6.0+dfsg-1~exp0) experimental; urgency=medium + + * new upstream release + * remove obsolete patches, refresh use-fixed-data-path.patch + * use libncurses-dev, not old libncursesw5-dev + * enable fuse export (and build-depend on libfuse3-dev) + * install (new) manpages for qemu-storage-daemon + 
* enable new hexagon qemu-user target + * two patches to fix 3 new spelling mistakes + * remove now-unused shared-library-lacks-prerequisites lintian-overrides + for qemu-user-static + +qemu (1:5.2+dfsg-10) unstable; urgency=medium + + * 5 sdhci fixes from upstream: + dont-transfer-any-data-when-command-time-out.patch + dont-write-to-SDHC_SYSAD-register-when-transfer-is-in-progress.patch + correctly-set-the-controller-status-for-ADMA.patch + limit-block-size-only-when-SDHC_BLKSIZE-register-is-writable.patch + reset-the-data-pointer-of-s-fifo_buffer-when-a-different-block-size...patch + (Closes: #986795, #970937, CVE-2021-3409, CVE-2020-17380, CVE-2020-25085) + * mptsas-remove-unused-MPTSASState.pending-CVE-2021-3392.patch + fix possible use-after-free in mptsas_free_request + (Cloese: #984449, CVE-2021-3392) + + -- Christian Ehrhardt <email address hidden> Tue, 13 Jul 2021 09:34:55 +0200 + diff --git a/results/classifier/zero-shot/105/semantic/1907969 b/results/classifier/zero-shot/105/semantic/1907969 new file mode 100644 index 000000000..ece78fd23 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1907969 @@ -0,0 +1,164 @@ +semantic: 0.912 +graphic: 0.905 +other: 0.876 +instruction: 0.861 +mistranslation: 0.850 +device: 0.841 +network: 0.838 +assembly: 0.831 +KVM: 0.813 +vnc: 0.806 +boot: 0.801 +socket: 0.794 + +linux-user/i386: Segfault when mixing threads and signals + +Given the following C program, qemu-i386 will surely and certainly segfault when executing it. +The problem is only noticeable if the program is statically linked to musl's libc and, as written +in the title, it only manifests when targeting i386. + +Removing the pthread calls or the second raise() makes it not segfault. + +The crash is in some part of the TCG-generated code, right when it tries to perform a +%gs-relative access. 
+ +If you want a quick way of cross-compiling this binary: + +* Download a copy of the Zig compiler from https://ziglang.org/download/ +* Compile it with + `zig cc -target i386-linux-musl <C-FILE> -o <OUT>` + +``` +#include <pthread.h> +#include <signal.h> +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <unistd.h> +#include <asm/prctl.h> +#include <sys/syscall.h> + +void sig_func(int sig) +{ + write(1, "hi!\n", strlen("hi!\n")); +} + +void func(void *p) { } + +typedef void *(*F)(void *); + +int main() +{ + pthread_t tid; + + struct sigaction action; + action.sa_flags = 0; + action.sa_handler = sig_func; + + if (sigaction(SIGUSR1, &action, NULL) == -1) { + return 1; + } + + // This works. + raise(SIGUSR1); + + pthread_create(&tid, NULL, (F)func, NULL); + pthread_join(tid, NULL); + + // This makes qemu segfault. + raise(SIGUSR1); +} +``` + + + +I finally understand where the problem is. + +Qemu's user-mode emulation maps guest threads to native ones by spawning a new native one +and running a forked copy of the CPUX86State in parallel with the main thread. + +This works fine for pretty much every architecture but i386 where the GDT/LDT comes into +play: the two descriptor tables are shared among all the threads, mimicking the real hw +behaviour, but since no host task-switching is being performed the TLS entry in the GDT +become stale. + +Raising a signal makes Qemu reload the GS segment from the GDT, that's why removing that +line makes the problem disappear. + +The problem is also confined to musl libc because of an interesting implementation choice. +Once a thread dies Glibc adds the now unused stack to a queue in order to reuse it later, +while musl frees it right away when it's not needed anymore and, as a consequence, makes +Qemu segfault. + +As luck has it, after spending too much time debugging this, I found somebody else already +stumbled across this problem and wrote a patch. 
+ +https://<email address hidden>/mbox + +Too bad the patch flew under the radar... + +Le 16/12/2020 à 09:59, The Lemon Man a écrit : +> I finally understand where the problem is. +> +> Qemu's user-mode emulation maps guest threads to native ones by spawning a new native one +> and running a forked copy of the CPUX86State in parallel with the main thread. +> +> This works fine for pretty much every architecture but i386 where the GDT/LDT comes into +> play: the two descriptor tables are shared among all the threads, mimicking the real hw +> behaviour, but since no host task-switching is being performed the TLS entry in the GDT +> become stale. +> +> Raising a signal makes Qemu reload the GS segment from the GDT, that's why removing that +> line makes the problem disappear. +> +> The problem is also confined to musl libc because of an interesting implementation choice. +> Once a thread dies Glibc adds the now unused stack to a queue in order to reuse it later, +> while musl frees it right away when it's not needed anymore and, as a consequence, makes +> Qemu segfault. +> +> As luck has it, after spending too much time debugging this, I found somebody else already +> stumbled across this problem and wrote a patch. +> +> https://<email address hidden>/mbox +> +> Too bad the patch flew under the radar... +> + +Could you add a Reviewed-by and/or a tested by to the patch on the ML? + +Thanks, +Laurent + + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". 
+ +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1908551 b/results/classifier/zero-shot/105/semantic/1908551 new file mode 100644 index 000000000..47bdaa7bb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1908551 @@ -0,0 +1,93 @@ +semantic: 0.876 +graphic: 0.871 +other: 0.842 +assembly: 0.808 +vnc: 0.763 +instruction: 0.730 +device: 0.704 +KVM: 0.671 +mistranslation: 0.667 +socket: 0.636 +network: 0.628 +boot: 0.561 + +aarch64 SVE emulation breaks strnlen and strrchr + +arm optimized-routines have sve string functions with test code. 
+ +the test worked up until recently: with qemu-5.2.0 i see + +$ qemu-aarch64 build/bin/test/strnlen +PASS strnlen +PASS __strnlen_aarch64 +__strnlen_aarch64_sve (0x490fa0, 32) len 32 returned 64, expected 32 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80" +__strnlen_aarch64_sve (0x490fa0, 32) len 33 returned 64, expected 32 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80a" +__strnlen_aarch64_sve (0x490fa0, 33) len 33 returned 64, expected 33 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80a" +__strnlen_aarch64_sve (0x490fa0, 32) len 34 returned 64, expected 32 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80ab" +__strnlen_aarch64_sve (0x490fa0, 33) len 34 returned 64, expected 33 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80ab" +__strnlen_aarch64_sve (0x490fa0, 34) len 34 returned 64, expected 34 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80ab" +__strnlen_aarch64_sve (0x490fa0, 32) len 35 returned 64, expected 32 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80a\x00c" +__strnlen_aarch64_sve (0x490fa0, 33) len 35 returned 64, expected 33 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80ab\x00" +__strnlen_aarch64_sve (0x490fa0, 34) len 35 returned 64, expected 34 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80abc" +__strnlen_aarch64_sve (0x490fa0, 35) len 35 returned 64, expected 35 +input: "abcdefghijklmnopqrstuvwxyz\{|}~\x7f\x80abc" +FAIL __strnlen_aarch64_sve + +however the test passes with + +qemu-aarch64 -cpu max,sve-max-vq=2 + +there should be nothing vector length specific in the code. 
+ +i haven't debugged it further, to reproduce the issue clone +https://github.com/ARM-software/optimized-routines + +and run 'make build/bin/test/strnlen' with a config.mk like + +SUBS = string +ARCH = aarch64 +CROSS_COMPILE = aarch64-none-linux-gnu- +CC = $(CROSS_COMPILE)gcc +CFLAGS = -std=c99 -pipe -O3 +CFLAGS += -march=armv8.2-a+sve +EMULATOR = qemu-aarch64 + +(native compilation works too, and you can run 'make check' to +run all string tests) this will build a static linked executable +into build/bin/test. if you want a smaller test case edit +string/test/strnlen.c + +I don't know why the test worked previously, and I did not +investigate, but as far as I can tell, the test is broken. + +The test is returning a value >= maxlen because it is using +the wrong increment. Fixed thus. + + + +FWIW, as I think on this further, this probably isn't the +ideal fix -- I recall now that INCP is a "reduction" class +instruction and thus its overhead is non-trivial. + +We could instead add an integer min operation at label 9, +which is outside of the main loop. + +Bah. The code at label 9 does not match the comment. +Best fixed thus. + +... but you also mentioned strrchr, and there is a qemu bug there. The REV (predicate) instruction doesn't seem to be doing the right thing -- input 0x1 -> output 0x80000000 which is not correct for the current vector length (64). 
+ +Patch fixing strrchr: +https://<email address hidden>/ + +https://gitlab.com/qemu-project/qemu/-/commit/70acaafef2e053a3 + diff --git a/results/classifier/zero-shot/105/semantic/1912065 b/results/classifier/zero-shot/105/semantic/1912065 new file mode 100644 index 000000000..2b414f3e0 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1912065 @@ -0,0 +1,98 @@ +semantic: 0.889 +assembly: 0.849 +graphic: 0.845 +device: 0.818 +network: 0.816 +instruction: 0.815 +other: 0.805 +socket: 0.774 +vnc: 0.719 +mistranslation: 0.702 +KVM: 0.654 +boot: 0.643 + +Segfaults in tcg/optimize.c:212 after commit 7c79721606be11b5bc556449e5bcbc331ef6867d + +QEMU segfaults to NULL dereference in tcg/optimize.c:212 semi-randomly after commit 7c79721606be11b5bc556449e5bcbc331ef6867d + +Exception Type: EXC_BAD_ACCESS (SIGSEGV) +Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000020 +Exception Note: EXC_CORPSE_NOTIFY + +... + +Thread 4 Crashed: +0 qemu-system-ppc 0x0000000109cd26d2 tcg_opt_gen_mov + 178 (optimize.c:212) +1 qemu-system-ppc 0x0000000109ccf838 tcg_optimize + 5656 +2 qemu-system-ppc 0x0000000109c27600 tcg_gen_code + 64 (tcg.c:4490) +3 qemu-system-ppc 0x0000000109c17b6d tb_gen_code + 493 (translate-all.c:1952) +4 qemu-system-ppc 0x0000000109c16085 tb_find + 41 (cpu-exec.c:454) [inlined] +5 qemu-system-ppc 0x0000000109c16085 cpu_exec + 2117 (cpu-exec.c:810) +6 qemu-system-ppc 0x0000000109c09ac3 tcg_cpus_exec + 35 (tcg-cpus.c:57) +7 qemu-system-ppc 0x0000000109c75edd rr_cpu_thread_fn + 445 (tcg-cpus-rr.c:217) +8 qemu-system-ppc 0x0000000109e41fae qemu_thread_start + 126 (qemu-thread-posix.c:521) +9 libsystem_pthread.dylib 0x00007fff2038e950 _pthread_start + 224 +10 libsystem_pthread.dylib 0x00007fff2038a47b thread_start + 15 + +Here the crash is in tcg/optimize.c line 212: + + mask = si->mask; + +"si" is NULL. The NULL value arises from tcg/optimize.c line 198: + + si = ts_info(src_ts); + +I did not attempt to determine the root cause of this issue, however. 
It clearly is related to the "tcg/optimize" changes in this commit. The previous commit c0dd6654f207810b16a75b673258f5ce2ceffbf0 doesn't crash. + +Thanks for reporting it. Just found it as well and reported on the mailing +list. It's currently being investigated. List thread is here: + +https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg03936.html + + +The problem is that we're now generating many more temporaries +than before, and running out of the statically allocated amount. +Changing a debug assert to a full assert will change the SEGV +into an ABRT. :-) + +diff --git a/tcg/tcg.c b/tcg/tcg.c +index 8f8badb61c..c376afe56a 100644 +--- a/tcg/tcg.c ++++ b/tcg/tcg.c +@@ -1207,7 +1207,7 @@ void tcg_func_start(TCGContext *s) + static inline TCGTemp *tcg_temp_alloc(TCGContext *s) + { + int n = s->nb_temps++; +- tcg_debug_assert(n < TCG_MAX_TEMPS); ++ g_assert(n < TCG_MAX_TEMPS); + return memset(&s->temps[n], 0, sizeof(TCGTemp)); + } + + +The problem can be worked around temporarily by increasing the +number of temporaries: + +diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h +index 504c5e9bb0..8fe32bb03c 100644 +--- a/include/tcg/tcg.h ++++ b/include/tcg/tcg.h +@@ -275,7 +275,7 @@ typedef struct TCGPool { + + #define TCG_POOL_CHUNK_SIZE 32768 + +-#define TCG_MAX_TEMPS 512 ++#define TCG_MAX_TEMPS 2048 + #define TCG_MAX_INSNS 512 + + /* when the size of the arguments of a called function is smaller than + + +But a proper solution is to dynamically allocate the temps. + +Richard, thanks for providing the workaround. It helps. 
+ +A full solution to the problem: +https://<email address hidden>/ + +https://gitlab.com/qemu-project/qemu/-/commit/ae30e86661b0f4 + diff --git a/results/classifier/zero-shot/105/semantic/1914986 b/results/classifier/zero-shot/105/semantic/1914986 new file mode 100644 index 000000000..609a097fe --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1914986 @@ -0,0 +1,101 @@ +semantic: 0.807 +instruction: 0.781 +assembly: 0.717 +other: 0.712 +graphic: 0.712 +device: 0.710 +mistranslation: 0.705 +KVM: 0.665 +vnc: 0.615 +socket: 0.592 +network: 0.567 +boot: 0.528 + +KVM internal error. Suberror: 1 - OVMF / Audio related + +This is latest release QEMU-5.2.0 on Arch Linux running kernel 5.10.13, latest OVMF etc. + +I'm seeing the following crash when loading an audio driver from the OpenCore[1] project in the UEFI shell: + +KVM internal error. Suberror: 1 +emulation failure +RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000 RDX=0000000000000000 +RSI=0000000000000000 RDI=000000007e423628 RBP=000000007fee6a90 RSP=000000007fee6a08 +R8 =0000000000000000 R9 =0000000000000080 R10=0000000000000000 R11=0000000000000000 +R12=000000007eeaf828 R13=0000000000000000 R14=0000000000000000 R15=000000007fee6a67 +RIP=00000000000b0000 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 +ES =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA] +SS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +DS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +FS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +GS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT +TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy +GDT= 000000007f9ee698 00000047 +IDT= 000000007f27a018 00000fff +CR0=80010033 CR2=0000000000000000 CR3=000000007fc01000 CR4=00000668 +DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 
DR3=0000000000000000 +DR6=00000000ffff0ff0 DR7=0000000000000400 +EFER=0000000000000d00 +Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <ff> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + + +Here's the QEMU command line I'm using: + +qemu-system-x86_64 \ +-machine q35,accel=kvm \ +-cpu host,+topoext,+invtsc \ +-smp 4,sockets=1,cores=2 \ +-m 4096 \ +-drive file=/usr/share/edk2-ovmf/x64/OVMF_CODE.fd,if=pflash,format=raw,readonly=on \ +-drive file=OVMF_VARS.fd,if=pflash,format=raw \ +-usb -device usb-tablet -device usb-kbd \ +-drive file=OpenCore-0.6.6.img,format=raw \ +-device ich9-intel-hda,bus=pcie.0,addr=0x1b \ +-device hda-micro,audiodev=hda \ +-audiodev pa,id=hda,server=/run/user/1000/pulse/native + +The driver loads fine when using the "no connect" switch. eg: + +Shell> load -nc fs0:\efi\oc\drivers\audiodxe.efi +Shell> Image 'fs0:\EFI\OC\Drivers\AudioDxe.efi' loaded at 7E3C7000 - Success + +However, the crash occurs when loading normally. + +Any ideas? Thanks. + +[1]: https://github.com/acidanthera/OpenCorePkg/releases + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. 
+ +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/1915794 b/results/classifier/zero-shot/105/semantic/1915794 new file mode 100644 index 000000000..0c69ee487 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1915794 @@ -0,0 +1,37 @@ +semantic: 0.794 +device: 0.733 +socket: 0.687 +graphic: 0.651 +other: 0.615 +instruction: 0.605 +network: 0.490 +vnc: 0.461 +boot: 0.453 +mistranslation: 0.369 +assembly: 0.335 +KVM: 0.269 + +could not load PC BIOS 'bios-256k.bin' on latest Windows exe (*-20210203.exe) + +I'm using https://scoop.sh/ to install QEMU on a Windows CI job, which is run daily. Since today, the job is failing with a `could not load PC BIOS 'bios-256k.bin'` error thrown by QEMU. + +The version that causes this error is: https://qemu.weilnetz.de/w64/2021/qemu-w64-setup-20210203.exe#/dl.7z +The previous version, which worked fine, was: https://qemu.weilnetz.de/w64/2020/qemu-w64-setup-20201124.exe#/dl.7z + +Both CI runs build the exact same code. You can find the full logs at https://github.com/rust-osdev/x86_64/runs/1908137213?check_suite_focus=true (failing) and https://github.com/rust-osdev/x86_64/runs/1900698412?check_suite_focus=true (previous). + +(I hope this is the right place to report this issue.) 
+ +This is a known and documented problem with a simple workaround: "Known issue: currently requires start from installation directory or -L option to specify the location of the firmware files." + +And it is already fixed in the newer installer. + +See also the patch which will fix this issue upstream as soon as it was merged: https://patchwork<email address hidden>/. + + +Thanks a lot! + +Stefan's patch got merged here: +https://gitlab.com/qemu-project/qemu/-/commit/342e3a4f20653c2d419 +... thus closing this issue now. + diff --git a/results/classifier/zero-shot/105/semantic/1917940 b/results/classifier/zero-shot/105/semantic/1917940 new file mode 100644 index 000000000..804376e57 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1917940 @@ -0,0 +1,51 @@ +semantic: 0.894 +device: 0.886 +boot: 0.848 +instruction: 0.832 +graphic: 0.822 +other: 0.820 +mistranslation: 0.752 +assembly: 0.625 +network: 0.609 +KVM: 0.509 +socket: 0.496 +vnc: 0.461 + +-bios edk2-$arch-code doesn't work for x86 + +Whilst creating a flash device is recommended, -bios <file> is extremely useful in many cases as it automatically searches $PREFIX/share/qemu rather than requiring the caller (be it a human or a script) to work out where that directory is for the QEMU being called and prepend it to the file name. + +Currently, all the x86 EDK2 FD code files are 3653632 bytes in size, or 0x37c000 bytes. However, for some reason I cannot find the answer to (I traced the code back to 7587cf44019d593bb12703e7046bd7738996c55c), x86's -bios only allows files that are multiples of 64K in size (x86_bios_rom_init), which would require the EDK2 ROMs to be rounded up to 0x380000 bytes. If I delete the check, QEMU is able to load the only-16K-multiple-sized EDK2 and boot an OS just fine. If I pad EDK2 with 16K of zeroes at the *start* (since the ROM gets mapped counting backwards), it also works just fine (but padding at the *end* doesn't). 
Please therefore either relax the check in x86_bios_rom_init or ensure the EDK2 binary is suitably padded. + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/1920602 b/results/classifier/zero-shot/105/semantic/1920602 new file mode 100644 index 000000000..186c7fe86 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1920602 @@ -0,0 +1,95 @@ +semantic: 0.765 +instruction: 0.667 +mistranslation: 0.638 +assembly: 0.637 +device: 0.601 +graphic: 0.594 +boot: 0.593 +vnc: 0.568 +socket: 0.516 +network: 0.451 +KVM: 0.409 +other: 0.409 + +QEMU crash after a QuickBASIC program integer overflow + +A trivial program compiled with QuickBASIC 4.5 with integer overflow will crash QEMU when run under MS-DOS 5.0 or FreeDOS 1.2: + +C:\KILLER>type killer.bas +A% = VAL("99999"):PRINT A% + +C:\KILLER>killer.exe +** + ERROR:../qemu-5.2.0/accel/tcg/tcg-cpus.c:541:tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked()) +Aborted + +QEMU version v5.2, compiled for ARM, and started with command line: + +qemu-system-i386 -curses -cpu 486 -m 1 -drive dos.img + +The same test under Ubuntu QEMU and KVM/x86_64 (QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.14)) will just silently hang the QEMU. On DOSBOX, the machine does not die and program outputs the value -31073. + +The EXE to reproduce the issue is attached. + + + +The program works (in TCG mode) with QEMU v5.0.0. + +QEMU starts crashing with the commit: + +commit 975af797f1e04e4d1b1a12f1731141d3770fdbce +Author: Joseph Myers <email address hidden> +Date: Fri May 15 21:21:24 2020 +0000 + + target/i386: fix IEEE x87 floating-point exception raising + + +For -enable-kvm I haven't been able to find a working commit. All versions since v3.1.0 just silently hang with the program. + +Attached is a minimal FreeDOS floppy disk to reproduce the TCG crash. Still reproducible with QEMU v6.0.0: + +WARNING: Image format was not specified for 'test-floppy.img' and probing guessed raw. + Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted. 
+ Specify the 'raw' format explicitly to remove the restrictions. +SeaBIOS (version rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org) +Booting from Floppy... +................................................123 +FreeDOS kernel 2042 (build 2042 OEM:0xfd) [compiled May 11 2016] +Kernel compatibility 7.10 - WATCOMC - FAT32 support + +(C) Copyright 1995-2012 Pasquale J. Villani and The FreeDOS Project. +All Rights Reserved. This is free software and comes with ABSOLUTELY NO +WARRANTY; you can redistribute it and/or modify it under the terms of the +GNU General Public License as published by the Free Software Foundation; +either version 2, or (at your option) any later version. + - InitDiskno hard disks detected + +FreeCom version 0.84-pre2 XMS_Swap [Aug 28 2006 00:29:00] +A:\>KILLER.EXE +** +ERROR:../accel/tcg/tcg-accel-ops.c:80:tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked()) +Bail out! ERROR:../accel/tcg/tcg-accel-ops.c:80:tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked()) +Aborted + + +Since commit 975af797f1e helper_fist_ST0() sets float_flag_invalid. + +FErr# IRQ raise since bf13bfab084 ("i386: implement IGNNE"): + + Change the handling of port F0h writes and FPU exceptions to implement IGNNE. + + The implementation mixes a bit what the chipset and processor do in real + hardware, but the effect is the same as what happens with actual FERR# + and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering + FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#. + + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/318 + + diff --git a/results/classifier/zero-shot/105/semantic/1921468 b/results/classifier/zero-shot/105/semantic/1921468 new file mode 100644 index 000000000..d1ddfe130 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1921468 @@ -0,0 +1,322 @@ +semantic: 0.889 +other: 0.884 +network: 0.878 +boot: 0.846 +graphic: 0.841 +mistranslation: 0.831 +instruction: 0.823 +assembly: 0.808 +socket: 0.797 +device: 0.797 +KVM: 0.748 +vnc: 0.706 + +[UBUNTU 20.04] KVM guest fails to find zipl boot menu index + +---Problem Description--- +A KVM guest fails to find the zipl boot menu index if the "zIPL" magic value is listed at the end of a disk block. + +---System Hang--- +System sits in disabled wait, last console display +LOADPARM=[ ] +Using virtio-blk. +Using ECKD scheme (block size 4096), CDL +VOLSER=[0X0067] + + +---Steps to Reproduce--- +1. Install Distro KVM guest from ISO on a DASD, e.g. using virt-install, my invocation was +$ virt-install --name secguest2 --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.af6a --cdrom /var/lib/libvirt/images/xxxxxx.iso + +2. Select DHCP networking and ASCII console, and accept all defaults of the installer + +3. Let the installer reboot after the installation completes + +It is possible to recover by editing the domain XML with an explicit loadparm to select a boot menu entry. E.g. 
I changed the disk definition to + <disk type='block' device='disk'> + <driver name='qemu' type='raw' cache='none' io='native'/> + <source dev='/dev/disk/by-path/ccw-0.0.af6a'/> + <target dev='vda' bus='virtio'/> + <boot order='1' loadparm='1'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0xaf6a'/> + </disk> + +The patches are now upstream: +5f97ba0c74cc ("pc-bios/s390-ccw: fix off-by-one error") +468184ec9024 ("pc-bios/s390-ccw: break loop if a null block number is reached") + +Current versions of qemu within Ubuntu + +focal (20.04LTS) 1:4.2-3ubuntu6 [ports]: arm64 armhf ppc64el s390x +focal-updates (metapackages): 1:4.2-3ubuntu6.14: amd64 arm64 armhf ppc64el s390x + +groovy (20.10) (metapackages): 1:5.0-5ubuntu9 [ports]: arm64 armhf ppc64el s390x +groovy-updates (metapackages): 1:5.0-5ubuntu9.6: amd64 arm64 armhf ppc64el s390x + +hirsute (metapackages): 1:5.2+dfsg-9ubuntu1: amd64 arm64 armhf ppc64el s390x + + +git-commits will apply seamlessly for the requested levels if not already integrated + +------- Comment From <email address hidden> 2021-03-26 04:38 EDT------- +Just to avoid any bad surprise, these patches require a rebuild of the bios image so the binary must also be updated. + +This already is in upstream qemu 5.2, thereby Hirsute is fixed already. +I'll prep PPAs for a try for Focal/Groovy in a bit + +PPA is here: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4504 + +Would you mind checking if this really is enough and all that you'd need? +Once that is confirmed I can prep this for the SRU process. + +Hi @Christian B. :-) +With "rebuild of the bios image" I guess you meant: + /usr/share/qemu/s390-ccw.img + /usr/share/qemu/s390-netboot.img +Those are built from the same source, so fixing and building src:qemu fixes this in one go. + +If you had other binaries in mind let me know. + +------- Comment From <email address hidden> 2021-03-29 07:42 EDT------- +(In reply to comment #12) +> Hi @Christian B. 
:-) +> With "rebuild of the bios image" I guess you meant: +> /usr/share/qemu/s390-ccw.img +> /usr/share/qemu/s390-netboot.img +> Those are built from the same source, so fixing and building src:qemu fixes +> this in one go. +> +> If you had other binaries in mind let me know. + +Yes I had these 2 in mind. +I was not sure if Ubuntu always builds these files or if you use the pre-built ones. + +Hi, +I have tested this with: +$ virt-install --name testinst1 --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.151e --cdrom /var/lib/libvirt/images/ubuntu-18.04.5-server-s390x.iso + +But while the issue itself and the fix are clear, this did not trigger the issue. +In my case the reboot after install worked just fine even without the fix. +Might I ask: +- which "xxxxxx.iso" it is in your example that has issues with this? +- which disk setup did you select on install (that is then put onto the dasd by the installer) +- what should I expect in the error case, I expected a fail or hang on reboot but got: + +``` +The system is going down NOW! +Sent SIGTERM to all processes +Sent SIGKILL to all processes +Requesting system reboot + +Domain creation completed.ts; <Enter> activates buttons +Restarting guest. +Connected to domain testinst1 +Escape character is ^] + +Booting entry #0 +[ 0.450525] Linux version 4.15.0-140-generic (buildd@bos02-s390x-010) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #144-Ubuntu SMP Fri Mar 19 14:11:29 UTC 2021 (Ubuntu 4.15.0-140.144- +generic 4.15.18) +``` + +``` +The config (in regard to boot) that virtinst left (and that worked) was: + <os> + <type arch='s390x' machine='s390-ccw-virtio-focal'>hvm</type> + <boot dev='hd'/> + </os> +... 
+ <disk type='block' device='disk'> + <driver name='qemu' type='raw' cache='none' io='native'/> + <source dev='/dev/disk/by-path/ccw-0.0.151e' index='2'/> + <backingStore/> + <target dev='vda' bus='virtio'/> + <alias name='virtio-disk0'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/> + </disk> +``` + +I agree with the fix, but need a reasonable testcase that works (also to explain to the SRU team why this is a realistic issue someone would hit). + +Could I skip all the install description and just take an existing guest running on a dasd and then use a custom zipl.conf to trigger this? If so which zipl.conf would you recommend? + +------- Comment From <email address hidden> 2021-03-29 08:47 EDT------- +(In reply to comment #14) +> Hi, +> I have tested this with: +> $ virt-install --name testinst1 --memory 2048 --disk +> path=/dev/disk/by-path/ccw-0.0.151e --cdrom +> /var/lib/libvirt/images/ubuntu-18.04.5-server-s390x.iso +> +> But while the issue itself and the fix is clear, this did not trigger the +> issue. +> In my case the reboot after install worked just fine even without the fix. +> Might I ask: +> - which "xxxxxx.iso" it is in your example that has issues with this? + +This was a non-Ubuntu distribution. It can happen on any distro that has the s390-tools commit/patch "zipl: Make use of __noreturn macro" and not the fix "zipl/libc: libc_stop move 'noreturn' to declaration" + +I spoke to cborntra, and it turned out that this affects only guests with zipl +with: 86856f98 "zipl: Make use of __noreturn macro" +but not yet: c367a6bb "zipl/libc: libc_stop move 'noreturn' to declaration" + +That means 2.12/2.13 and that translates to Focal. 
+Therefore retry this as: + +ubuntu@s1lp5:~$ virt-install --name testinst2 --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.151e --cdrom /var/lib/libvirt/images/ubuntu-20.04-legacy-server-s390x.iso +- all defaults - +- install as "entire disk - + +Then on reboot I get still what seems working: + +``` +The system is going down NOW! +Sent SIGTERM to all processes +Sent SIGKILL to all processes +Requesting system reboot + +Domain creation completed.ts; <Enter> activates buttons +Restarting guest. +Connected to domain testinst2 +Escape character is ^] + +Booting entry #0 +``` + +We talked further and it is also compiler specific. +Eventually any guest "could" fail and it definitely is wise to fix this. +Just verification gets harder. + +I'll try some other ISOs as instructed by Christian to see if one can be used as repro case. + + +This iso should do the trick "SLE-15-SP2-Full-s390x-GM-Media1.iso" to reproduce. + +------- Comment From <email address hidden> 2021-03-29 13:09 EDT------- +(In reply to comment #22) +> This iso should do the trick "SLE-15-SP2-Full-s390x-GM-Media1.iso" to +> reproduce. + +Yep exactly. Without the ISO it's harder to reproduce... Because then you have to (AFAICR): +1. patch your zipl code so the stage loader (I think it was stage2) has the right size +2. use this patched zipl for zipl'ing the DASD +3. use qemu to boot from this DASD + +FYI - uploaded to the -unapproved queue yesterday. Now on the SRU team to evaluate. + +Hello bugproxy, or anyone else affected, + +Accepted qemu into groovy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:5.0-5ubuntu9.7 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. 
+ +If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-groovy to verification-done-groovy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-groovy. In either case, without details of your testing we will not be able to proceed. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping! + +N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days. + +Hello bugproxy, or anyone else affected, + +Accepted qemu into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:4.2-3ubuntu6.15 in a few hours, and then in the -proposed repository. + +Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. + +If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed. + +Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping! + +N.B. 
The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days. + +I happen to know that Marc is verifying this - thanks in advance! + +All autopkgtests for the newly accepted qemu (1:4.2-3ubuntu6.15) for focal have finished running. +The following regressions have been reported in tests triggered by the package: + +casper/1.445.1 (amd64, ppc64el) +systemd/245.4-4ubuntu3.6 (amd64) +ubuntu-image/1.11+20.04ubuntu1 (armhf, amd64, s390x) +livecd-rootfs/2.664.19 (ppc64el) + + +Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1]. + +https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#qemu + +[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions + +Thank you! + + +All autopkgtests for the newly accepted qemu (1:5.0-5ubuntu9.7) for groovy have finished running. +The following regressions have been reported in tests triggered by the package: + +systemd/246.6-1ubuntu1.3 (ppc64el) +cloud-utils/0.31-29-ge0792e3d-0ubuntu1 (s390x) +open-iscsi/2.1.1-1ubuntu2 (amd64) +ubuntu-image/1.11+20.10ubuntu1 (armhf) + + +Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1]. + +https://people.canonical.com/~ubuntu-archive/proposed-migration/groovy/update_excuses.html#qemu + +[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions + +Thank you! + + +FYI I'm working on the autopkgtest issues - but all of those are known flaky cases, so I expect no long term blocker. + +The other two bugs that are part of this SRU are verified by now, so it needs just this one to complete - which we know can be hard to re-create without special unlucky bootloader record sizes. 
+ +On the good side, I've not seen regressions to the non-affected-boots + +@Marc - let us know once you've completed the testing + +FYI - autopkgtest issues resolved as well now (as assumed it was due to flaky tests) + +------- Comment From <email address hidden> 2021-04-08 08:37 EDT------- +@Christian: I've verified the fix works. + +Thanks (we have had way too much non-fun with no debug symbols on the roms, bootloader record sizes and so on). +I really appreciate that you went so deep on this Marc! + +The verification of the Stable Release Update for qemu has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions. + +This bug was fixed in the package qemu - 1:5.0-5ubuntu9.7 + +--------------- +qemu (1:5.0-5ubuntu9.7) groovy; urgency=medium + + * d/p/u/lp-1921468-*: fix issues handling boot menu index on s390x + (LP: #1921468) + * d/p/u/lp-1887535-configure-replace-enable-disable-git-update-with-wit.patch, + d/rules: Backport --with-git-submodules param so building from git repo + doesn't fail (LP: #1887535) + * Fix byte aligned writes when writing to image stored on NFS + server, as they aren't required to be 4kib aligned. 
(LP: #1921665) + - d/p/u/lp-1921665-1-block-Require-aligned-image-size-to-avoid-assert.patch + - d/p/u/lp-1921665-2-file-posix-Allow-byte-aligned-O_DIRECT-with-NFS.patch + + -- Christian Ehrhardt <email address hidden> Fri, 26 Mar 2021 10:36:31 +0100 + +This bug was fixed in the package qemu - 1:4.2-3ubuntu6.15 + +--------------- +qemu (1:4.2-3ubuntu6.15) focal; urgency=medium + + * d/p/u/lp-1921468-*: fix issues handling boot menu index on s390x + (LP: #1921468) + * d/p/u/lp-1887535-configure-replace-enable-disable-git-update-with-wit.patch, + d/rules: Backport --with-git-submodules param so building from git repo + doesn't fail (LP: #1887535) + * Fix byte aligned writes when writing to image stored on NFS + server, as they aren't required to be 4kib aligned. (LP: #1921665) + - d/p/u/lp-1921665-1-block-Require-aligned-image-size-to-avoid-assert.patch + - d/p/u/lp-1921665-2-file-posix-Allow-byte-aligned-O_DIRECT-with-NFS.patch + + -- Christian Ehrhardt <email address hidden> Fri, 26 Mar 2021 10:38:47 +0100 + +------- Comment From <email address hidden> 2021-04-15 07:09 EDT------- +IBM bugzilla status-> closed, Fix Released with all requested distros + diff --git a/results/classifier/zero-shot/105/semantic/1921948 b/results/classifier/zero-shot/105/semantic/1921948 new file mode 100644 index 000000000..a4d86ea09 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1921948 @@ -0,0 +1,606 @@ +semantic: 0.858 +boot: 0.842 +device: 0.842 +other: 0.825 +assembly: 0.818 +network: 0.810 +instruction: 0.793 +graphic: 0.767 +KVM: 0.753 +socket: 0.713 +vnc: 0.699 +mistranslation: 0.563 + +MTE tags not checked properly for unaligned accesses at EL1 + +For kernel memory accesses that span across two memory granules, QEMU's MTE implementation only checks the tag of the first granule but not of the second one. 
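The arithmetic behind this report can be sketched as follows. This is a minimal illustration, assuming the 16-byte MTE tag granule; the helper name `granules_touched` is hypothetical and is not QEMU code. It lists the start offsets of every granule that an access of a given size covers, i.e. the set of allocation tags that would have to match the pointer tag:

```python
TAG_GRANULE = 16  # bytes covered by one MTE allocation tag (assumed granule size)


def granules_touched(addr: int, size: int) -> list:
    """Return the start addresses of all tag granules covered by [addr, addr + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = addr - (addr % TAG_GRANULE)            # align the first byte down
    last_byte = addr + size - 1
    last = last_byte - (last_byte % TAG_GRANULE)   # align the last byte down
    return list(range(first, last + TAG_GRANULE, TAG_GRANULE))


if __name__ == "__main__":
    # 8-byte load at offset 124 of a 128-byte, granule-aligned object
    print(granules_touched(124, 8))  # [112, 128]
```

Offset 128 is the first granule past the 128-byte allocation, so that is the granule whose tag can differ from the pointer's tag.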
+ +To reproduce this, build the Linux kernel with CONFIG_KASAN_HW_TAGS enabled, apply the patch below, and boot the kernel: + +diff --git a/sound/last.c b/sound/last.c +index f0bb98780e70..04745cb30b74 100644 +--- a/sound/last.c ++++ b/sound/last.c +@@ -5,12 +5,18 @@ + */ + + #include <linux/init.h> ++#include <linux/slab.h> + #include <sound/core.h> + + static int __init alsa_sound_last_init(void) + { + struct snd_card *card; + int idx, ok = 0; ++ ++ char *ptr = kmalloc(128, GFP_KERNEL); ++ pr_err("KASAN report should follow:\n"); ++ *(volatile unsigned long *)(ptr + 124); ++ kfree(ptr); + + printk(KERN_INFO "ALSA device list:\n"); + for (idx = 0; idx < SNDRV_CARDS; idx++) { + +KASAN tags the 128 allocated bytes with the same tag as the returned pointer. The memory granule that follows the 128 allocated bytes has a different tag (with 1/15 probability). + +Expected result: a tag fault is detected and a KASAN report is printed when accessing bytes [124, 132). +Observed result: no tag fault is detected and no KASAN report is printed. + +Here are the flags that I use to run QEMU if they matter: + +qemu-system-aarch64 -s -machine virt,mte=on -cpu max -m 2G -smp 2 -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 -net nic -nographic -kernel ./Image -append "console=ttyAMA0 root=/dev/vda earlyprintk=serial" -drive file=./fs.img,format=raw,if=virtio -no-shutdown -no-reboot + +I believe that you're correct, and that I mis-read the MTE specification. + +I believed that exactly one mte tag check was made for any single memory +access. But I missed that unaligned accesses are as-if a sequence of byte +accesses -- in the Arm ARM, see aarch64/functions/memory/Mem[]. + +I'm still trying to verify this via the Arm FVP, but so far I've not +found the right incantation of parameters to properly enable MTE. +(I can enable the instructions, but a simple stg/ldg test suggests +that there is no tag storage enabled -- all tags read as 0.) 
+ +The flags that you need to pass to FVP to enable MTE are listed near the end of the README here: + +https://cs.android.com/android/platform/superproject/+/master:device/generic/goldfish/fvpbase/README.md + +Ah, perfect, I was missing dram_metadata.is_enabled. + +And my userland unaligned test case demonstrates that +the second granule is tested, as reported. + +https://<email address hidden>/ + +Hi Richard, + +I tried your patch, but QEMU crashes with: + +ERROR:../target/arm/mte_helper.c:588:mte_check_fail: code should not be reached +Bail out! ERROR:../target/arm/mte_helper.c:588:mte_check_fail: code should not be reached + +when running KASAN tests. + +Thanks! + +Yeah, I saw an error right after posting. Please try v2: + +https://<email address hidden>/ + +With v2, a lot of KASAN tests start failing. This likely means that MTE tag faults stop being generated in certain cases. + +With v3 [1], no MTE faults are generated at all. + +[1] https://<email address hidden>/ + + +Richard Henderson <email address hidden> writes: + +> We were incorrectly assuming that only the first byte of an MTE access +> is checked against the tags. But per the ARM, unaligned accesses are +> pre-decomposed into single-byte accesses. So by the time we reach the +> actual MTE check in the ARM pseudocode, all accesses are aligned. +> +> Therefore, the first failure is always either the first byte of the +> access, or the first byte of the granule. +> +> In addition, some of the arithmetic is off for last-first -> count. +> This does not become directly visible until a later patch that passes +> single bytes into this function, so ptr == ptr_last. +> +> Buglink: https://bugs.launchpad.net/bugs/1921948 + +Minor note: you can Cc: Bug 1921948 <email address hidden> to +automatically copy patches to the appropriate bugs which is useful if +you don't have the Cc for the reporter. 
+ +Anyway I'm trying to get the kasan unit tests running as a way of +testing this (and maybe expanding with a version of Andrey's test). I +suspect this may be a PEBKAC issue but I built an MTE-enabled kernel +with: + + CONFIG_HAVE_ARCH_KASAN=y + CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y + CONFIG_HAVE_ARCH_KASAN_HW_TAGS=y + CONFIG_CC_HAS_KASAN_GENERIC=y + CONFIG_KASAN=y + # CONFIG_KASAN_GENERIC is not set + CONFIG_KASAN_HW_TAGS=y + CONFIG_KASAN_STACK=1 + CONFIG_KASAN_KUNIT_TEST=m + CONFIG_TEST_KASAN_MODULE=m + +and was able to boot it. But when I insmod the kasan tests: + + insmod test_kasan.ko + +it looks like it just keeps looping, failing on the same test: + + Ignoring spurious kernel translation fault at virtual address dead00000000010a + WARNING: CPU: 0 PID: 1444 at arch/arm64/mm/fault.c:364 __do_kernel_fault+0xc4/0x1bc + Modules linked in: test_kasan(+) + CPU: 0 PID: 1444 Comm: kunit_try_catch Tainted: G B W 5.11.0-ajb-kasan #3 + Hardware name: linux,dummy-virt (DT) + pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--) + pc : __do_kernel_fault+0xc4/0x1bc + lr : __do_kernel_fault+0xc4/0x1bc + sp : ffffffc01191b900 + x29: ffffffc01191b900 x28: fcffff8001f7a880 + x27: fcffff8001c01e00 x26: 0000000000000000 + x25: 0000000000000001 x24: 00000000000000f4 + x23: 0000000020400009 x22: dead00000000010a + x21: 0000000000000025 x20: ffffffc01191b9d0 + x19: 0000000097c08004 x18: 0000000000000000 + x17: 000000000000000a x16: 000017a83fb75794 + x15: 0000000000000030 x14: 6c656e72656b2073 + x13: ffffffc010e21be0 x12: 00000000000001aa + x11: 000000000000008e x10: ffffffc010e2d930 + x9 : 000000000003a6d0 x8 : ffffffc010e21be0 + x7 : ffffffc010e2cbe0 x6 : 0000000000000d50 + x5 : ffffff8007f9c850 x4 : ffffffc01191b700 + x3 : 0000000000000001 x2 : 0000000000000000 + x1 : 0000000000000000 x0 : fcffff8001f7a880 + Call trace: + __do_kernel_fault+0xc4/0x1bc + do_translation_fault+0x98/0xb0 + do_mem_abort+0x44/0xb0 + el1_abort+0x40/0x6c + el1_sync_handler+0x6c/0xb0 + el1_sync+0x70/0x100 + 
kasan_update_kunit_status+0x6c/0x1ac + kasan_report_invalid_free+0x34/0xa0 + ____kasan_slab_free.constprop.0+0xf8/0x1a0 + __kasan_slab_free+0x10/0x20 + slab_free_freelist_hook+0xf8/0x1a0 + kfree+0x148/0x25c + kunit_destroy_resource+0x15c/0x1bc + string_stream_destroy+0x20/0x80 + kunit_do_assertion+0x190/0x1e4 + kmalloc_double_kzfree+0x158/0x190 [test_kasan] + kunit_try_run_case+0x78/0xa4 + kunit_generic_run_threadfn_adapter+0x20/0x2c + kthread+0x134/0x144 + ret_from_fork+0x10/0x38 + ---[ end trace 5acd02cdb9b3d3f0 ]--- + +but maybe I'm using the kunit tests wrong. It's my first time playing +with them. + +> Signed-off-by: Richard Henderson <email address hidden> +> --- +> target/arm/mte_helper.c | 38 +++++++++++++++++--------------------- +> 1 file changed, 17 insertions(+), 21 deletions(-) +> +> diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c +> index 8be17e1b70..c87717127c 100644 +> --- a/target/arm/mte_helper.c +> +++ b/target/arm/mte_helper.c +> @@ -757,10 +757,10 @@ uint64_t mte_checkN(CPUARMState *env, uint32_t desc, +> uint64_t ptr, uintptr_t ra) +> { +> int mmu_idx, ptr_tag, bit55; +> - uint64_t ptr_last, ptr_end, prev_page, next_page; +> - uint64_t tag_first, tag_end; +> - uint64_t tag_byte_first, tag_byte_end; +> - uint32_t esize, total, tag_count, tag_size, n, c; +> + uint64_t ptr_last, prev_page, next_page; +> + uint64_t tag_first, tag_last; +> + uint64_t tag_byte_first, tag_byte_last; +> + uint32_t total, tag_count, tag_size, n, c; +> uint8_t *mem1, *mem2; +> MMUAccessType type; +> +> @@ -779,29 +779,27 @@ uint64_t mte_checkN(CPUARMState *env, uint32_t desc, +> +> mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX); +> type = FIELD_EX32(desc, MTEDESC, WRITE) ? MMU_DATA_STORE : MMU_DATA_LOAD; +> - esize = FIELD_EX32(desc, MTEDESC, ESIZE); +> total = FIELD_EX32(desc, MTEDESC, TSIZE); +> +> /* Find the addr of the end of the access, and of the last element. 
*/ +> - ptr_end = ptr + total; +> - ptr_last = ptr_end - esize; +> + ptr_last = ptr + total - 1; +> +> /* Round the bounds to the tag granule, and compute the number of tags. */ +> tag_first = QEMU_ALIGN_DOWN(ptr, TAG_GRANULE); +> - tag_end = QEMU_ALIGN_UP(ptr_last, TAG_GRANULE); +> - tag_count = (tag_end - tag_first) / TAG_GRANULE; +> + tag_last = QEMU_ALIGN_DOWN(ptr_last, TAG_GRANULE); +> + tag_count = ((tag_last - tag_first) / TAG_GRANULE) + 1; +> +> /* Round the bounds to twice the tag granule, and compute the bytes. */ +> tag_byte_first = QEMU_ALIGN_DOWN(ptr, 2 * TAG_GRANULE); +> - tag_byte_end = QEMU_ALIGN_UP(ptr_last, 2 * TAG_GRANULE); +> + tag_byte_last = QEMU_ALIGN_DOWN(ptr_last, 2 * TAG_GRANULE); +> +> /* Locate the page boundaries. */ +> prev_page = ptr & TARGET_PAGE_MASK; +> next_page = prev_page + TARGET_PAGE_SIZE; +> +> - if (likely(tag_end - prev_page <= TARGET_PAGE_SIZE)) { +> + if (likely(tag_last - prev_page <= TARGET_PAGE_SIZE)) { +> /* Memory access stays on one page. */ +> - tag_size = (tag_byte_end - tag_byte_first) / (2 * TAG_GRANULE); +> + tag_size = ((tag_byte_last - tag_byte_first) / (2 * TAG_GRANULE)) + 1; +> mem1 = allocation_tag_mem(env, mmu_idx, ptr, type, total, +> MMU_DATA_LOAD, tag_size, ra); +> if (!mem1) { +> @@ -815,9 +813,9 @@ uint64_t mte_checkN(CPUARMState *env, uint32_t desc, +> mem1 = allocation_tag_mem(env, mmu_idx, ptr, type, next_page - ptr, +> MMU_DATA_LOAD, tag_size, ra); +> +> - tag_size = (tag_byte_end - next_page) / (2 * TAG_GRANULE); +> + tag_size = ((tag_byte_last - next_page) / (2 * TAG_GRANULE)) + 1; +> mem2 = allocation_tag_mem(env, mmu_idx, next_page, type, +> - ptr_end - next_page, +> + ptr_last - next_page + 1, +> MMU_DATA_LOAD, tag_size, ra); +> +> /* +> @@ -838,15 +836,13 @@ uint64_t mte_checkN(CPUARMState *env, uint32_t desc, +> } +> +> /* +> - * If we failed, we know which granule. Compute the element that +> - * is first in that granule, and signal failure on that element. 
+> + * If we failed, we know which granule. For the first granule, the +> + * failure address is @ptr, the first byte accessed. Otherwise the +> + * failure address is the first byte of the nth granule. +> */ +> if (unlikely(n < tag_count)) { +> - uint64_t fail_ofs; +> - +> - fail_ofs = tag_first + n * TAG_GRANULE - ptr; +> - fail_ofs = ROUND_UP(fail_ofs, esize); +> - mte_check_fail(env, desc, ptr + fail_ofs, ra); +> + uint64_t fault = (n == 0 ? ptr : tag_first + n * TAG_GRANULE); +> + mte_check_fail(env, desc, fault, ra); +> } +> +> done: + + +-- +Alex Bennée + + +This warning is caused by "virtualization=on" QEMU option. This is another QEMU bug AFAIU, see [1] and [2]. + +[1] https://lore.kernel<email address hidden>/ +[2] https://<email address hidden>/T/ + +It gets further without but still spams a lot of failure messages: + +The buggy address belongs to the object at ffffff80036a2200 + which belongs to the cache kmalloc-128 of size 128 +The buggy address is located 11 bytes to the right of + 128-byte region [ffffff80036a2200, ffffff80036a2280) +The buggy address belongs to the page: +page:0000000046e01872 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x436a2 +flags: 0x3fc0000000000200(slab) +raw: 3fc0000000000200 dead000000000100 dead000000000122 f9ffff8001c01e00 +raw: 0000000000000000 0000000080100010 00000001ffffffff f3ffff80036a2401 +page dumped because: kasan: bad access detected +pages's memcg:f3ffff80036a2401 + +Memory state around the buggy address: + ffffff80036a2000: f6 f6 f6 f6 f6 f6 f6 f6 fe fe fe fe fe fe fe fe + ffffff80036a2100: fa fa fa fa fe fe fe fe fe fe fe fe fe fe fe fe +>ffffff80036a2200: f9 f9 f9 f9 f9 f9 f9 f9 fe fe fe fe fe fe fe fe + ^ + ffffff80036a2300: fc fc fc fc fe fe fe fe fe fe fe fe fe fe fe fe + ffffff80036a2400: f3 f3 f3 f3 f3 f3 f3 f3 fe fe fe fe fe fe fe fe +================================================================== +Disabling lock debugging due to kernel taint + # kmalloc_oob_right: EXPECTATION 
FAILED at lib/test_kasan.c:86 + Expected fail_data.report_expected == fail_data.report_found, but + fail_data.report_expected == 1 + fail_data.report_found == 0 + not ok 1 - kmalloc_oob_right + # kmalloc_oob_left: EXPECTATION FAILED at lib/test_kasan.c:98 + Expected fail_data.report_expected == fail_data.report_found, but + fail_data.report_expected == 1 + fail_data.report_found == 0 + not ok 2 - kmalloc_oob_left + # kmalloc_node_oob_right: EXPECTATION FAILED at lib/test_kasan.c:110 + Expected fail_data.report_expected == fail_data.report_found, but + fail_data.report_expected == 1 + fail_data.report_found == 0 + not ok 3 - kmalloc_node_oob_right + # kmalloc_pagealloc_oob_right: EXPECTATION FAILED at lib/test_kasan.c:130 + Expected fail_data.report_expected == fail_data.report_found, but + fail_data.report_expected == 1 + fail_data.report_found == 0 + not ok 4 - kmalloc_pagealloc_oob_right + # kmalloc_pagealloc_uaf: EXPECTATION FAILED at lib/test_kasan.c:148 + Expected fail_data.report_expected == fail_data.report_found, but + fail_data.report_expected == 1 + fail_data.report_found == 0 + not ok 5 - kmalloc_pagealloc_uaf + + +Is this with QEMU master without the patches mentioned in this bug? + +Which kernel version do you use? + +Could you share your kernel config? + +Re comments #8 and #10, I don't replicate that. +I get full pass on KASAN_UNIT_TEST with +and without virtualization enabled. + +Re comment #9, if there are bugs suspected in qemu, they +need to be reported, or we'll never hear about them. + + +Andrey Konovalov <email address hidden> writes: + +> Is this with QEMU master without the patches mentioned in this bug? + +This is with Richard's latest series. + +> +> Which kernel version do you use? + +v5.11 + +> Could you share your kernel config? + +We are just testing with Richard's config and eliminating compiler +shenanigans now. 
+ + +-- +Alex Bennée + + + +Alex Bennée <email address hidden> writes: + +> Andrey Konovalov <email address hidden> writes: +> +>> Is this with QEMU master without the patches mentioned in this bug? +> +> This is with Richard's latest series. +> +>> +>> Which kernel version do you use? +> +> v5.11 +> +>> Could you share your kernel config? +> +> We are just testing with Richard's config and eliminating compiler +> shenanigans now. + +OK with v5.12-rc5 and Richard's config I get a clean pass. + + +-- +Alex Bennée + + +Ah, there's v4 now. + +Tested with KASAN tests + a custom test to check unaligned accesses that span across two granules, everything works. + +Thank you! + +On Wed, 7 Apr 2021 at 19:54, Alex Bennée <email address hidden> wrote: +> +> +> Richard Henderson <email address hidden> writes: +> +> > We were incorrectly assuming that only the first byte of an MTE access +> > is checked against the tags. But per the ARM, unaligned accesses are +> > pre-decomposed into single-byte accesses. So by the time we reach the +> > actual MTE check in the ARM pseudocode, all accesses are aligned. +> > +> > Therefore, the first failure is always either the first byte of the +> > access, or the first byte of the granule. +> > +> > In addition, some of the arithmetic is off for last-first -> count. +> > This does not become directly visible until a later patch that passes +> > single bytes into this function, so ptr == ptr_last. +> > +> > Buglink: https://bugs.launchpad.net/bugs/1921948 +> +> Minor note: you can Cc: Bug 1921948 <email address hidden> to +> automatically copy patches to the appropriate bugs which is useful if +> you don't have the Cc for the reporter. + +I'm not sure cc'ing bugs on patches is very useful, though (and especially +not for big series). I usually prefer to just add a note to the bug with +the URL of the series in patchew afterwards. 
+ +-- PMM + + + +Richard Henderson <email address hidden> writes: + +> On 4/7/21 11:39 AM, Alex Bennée wrote: +>> Richard Henderson <email address hidden> writes: +>> +>>> We were incorrectly assuming that only the first byte of an MTE access +>>> is checked against the tags. But per the ARM, unaligned accesses are +>>> pre-decomposed into single-byte accesses. So by the time we reach the +>>> actual MTE check in the ARM pseudocode, all accesses are aligned. +>>> +>>> Therefore, the first failure is always either the first byte of the +>>> access, or the first byte of the granule. +>>> +<snip> + +I replicated the original test case as a kunit test: + + static void kmalloc_node_oob_span_right(struct kunit *test) + { + char *ptr; + size_t size = 128; + + ptr = kmalloc_node(size, GFP_KERNEL, 0); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + + KUNIT_EXPECT_KASAN_FAIL(test, *(volatile unsigned long *)(ptr + 124) = 0); + kfree(ptr); + } + +which before this fix triggered: + + [ 6.753429] # kmalloc_node_oob_span_right: EXPECTATION FAILED at lib/test_kasan.c:163 + [ 6.753429] Expected ({ do { extern void __compiletime_assert_337(void) __attribute__((__error__("Unsupported access size for {READ,WRITE}_ONCE()."))); if (!((sizeof( + fail_data.report_expected) == sizeof(char) || sizeof(fail_data.report_expected) == sizeof(short) || sizeof(fail_data.report_expected) == sizeof(int) || sizeof(fail_data.repo + rt_expected) == sizeof(long)) || sizeof(fail_data.report_expected) == sizeof(long long))) __compiletime_assert_337(); } while (0); (*(const volatile typeof( _Generic((fail_d + ata.report_expected), char: (char)0, unsigned char: (unsigned char)0, signed char: (signed char)0, unsigned short: (unsigned short)0, signed short: (signed short)0, unsigned + int: (unsigned int)0, signed int: (signed int)0, unsigned long: (unsigned long)0, signed long: (signed long)0, unsigned long long: (unsigned long long)0, signed long long: + (signed long long)0, default: 
(fail_data.report_expected))) * + [ 6.759388] not ok 4 - kmalloc_node_oob_span_right + +And is OK by the end of the series: + + [ 6.587381] The buggy address belongs to the object at ffff000003325800 + [ 6.587381] which belongs to the cache kmalloc-128 of size 128 + [ 6.588372] The buggy address is located 0 bytes to the right of + [ 6.588372] 128-byte region [ffff000003325800, ffff000003325880) + [ 6.589354] The buggy address belongs to the page: + [ 6.589948] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x43325 + [ 6.590911] flags: 0x3fffc0000000200(slab) + [ 6.591648] raw: 03fffc0000000200 dead000000000100 dead000000000122 fdff000002401200 + [ 6.592346] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 + [ 6.593007] page dumped because: kasan: bad access detected + [ 6.593532] + [ 6.593775] Memory state around the buggy address: + [ 6.594360] ffff000003325600: f3 f3 f3 f3 f3 f3 f3 f3 fe fe fe fe fe fe fe fe + [ 6.594991] ffff000003325700: fd fd fd fd fd fd fd fd fe fe fe fe fe fe fe fe + [ 6.595594] >ffff000003325800: f8 f8 f8 f8 f8 f8 f8 f8 fe fe fe fe fe fe fe fe + [ 6.596180] ^ + [ 6.596714] ffff000003325900: f7 f7 f7 f7 fe fe fe fe fe fe fe fe fe fe fe fe + [ 6.597309] ffff000003325a00: f4 f4 fe fe fe fe fe fe fe fe fe fe fe fe fe fe + [ 6.597925] ================================================================== + [ 6.598513] Disabling lock debugging due to kernel taint + [ 6.600353] ok 1 - kmalloc_node_oob_span_right + [ 6.600554] ok 1 - kasan + +But it still fails this patch until: + + target/arm: Fix unaligned checks for mte_check1, mte_probe1 + +So I guess that is the one that should have the buglink. 
+ +Anyway code wise: + +Reviewed-by: Alex Bennée <email address hidden> + +-- +Alex Bennée + + + +Peter Maydell <email address hidden> writes: + +> On Wed, 7 Apr 2021 at 19:54, Alex Bennée <email address hidden> wrote: +>> +>> +>> Richard Henderson <email address hidden> writes: +>> +>> > We were incorrectly assuming that only the first byte of an MTE access +>> > is checked against the tags. But per the ARM, unaligned accesses are +>> > pre-decomposed into single-byte accesses. So by the time we reach the +>> > actual MTE check in the ARM pseudocode, all accesses are aligned. +>> > +>> > Therefore, the first failure is always either the first byte of the +>> > access, or the first byte of the granule. +>> > +>> > In addition, some of the arithmetic is off for last-first -> count. +>> > This does not become directly visible until a later patch that passes +>> > single bytes into this function, so ptr == ptr_last. +>> > +>> > Buglink: https://bugs.launchpad.net/bugs/1921948 +>> +>> Minor note: you can Cc: Bug 1921948 <email address hidden> to +>> automatically copy patches to the appropriate bugs which is useful if +>> you don't have the Cc for the reporter. +> +> I'm not sure cc'ing bugs on patches is very useful, though (and especially +> not for big series). I usually prefer to just add a note to the bug with +> the URL of the series in patchew afterwards. + +Ideally the tooling would pick up bug links in Patchew and automatically +update the relevant bugs when new series are posted. I only mentioned it +because the original bug reporters weren't Cc'ed on the patches and lo +now they know about the iteration they have tested it ;-) + +-- +Alex Bennée + + +It looks like there's still a bug here: I'm seeing false positive MTE faults for unaligned accesses that touch multiple pages. 
This userspace assembly program demonstrates the problem, but for some reason it only reproduces some of the time for me: + +.arch_extension memtag + +.globl _start +_start: +mov x0, #0x37 // PR_SET_TAGGED_ADDR_CTRL +mov x1, #0x3 // PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC +mov x2, #0 +mov x3, #0 +mov x4, #0 +mov x8, #0xa7 // prctl +svc #0 + +mov x0, xzr +mov w1, #0x2000 +mov w2, #0x23 // PROT_READ|PROT_WRITE|PROT_MTE +mov w3, #0x22 // MAP_PRIVATE|MAP_ANONYMOUS +mov w4, #0xffffffff +mov x5, xzr +mov x8, #0xde // mmap +svc #0 + +mov x1, #(1 << 56) +add x0, x0, x1 +add x0, x0, #0xff0 +stg x0, [x0] +stg x0, [x0, #16] +str x1, [x0, #12] + +mov x0, #0 +mov x8, #0x5d // exit +svc #0 + +(s/PR_MTE_TCF_ASYNC/PR_MTE_TCF_SYNC/g in the above program -- but the actual constant is correct) + +I see something similar in memset + +It SEGV on +stur q0, [x4, #-16] +for x4 set to 0xd000055214fe008 + +and near tags are 0xd000055214fdff0 and 0xd000055214fe000 + +I happened to notice that you're moving your bug tracker to gitlab so I refiled this issue over there: https://gitlab.com/qemu-project/qemu/-/issues/403 + +Thanks for opening the new ticket. I'm closing this one here on Launchpad now so that we don't accidentally migrate it later automatically. 
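For reference, the address arithmetic behind the two reports in this thread can be checked with a few lines of C. This is only a sketch of the arithmetic, not of QEMU's MTE code: the helper names are made up for illustration, 4 KiB pages are assumed, and any page-aligned mmap base gives the same result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MTE_GRANULE 16u       /* one allocation tag covers a 16-byte granule */
#define PAGE_SIZE_4K 4096u    /* assumption: 4 KiB pages */

/* Drop the top byte (bits 63:56), which carries the logical tag. */
uint64_t untagged(uint64_t ptr)
{
    return ptr & 0x00ffffffffffffffULL;
}

/* Count the blocks of size `block` (a power of two) touched by [addr, addr + len). */
uint64_t blocks_touched(uint64_t addr, uint64_t len, uint64_t block)
{
    uint64_t first = untagged(addr) & ~(block - 1);
    uint64_t last = (untagged(addr) + len - 1) & ~(block - 1);
    return (last - first) / block + 1;
}

/* True if the access spans more than one (assumed 4 KiB) page. */
bool crosses_page(uint64_t addr, uint64_t len)
{
    return blocks_touched(addr, len, PAGE_SIZE_4K) > 1;
}
```

For the reproducer, `str x1, [x0, #12]` stores 8 bytes starting at page offset 0xffc, so the access touches two granules (both tagged by the two `stg` instructions) and two pages. The `stur q0, [x4, #-16]` case with x4 = 0xd000055214fe008 likewise spans the granules at 0xd000055214fdff0 and 0xd000055214fe000, matching the two tags quoted above.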
+ diff --git a/results/classifier/zero-shot/105/semantic/1922391 b/results/classifier/zero-shot/105/semantic/1922391 new file mode 100644 index 000000000..fdde4a6e2 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1922391 @@ -0,0 +1,140 @@ +semantic: 0.911 +graphic: 0.878 +assembly: 0.857 +device: 0.838 +other: 0.837 +instruction: 0.815 +boot: 0.796 +vnc: 0.794 +network: 0.784 +socket: 0.761 +mistranslation: 0.731 +KVM: 0.603 + +qemu-system-ppc assertion "!mr->container" failed + +Hi, + +I'm trying to run the NetBSD/macppc 8.2 installer (which is 32-bit ppc) in qemu-system-ppc +version 5.2.0, and I'm hitting this assertion failure quite a bit into the "unpacking sets" +part of the installation procedure, unpacking from the install iso image. + +Qemu is run on a NetBSD/amd64 9.1 host system. The stack backtrace from the core file is + +Program terminated with signal SIGABRT, Aborted. +#0 0x000078859a36791a in _lwp_kill () from /usr/lib/libc.so.12 +[Current thread is 1 (process 1)] +(gdb) where +#0 0x000078859a36791a in _lwp_kill () from /usr/lib/libc.so.12 +#1 0x000078859a3671ca in abort () from /usr/lib/libc.so.12 +#2 0x000078859a2a8507 in __assert13 () from /usr/lib/libc.so.12 +#3 0x000000015a3c19c0 in memory_region_finalize () +#4 0x000000015a3fef1c in object_unref () +#5 0x000000015a3feee6 in object_unref () +#6 0x000000015a374154 in address_space_unmap () +#7 0x000000015a276551 in pmac_ide_atapi_transfer_cb () +#8 0x000000015a150a59 in dma_blk_cb () +#9 0x000000015a46a1c7 in blk_aio_complete () +#10 0x000000015a5a617d in coroutine_trampoline () +#11 0x000078859a264150 in ?? 
() from /usr/lib/libc.so.12 +Backtrace stopped: Cannot access memory at address 0x7884894ff000 +(gdb) + +I start qemu with this small script: + +--- +#!/bin/sh + +MEM=3g +qemu-system-ppc \ + -M mac99,via=pmu \ + -m $MEM \ + -nographic \ + -drive id=hda,format=raw,file=disk.img \ + -L pc-bios \ + -netdev user,id=net0,hostfwd=tcp::2223-:22,ipv6=off \ + -net nic,model=rtl8139,netdev=net0 \ + -boot d \ + -cdrom NetBSD-8.2-macppc.iso +--- + +and boot the install kernel with "boot cd:ofwboot.xcf". If someone wants +to replicate this I can provide more detailed instructions to repeat the +procedure I used to start the install. + +Any hints about what more to look for? + +Regards, + +- Håvard + +Hmm, + +it seems I need to retract this bug. It turns out that the 32-bit macppc port +of NetBSD only supports a maximum of 2GB of memory. As a NetBSD developer said it: + +> The physical memory map on G4 Macs doesn't have room for more than 2G of RAM. + +So, I've set the status of this bug report to "Invalid", as that seemed to be the +best fit. + +Regards, + +- Håvard + + +If the machine can not support more than 2GB, QEMU should report an error when the user tries to assign too many memory, not crash and let it figure out. +Setting the bug status to confirmed. + +Proposed fix: +https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg00570.html + +On 4/7/21 3:11 PM, Mark Cave-Ayland wrote: +> On 06/04/2021 09:48, Philippe Mathieu-Daudé wrote: +> +>> On Mac99 and newer machines, the Uninorth PCI host bridge maps +>> the PCI hole region at 2GiB, so the RAM area beside 2GiB is not +>> accessible by the CPU. Restrict the memory to 2GiB to avoid +>> problems such the one reported in the buglink. 
+>> +>> Buglink: https://bugs.launchpad.net/qemu/+bug/1922391 +>> Reported-by: Håvard Eidnes <email address hidden> +>> Signed-off-by: Philippe Mathieu-Daudé <email address hidden> +>> --- +>> hw/ppc/mac_newworld.c | 4 ++++ +>> 1 file changed, 4 insertions(+) +>> +>> diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c +>> index 21759628466..d88b38e9258 100644 +>> --- a/hw/ppc/mac_newworld.c +>> +++ b/hw/ppc/mac_newworld.c +>> @@ -157,6 +157,10 @@ static void ppc_core99_init(MachineState *machine) +>> } +>> /* allocate RAM */ +>> + if (machine->ram_size > 2 * GiB) { +>> + error_report("RAM size more than 2 GiB is not supported"); +>> + exit(1); +>> + } +>> memory_region_add_subregion(get_system_memory(), 0, machine->ram); +>> /* allocate and load firmware ROM */ +> +> I think the patch is correct, however I'm fairly sure that the default +> g3beige machine also has the PCI hole located at 0x80000000 so the same +> problem exists there too. +> +> Also are you keen to get this merged for 6.0? It doesn't seem to solve a +> security issue/release blocker and I'm sure the current behaviour has +> been like this for a long time... + +No problem. I wanted to revisit this bug anyway, I realized during the +night, while this patch makes QEMU exit cleanly, it hides the bug which +is likely in TYPE_MACIO_IDE (I haven't tried Håvard's full reproducer). + +Regards, + +Phil. 
+ + +Philippe's fix has been merged here: +https://gitlab.com/qemu-project/qemu/-/commit/03b3542ac93cb196bf6a6 + diff --git a/results/classifier/zero-shot/105/semantic/1923197 b/results/classifier/zero-shot/105/semantic/1923197 new file mode 100644 index 000000000..06e794798 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1923197 @@ -0,0 +1,149 @@ +semantic: 0.907 +mistranslation: 0.879 +other: 0.869 +assembly: 0.855 +instruction: 0.854 +socket: 0.832 +device: 0.813 +graphic: 0.805 +KVM: 0.794 +vnc: 0.737 +boot: 0.724 +network: 0.710 + +RISC-V priviledged instruction error + +Hello when performing an MRET with MPP set to something else than 0b11 in MSTATUS, 'Invalid Instruction' exception will be triggered. The problem appeared in code after version 5.2.0. + +<pre> + # setup interrupt handling for monitor mode + la t0, entry_loop + la t1, entry_trap + li t2, 0x888 + li t3, 0x1880 # MPP in MSTATUS selects to which mode to return & MPIE selects if to enable interrupts after MRET + csrw mepc, t0 + csrw mtvec, t1 + csrs mie, t2 + csrs mstatus, t3 + + # if supervisor mode not supported, then loop forever + csrr t0, misa + li t1, 0x40000 + and t2, t1, t0 + beqz t2, 1f + + # setup interrupt i& exception delegation for supervisor mode + li t0, 0xc0000000 # 3 GiB (entry address of supervisor) + li t1, 0x1000 + #li t2, 0x300 # bit 8 & 9 is for ecall from user & supervisor mode + #li t3, 0x222 + csrw mepc, t0 + csrc mstatus, t1 + #csrs medeleg, t2 + #csrs mideleg, t3 + + # pass mhartid as first parameter to supervisor + csrr a0, mhartid + +1: + mret +</pre> + +I'm guessing that this is a bug in your guest as it hasn't configured PMP regions. + +From the RISC-V spec: + +" +If no PMP entry matches an M-mode access, the access succeeds. If no PMP entry matches an +S-mode or U-mode access, but at least one PMP entry is implemented, the access fails. +" + +Confusingly implemented here means implemented in hardware, not just configured. 
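The quoted rule can be modelled in a few lines of C. This is a minimal sketch, not QEMU's implementation: `struct pmp_entry`, `pmp_access_ok` and the plain [lo, hi) range match are hypothetical simplifications (real PMP uses TOR/NA4/NAPOT address encodings, lock bits and separate R/W/X permissions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum priv_mode { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Hypothetical, simplified PMP entry covering the byte range [lo, hi). */
struct pmp_entry {
    bool configured;  /* false models an entry the firmware left switched off */
    uint64_t lo;
    uint64_t hi;
    bool allow;       /* R/W/X collapsed into one bit for this sketch */
};

bool pmp_access_ok(const struct pmp_entry *entries, size_t n_implemented,
                   enum priv_mode mode, uint64_t addr)
{
    if (mode == PRIV_M) {
        /* M-mode is only restricted by locked entries, which are omitted here. */
        return true;
    }
    for (size_t i = 0; i < n_implemented; i++) {
        const struct pmp_entry *e = &entries[i];
        if (e->configured && addr >= e->lo && addr < e->hi) {
            return e->allow;  /* the lowest-numbered matching entry decides */
        }
    }
    /* No match: S/U accesses succeed only if the hart implements no PMP at all. */
    return n_implemented == 0;
}
```

With 16 implemented but unconfigured entries, an S-mode access to 0xc0000000 is rejected while the same access from M-mode passes, which is consistent with the guest having to set up PMP before the `mret`.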
+ +You can check this by reverting this QEMU commit: + +commit d102f19a2085ac931cb998e6153b73248cca49f1 +Author: Atish Patra <email address hidden> +Date: Wed Dec 23 11:25:53 2020 -0800 + + target/riscv/pmp: Raise exception if no PMP entry is configured + + As per the privilege specification, any access from S/U mode should fail + if no pmp region is configured. + + Signed-off-by: Atish Patra <email address hidden> + Reviewed-by: Alistair Francis <email address hidden> + Message-id: <email address hidden> + Signed-off-by: Alistair Francis <email address hidden> + + +Hello Francis, + +I'll configure PMP than do the test again. Sorry I hadn't understood what +changed between version 5.2 and 6.0-rc2, since my code worked before. + +Best regards, +Teodori Serge + +On Thu, 15 Apr 2021, 06:15 Alistair Francis, <email address hidden> +wrote: + +> I'm guessing that this is a bug in your guest as it hasn't configured +> PMP regions. +> +> >From the RISC-V spec: +> +> " +> If no PMP entry matches an M-mode access, the access succeeds. If no PMP +> entry matches an +> S-mode or U-mode access, but at least one PMP entry is implemented, the +> access fails. +> " +> +> Confusingly implemented here means implemented in hardware, not just +> configured. +> +> ** Changed in: qemu +> Status: Confirmed => Invalid +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1923197 +> +> Title: +> RISC-V priviledged instruction error +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1923197/+subscriptions +> + + +We fixed a bug to make QEMU act more like hardware, which now means that PMP must be configured in M-mode. + +Hello Francis, + +Yes thank you. I added code to setup a basic PMP and it works now. 
Thank +you and best regards, + +Teodori Serge + +On Sun, 18 Apr 2021, 05:55 Alistair Francis, <email address hidden> +wrote: + +> We fixed a bug to make QEMU act more like hardware, which now means that +> PMP must be configured in M-mode. +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1923197 +> +> Title: +> RISC-V priviledged instruction error +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1923197/+subscriptions +> + + diff --git a/results/classifier/zero-shot/105/semantic/1924603 b/results/classifier/zero-shot/105/semantic/1924603 new file mode 100644 index 000000000..01c8c84ea --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1924603 @@ -0,0 +1,112 @@ +semantic: 0.906 +other: 0.901 +KVM: 0.898 +graphic: 0.898 +instruction: 0.892 +assembly: 0.867 +device: 0.854 +mistranslation: 0.844 +socket: 0.818 +network: 0.810 +boot: 0.787 +vnc: 0.782 + +Incorrect feature negotiation for vhost-vdpa netdevice + +QEMU cmdline: +============= +./x86_64-softmmu/qemu-system-x86_64 -machine accel=kvm -m 2G -hda /gautam/centos75_1.qcow2 -name gautam,process=gautam -enable-kvm -netdev vhost-vdpa,id=mynet0,vhostdev=/dev/vhost-vdpa-0 -device virtio-net-pci,netdev=mynet0,mac=02:AA:BB:DD:00:20,disable-modern=off,page-per-vq=on -cpu host --nographic + +Host OS: +======== +Linux kernel 5.11 running on x86 host + +Guest OS: +========== +CentOS 7.5 + +Root cause analysis: +===================== + +For vhost-vdpa netdevice, the feature negotiation results in sending the superset of features received from device in call to get_features vdpa ops callback. 
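The effect is easy to model outside QEMU. The following is a minimal sketch, not QEMU code: `ack_features` and `INVALID_FEATURE_BIT` are hypothetical stand-ins, and it assumes the acknowledged set is seeded with the device (backend) features and that guest bits are only ever OR-ed in.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_FEATURE_BIT 0xff  /* hypothetical terminator for the bit list */

/*
 * Fold the guest's requested features into an initial `acked` value.
 * Bits can only be added by this loop, never cleared, so whatever
 * `acked` is seeded with always survives.
 */
uint64_t ack_features(uint64_t acked, const int *feature_bits,
                      uint64_t guest_features)
{
    for (const int *bit = feature_bits; *bit != INVALID_FEATURE_BIT; bit++) {
        uint64_t mask = 1ULL << *bit;
        if (guest_features & mask) {
            acked |= mask;
        }
    }
    return acked;
}
```

Seeding with the backend features and a guest that requests only one of them returns the full backend set unchanged; seeding with 0 returns just the bit the guest asked for.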
+ +During the feature-negotiation phase, the acknowledged feature bits are initialized with backend_features and then checked for supported feature bits in vhost_ack_features(): + +void vhost_net_ack_features(struct vhost_net *net, uint64_t features) +{ + net->dev.acked_features = net->dev.backend_features; + vhost_ack_features(&net->dev, vhost_net_get_feature_bits(net), features); +} + + +The vhost_ack_features() function only adds bits to dev.acked_features and never clears any: + +void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits, uint64_t features) +{ + const int *bit = feature_bits; + + while (*bit != VHOST_INVALID_FEATURE_BIT) { + uint64_t bit_mask = (1ULL << *bit); + + if (features & bit_mask) + hdev->acked_features |= bit_mask; + + bit++; + } +} + +Because of this, hdev->acked_features always contains at least the device features, and this is the value that is passed to the device in the set_features callback: + +static int vhost_dev_set_features(struct vhost_dev *dev, bool enable_log) +{ + uint64_t features = dev->acked_features; + ..... + r = dev->vhost_ops->vhost_set_features(dev, features); +} + +QEMU version: 5.1.0 + +https://qemu-devel.nongnu.narkive.com/jUimpLt0/patch-vhost-net-initialize-acked-features-to-a-safe-value-during-ack +The review of the patch that introduced the "acked_features = backend_features" behaviour suggested that acked_features should be 0 by default, but the patch was pushed as it is now (the argument being that not setting acked_features to backend_features sometimes makes the acked_features value 'unexpected'). + +Other devices initialize acked_features to guest_features (basically what the VM writes to driver_features ANDed with device_features). Changing the default value to either of those (0 as the review initially suggested, or guest_features) should fix this problem for us. + +The QEMU project is currently moving its bug tracking to another system. 
+For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +This ticket has been moved here (thanks, Gautam): +https://gitlab.com/qemu-project/qemu/-/issues/331 +... thus I'm closing this on Launchpad now. + +Thanks Thomas Huth. + +I couldn't find an option to assign the issue on gitlab to anyone. Can you please help with that? + +Not sure who should be the assignee here ... maybe it's best if you write a mail to the people who have been involved in the original code (see https://gitlab.com/qemu-project/qemu/-/commit/108a64818e69be0a97c ) and ask them who could have a look at this issue. 
+ diff --git a/results/classifier/zero-shot/105/semantic/1939 b/results/classifier/zero-shot/105/semantic/1939 new file mode 100644 index 000000000..f0eb12693 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1939 @@ -0,0 +1,77 @@ +semantic: 0.891 +instruction: 0.879 +mistranslation: 0.874 +device: 0.823 +graphic: 0.778 +socket: 0.749 +boot: 0.717 +assembly: 0.661 +network: 0.607 +KVM: 0.554 +other: 0.501 +vnc: 0.489 + +qemu master git can no longer be compiled under MacOs Sonoma 14.0 +Description of problem: + +Steps to reproduce: +Qemu master git fails to compile under MacOs M1/2, I already tested it with "git-bisect" "git bisect good" and "git bisect bad".All dependencies for qemu are fulfilled and were installed using Homebrew under MacOs.It fails with these commits: + + +`>>>>> commit 7c3fb52bcdaef85b15a91b3ca4d1516f9d9b5402 +>>>>> Author: Paolo Bonzini <pbonzini@redhat.com> +>>>>> Date: Tue Aug 8 20:28:25 2023 +0200 +>>>>> +>>>>> configure: never use PyPI for Meson +>>>>> +>>>>> Since there is a vendored copy, there is no point in choosing +>> online +>>>> +>>>>> operation. +>>>>> +>>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> +>> +>>>>> +>>>>> configure | 6 ------ +>>>>> 1 file changed, 6 deletions(-) +>>>>>` +Additional information: +Older sources Qemu 8.1 can be compiled without problems. The only thing that has changed is that I did a major system update and Xcode was also updated. Since then compiling on qemu master version 8.1.50 breaks. + +``` +`On branch master +Your branch is up to date with 'origin/master'. 
+ +nothing to commit, working tree clean +Mac-Studio qemu % ./configure --target-list=ppc-softmmu +Using './build' as the directory for build output +python determined to be '/Library/Frameworks/Python.framework/Versions/3.10/bin/python3' +python version: Python 3.10.8 +mkvenv: Creating non-isolated virtual environment at 'pyvenv' +mkvenv: checking for tomli>=1.2.0 +mkvenv: installing tomli>=1.2.0 +mkvenv: checking for meson>=0.63.0 +mkvenv: installing meson==0.63.3 +mkvenv: checking for sphinx>=1.6 +mkvenv: checking for sphinx_rtd_theme>=0.5 + +'sphinx==5.3.0' not found: +• Python package 'sphinx' was not found nor installed. +• mkvenv was configured to operate offline and did not check PyPI. + + +Sphinx not found/usable, disabling docs. +Disabling PIE due to missing toolchain support +The Meson build system +Version: 0.63.3 +Source dir: /Users/qemu +Build dir: /Users/qemu/build +Build type: native build +Project name: qemu +Project version: 8.1.50 + +../meson.build:1:0: ERROR: Unable to detect linker for compiler `cc -Wl,--version` +stdout: +stderr: ld: unknown options: --version +clang: error: linker command failed with exit code 1 (use -v to see invocation)` +``` diff --git a/results/classifier/zero-shot/105/semantic/1948 b/results/classifier/zero-shot/105/semantic/1948 new file mode 100644 index 000000000..790f1ff62 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/1948 @@ -0,0 +1,16 @@ +semantic: 0.847 +graphic: 0.839 +device: 0.767 +instruction: 0.755 +network: 0.554 +other: 0.546 +vnc: 0.488 +socket: 0.478 +mistranslation: 0.421 +boot: 0.312 +assembly: 0.300 +KVM: 0.282 + +ARM GICv3 cannot support irq number > 992 +Description of problem: +If we want to create a gic with supported irq number 992, we need to set the `num-irq` property to 992 + 32 while 32 is the extra SGI number. But there is a problem, when QEMU initialize GICv3, it will check the variable `num_irq <= 1020 && (num_irq & 32) == 0`, which will lead to error abort. 
So there is no way to bypass the `num_irq <= 1020` check, and we cannot use an IRQ number larger than 992, although according to the ARM GIC specification all IRQ numbers below 1020 should be available. diff --git a/results/classifier/zero-shot/105/semantic/2007 b/results/classifier/zero-shot/105/semantic/2007 new file mode 100644 index 000000000..af21a48f1 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2007 @@ -0,0 +1,42 @@ +semantic: 0.833 +device: 0.796 +instruction: 0.778 +graphic: 0.731 +other: 0.657 +network: 0.617 +assembly: 0.594 +vnc: 0.565 +socket: 0.476 +KVM: 0.441 +boot: 0.428 +mistranslation: 0.424 + +Unable to update APIC_TPR when x2APIC is enabled and -global kvm-pit.lost_tick_policy=discard parameter provided +Description of problem: +I am developing a custom OS and I wanted to implement x2APIC support. I was able to enable x2APIC and read and write some registers, like APIC_VER and APIC_SIVR. Everything looks good, except that I cannot update the APIC_TPR register. Reading it always returns 0. The code I wrote works properly on bare metal. Below are some observations: + +Scenario 1: +1. Enable x2APIC +2. Write to CR8 - success +3. Read from CR8 - gives correct value +4. Read from APIC_TPR - gives correct value + +Scenario 2: +1. Enable x2APIC +2. Read from APIC_TPR - gives 0 +3. Write to APIC_TPR +4. Read from APIC_TPR - gives 0 again + +Scenario 3: +1. Initialize APIC (LAPIC or xAPIC) +2. Write to APIC_TPR +3. Read from APIC_TPR - gives correct value +4. Switch to x2APIC +5. Read from APIC_TPR - gives correct value stored in step 2 +6. Write to APIC_TPR +7. Read from APIC_TPR - gives the value stored in step 2, not the one from step 6! + +It looks like APIC_TPR is stuck at the value stored there before switching to x2APIC and cannot be updated via the MSR. Only updating CR8 works. +I have checked the parameters I passed to qemu. After removing `-global kvm-pit.lost_tick_policy=discard` the problem is gone and APIC_TPR is updated correctly. 
+Additional information: +Please let me know if you need additional information. diff --git a/results/classifier/zero-shot/105/semantic/2064 b/results/classifier/zero-shot/105/semantic/2064 new file mode 100644 index 000000000..294752d8a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2064 @@ -0,0 +1,25 @@ +semantic: 0.619 +device: 0.610 +graphic: 0.580 +instruction: 0.519 +socket: 0.490 +other: 0.489 +mistranslation: 0.448 +network: 0.376 +vnc: 0.375 +assembly: 0.368 +boot: 0.298 +KVM: 0.203 + +QEMU v8.2.0-rc4 and above will not take SMI +Description of problem: +Starting from v8.2.0-rc4, the x86 QEMU system will take SMI from an incorrect starting address. Without any firmware relocation, sending an SMI will move the RIP to 0x8000, instead of the traditional 0x38000. This caused the existing UEFI drivers not functional during SMI relocation step. + +After some investigation, the issue was caused by this commit: https://github.com/qemu/qemu/commit/b5e0d5d22fbffc3d8f7d3e86d7a2d05a1a974e27. There seems to be 2 issues with this change: + +1. This code section https://github.com/qemu/qemu/blob/7425b6277f12e82952cede1f531bfc689bf77fb1/target/i386/tcg/translate.c#L568C1-L572C6 was updated from calculating `cpu_eip` based on `s->pc` to `s->base.pc_next`. This will cause undetermined behavior. +2. This code section https://github.com/qemu/qemu/blob/7425b6277f12e82952cede1f531bfc689bf77fb1/target/i386/tcg/translate.c#L2848C1-L2869C67 added the routine of updating `new_pc`, which is later used `tcg_gen_addi_tl`. This will cause the `new_pc` to be populated with undesirable value and thus cause faulting behaviors. +Steps to reproduce: +1. Launch once booting UEFI firmware, and the system will get stuck at the SMM base relocation logic. +Additional information: +I verified that after fixing the 2 issues mentioned above, the SMI can be correctly invoked at desired location. 
diff --git a/results/classifier/zero-shot/105/semantic/2103 b/results/classifier/zero-shot/105/semantic/2103 new file mode 100644 index 000000000..a5ca3fd23 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2103 @@ -0,0 +1,14 @@ +semantic: 0.517 +device: 0.489 +mistranslation: 0.369 +network: 0.233 +graphic: 0.220 +KVM: 0.175 +vnc: 0.131 +socket: 0.122 +other: 0.119 +boot: 0.110 +instruction: 0.100 +assembly: 0.015 + +docs/system/keys.rst.inc still refers to removed options -alt-grab and -ctrl-grab diff --git a/results/classifier/zero-shot/105/semantic/2185 b/results/classifier/zero-shot/105/semantic/2185 new file mode 100644 index 000000000..ccc0008e4 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2185 @@ -0,0 +1,14 @@ +semantic: 0.621 +instruction: 0.465 +device: 0.454 +mistranslation: 0.338 +network: 0.124 +graphic: 0.116 +boot: 0.102 +vnc: 0.097 +KVM: 0.087 +other: 0.075 +assembly: 0.044 +socket: 0.038 + +spapr watchdog should honour watchdog-set-action etc monitor commands diff --git a/results/classifier/zero-shot/105/semantic/2253 b/results/classifier/zero-shot/105/semantic/2253 new file mode 100644 index 000000000..0a261be6b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2253 @@ -0,0 +1,14 @@ +semantic: 0.818 +network: 0.768 +device: 0.753 +instruction: 0.703 +mistranslation: 0.600 +graphic: 0.574 +socket: 0.299 +other: 0.283 +vnc: 0.233 +assembly: 0.182 +boot: 0.159 +KVM: 0.051 + +NO_CAST.INTEGER_OVERFLOW in /hw/net/eepro100.c diff --git a/results/classifier/zero-shot/105/semantic/2280 b/results/classifier/zero-shot/105/semantic/2280 new file mode 100644 index 000000000..fdb24c4e9 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2280 @@ -0,0 +1,14 @@ +semantic: 0.869 +assembly: 0.795 +device: 0.748 +graphic: 0.726 +mistranslation: 0.605 +network: 0.545 +boot: 0.375 +KVM: 0.360 +other: 0.258 +socket: 0.171 +vnc: 0.132 +instruction: 0.119 + +Not Installing Properly diff --git 
a/results/classifier/zero-shot/105/semantic/237164 b/results/classifier/zero-shot/105/semantic/237164 new file mode 100644 index 000000000..9cb014fb1 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/237164 @@ -0,0 +1,108 @@ +semantic: 0.922 +other: 0.887 +instruction: 0.886 +mistranslation: 0.862 +assembly: 0.856 +graphic: 0.856 +device: 0.855 +socket: 0.824 +network: 0.801 +KVM: 0.785 +boot: 0.735 +vnc: 0.734 + +kvm needs to correctly simulate a proper monitor + +Binary package hint: xorg + +With xserver-xorg-video-cirrus 1.2.1, there should be no need to require special handling for kvm in dexconf any longer. +See also: bug 193323. + +Quote from Bryce: +>Possibly with this fix, some portion of the kvm-specific changes to +>dexconf could be dropped. +> +>If anyone is interested in assisting with this, please file a new bug assigned to me, attach a minimal xorg.conf that has been adequately tested. Here are the current kvm-specific things dexconf is doing: +>a) hardcoding the driver to cirrus +>b) specifying the depth +>c) setting the HorizSync and VertRefresh +>d) specifying the available resolutions +> +>In theory, none of these four things should be necessary, but I suspect +>this bug fix only addresses b and maybe c. Please test if these can be +>removed and if so, file a bug and I can take care of dropping them in +>dexconf. Thanks ahead of time. + +Considering this is a follow-up bug to #193323, it should certainly be marked as 'confirmed', since it is a genuine issue. + + +Since I've compared qemu and kvm sources to find out why kvm works, and qemu doesn't (d'oh *g*), here are my results: +a) not too sure if this is addressed with the update, or if this was a problem in the first place. +b) dexconf sets the depth to 24, which now the driver also does if it finds the corresponding cirrus card +c) haven't seen any implementation difference between qemu/kvm, so it should work +d) same as for c). 
+ +To make FAUmachine (which however has a different cirrus implementation than qemu) work with the old cirrus driver, the only thing that was needed in the first place was to set the default depth to 24bpp. + +However I suggest keeping this bug in the state new, until someone has in fact tested that kvm works with a plain xorg.conf. + +bryce: none of the quirks you are mentioning are actually working in qemu. The relevant part of dexconf that detects kvm is in line 271: + +QEMU_KVM=$(grep "QEMU Virtual CPU" /proc/cpuinfo || true) +if [ -n "$QEMU_KVM" ]; then + DEVICE_DRIVER="cirrus" +fi + +Only kvm reports that in /proc/cpuinfo. qemu reports "Pentium II (Katmai)", which is the very reason why the hardy live cd works in kvm but not in qemu. + +TBH, I'd suggest just stripping the kvm quirks out of dexconf in intrepid right now, and seeing if a daily livecd comes up. I'm pretty confident that it does so. + +Okay, I've stripped those all out of dexconf and repackaged xorg accordingly. Could you please test and verify it works ok? + +http://people.ubuntu.com/~bryce/Testing/xorg/ + +as asked on irc: can you provide a .deb for x11-common there as well? (iirc dexconf is in there... at least it's not in the .debs you put up there ;)) + +Just tested kvm with the hardy cd, installing xserver-xorg-video-cirrus from intrepid, and then x11-common and rerunning dexconf. +gdm comes up, however it uses a smaller resolution by default then. + +I'll attach xorg.conf (as supplied by the dexconf run), and Xorg.0.log (from the start with the new driver/new xorg.log) in a minute. + + + + + +what's the status of this? The kvm environment (still) doesn't seem to autoconfigure too well, that's why the Modes and HorizSync/VertRefresh are hardcoded. + +I just tested this, and Gnome comes up just fine without xorg.conf, however, the screen resolution is a sad little 800x600 without xorg.conf. It's 1024x768 with xorg.conf. 
+ +:-Dustin + +kirkland confirmed that kvm still does not work properly without these quirks, so they cannot be dropped at this time. Feel free to reopen the xorg task if this situation changes, but moving the issue to kvm for now. + +Hi, + +to fix the kvm issue, kvm needs to simulate a monitor attached to the cirrus card, together with an EDID EEPROM delivering the correct data for monitor modes. The simulated cirrus card should provide this via register sr8. A sample implementation showing how to do it can be found at www.faumachine.org (cvs checkout, see node-pc/simulator/chip_cirrus_gd5446.c -- based on qemu -- and lib/sig_i2c_bus.c and finally node-pc/monitor.c for the EDID contents). + +Feel free to ask if anything is unclear. + +Cheers, + Stefan. + +Hey Stefan- + +There was actually some discussion upstream among KVM and Xorg developers. I think they determined that this was a 'won't fix' situation, but I need to check that. Let me track that down... + + +:-Dustin + +As a workaround, the driver itself can force the resolution to a certain degree. This is covered in bug #349331 + +Isn't the issue here that the emulated card has too little video memory, forcing 800x600 when the driver selects the default 24bpp depth? + +This is an issue with some very old real hardware too. + +I guess X could account for that but due to its architecture every driver would likely have a separate check for this condition (S3, cirrus, and any other driver that could possibly be used with such a low-mem card). + +Since cirrus is not the preferred graphics card in QEMU anymore, and there hasn't been any update to this within the last four years, I think nobody will take care of this ticket anymore, so setting the status to "Won't fix" now. 
+ diff --git a/results/classifier/zero-shot/105/semantic/2378 b/results/classifier/zero-shot/105/semantic/2378 new file mode 100644 index 000000000..15b4a8c65 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2378 @@ -0,0 +1,41 @@ +semantic: 0.891 +graphic: 0.883 +other: 0.793 +device: 0.757 +network: 0.690 +vnc: 0.675 +instruction: 0.643 +mistranslation: 0.581 +socket: 0.548 +KVM: 0.493 +assembly: 0.430 +boot: 0.373 + +make install (meson?) removes needed RPATH for libslirp, making build on CentOS 9 difficult +Description of problem: +make install appears to remove needed RPATH attributes from the binary, making it difficult if not impossible to install Qemu 9.0.0 on a CentOS 9 machine. + +I'm trying to build Qemu 9.0.0 on a CentOS 9 Stream machine where I do not have root. +The system ships with libslirp-4.4.0-7.el9.src.rpm which is libslirp 4.4.0, which is too old for Qemu. + +I checked out https://gitlab.freedesktop.org/slirp/libslirp.git which is 2 commits more recent than +libslirp 4.8.0. I installed this version in a separate directory. + +When I configure Qemu using PKG_CONFIG_PATH, it builds the correct executable with the correct RPATH. +readelf -d shows: + + 0x000000000000000f (RPATH) Library rpath: [/web/courses/cs4284/pintostools/lib64] + +which is the correct directory where the proper version of libslirp is located. + +However, when I run "make install" the RPATH attribute is removed. Thus, Qemu resorts to the system version, which is version 4.4 (with which Qemu won't run.) + +Meson's propensity to strip necessary RPATHs appears to be well-known, see, for instance, + +https://github.com/mesonbuild/meson/issues/4027 + +(There is a fix for at least some of the problems in 0.55.0 of Meson +https://mesonbuild.com/Release-notes-for-0-55-0.html +Qemu 9.0.0 appears to use Meson 1.2.3, yet it still fails.) + +Work-around: don't use make install; copy the binary directly from the build directory to the destination directory. 
diff --git a/results/classifier/zero-shot/105/semantic/2393 b/results/classifier/zero-shot/105/semantic/2393 new file mode 100644 index 000000000..348595d4d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2393 @@ -0,0 +1,33 @@ +semantic: 0.874 +boot: 0.872 +graphic: 0.849 +other: 0.817 +instruction: 0.781 +device: 0.771 +vnc: 0.700 +socket: 0.543 +KVM: 0.535 +network: 0.512 +mistranslation: 0.461 +assembly: 0.460 + +qemu: seabios hangs for 10~15 sec at boot with `-machine q35` +Description of problem: +Whenever i'm starting a virtual machine i'm having the issue that seabios (or at least that's what i see) hangs for about 10~15 seconds. During that time one of the cpu cores runs at 100%. +This issue isn't new actually. I've had it for quite some time already, i think for at least the last 2 major versions. I haven't looked into it since it isn't a big issue, just annoying. +Today i've looked into it and as far as i can see, this issue is always present with the flag `-machine q35`, which is the default for my vm's. If i set it to `-machine pc`, booting works as expected. However i also found a "workaround" where the vm's start immediately (with `-machine q35` enabled), which is simply adding an iso image to the command line (via -cdrom) - even though it's not used. + +This means: +- 15 sec delay: qemu-system-x86_64 -machine q35 +- works immediately: qemu-system-x86_64 -machine q35 -cdrom /mnt/data/vm/isos/openSUSE-Tumbleweed-DVD-x86_64-Snapshot20230303-Media.iso + +Please note that most of my vm's usually start booting from a kernel image directly (-kernel /mnt/data/vm/kernel/gentoo-latest -initrd /mnt/data/vm/kernel/initrd-v5.cpio.gz) - but even in that case setting a cdrom (image) would fix the issue. +Also, the image needs to be a valid one, if i set an empty file or /dev/null the issue would remain. +Furthermore, i have the same issue on a second computer. This one also runs Gentoo Linux and is also an AMD Ryzen. 
(in case this is relevant) +Steps to reproduce: +1. qemu-system-x86_64 -machine q35 +2. wait about 10-15sec before boot continues +Additional information: +I was thinking to add an Screenshot of the hanging boot process, but the only text written there is: +SeaBIOS (version 1.16.0-20220807_005459-localhost) +with a blinking cursor below diff --git a/results/classifier/zero-shot/105/semantic/2434 b/results/classifier/zero-shot/105/semantic/2434 new file mode 100644 index 000000000..e72faa48b --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2434 @@ -0,0 +1,42 @@ +semantic: 0.691 +other: 0.646 +graphic: 0.625 +instruction: 0.625 +device: 0.611 +assembly: 0.608 +socket: 0.552 +mistranslation: 0.538 +vnc: 0.534 +KVM: 0.510 +network: 0.468 +boot: 0.467 + +qemu fails to build tests/unit/test-nested-aio-poll with errors about writing <N> bytes into a region of size <M> overflows the destination +Description of problem: +Fails to compile from source with: +``` +[2/2] Linking target tests/unit/test-nested-aio-poll +FAILED: tests/unit/test-nested-aio-poll +cc -m64 -o tests/unit/test-nested-aio-poll libevent-loop-base.a.p/event-loop-base.c.o libqom.a.p/qom_container.c.o libqom.a.p/qom_object.c.o libqom.a.p/qom_object_interfaces.c.o libqom.a.p/qom_qom-qobject.c.o libblock.a.p/block.c.o libblock.a.p/blockjob.c.o libblock.a.p/job.c.o libblock.a.p/qemu-io-cmds.c.o libblock.a.p/replication.c.o libblock.a.p/nbd_client.c.o libblock.a.p/nbd_client-connection.c.o libblock.a.p/nbd_common.c.o libblock.a.p/scsi_utils.c.o libblock.a.p/scsi_pr-manager.c.o libblock.a.p/scsi_pr-manager-helper.c.o libblock.a.p/block_accounting.c.o libblock.a.p/block_aio_task.c.o libblock.a.p/block_amend.c.o libblock.a.p/block_backup.c.o libblock.a.p/block_blkdebug.c.o libblock.a.p/block_blklogwrites.c.o libblock.a.p/block_blkverify.c.o libblock.a.p/block_block-backend.c.o libblock.a.p/block_block-copy.c.o libblock.a.p/block_commit.c.o libblock.a.p/block_copy-before-write.c.o 
libblock.a.p/block_copy-on-read.c.o libblock.a.p/block_create.c.o libblock.a.p/block_crypto.c.o libblock.a.p/block_dirty-bitmap.c.o libblock.a.p/block_filter-compress.c.o libblock.a.p/block_graph-lock.c.o libblock.a.p/block_io.c.o libblock.a.p/block_mirror.c.o libblock.a.p/block_nbd.c.o libblock.a.p/block_null.c.o libblock.a.p/block_preallocate.c.o libblock.a.p/block_progress_meter.c.o libblock.a.p/block_qapi.c.o libblock.a.p/block_qcow2.c.o libblock.a.p/block_qcow2-bitmap.c.o libblock.a.p/block_qcow2-cache.c.o libblock.a.p/block_qcow2-cluster.c.o libblock.a.p/block_qcow2-refcount.c.o libblock.a.p/block_qcow2-snapshot.c.o libblock.a.p/block_qcow2-threads.c.o libblock.a.p/block_quorum.c.o libblock.a.p/block_raw-format.c.o libblock.a.p/block_reqlist.c.o libblock.a.p/block_snapshot.c.o libblock.a.p/block_snapshot-access.c.o libblock.a.p/block_throttle.c.o libblock.a.p/block_throttle-groups.c.o libblock.a.p/block_write-threshold.c.o libblock.a.p/block_qcow.c.o libblock.a.p/block_vdi.c.o libblock.a.p/block_vhdx-endian.c.o libblock.a.p/block_vhdx-log.c.o libblock.a.p/block_vhdx.c.o libblock.a.p/block_vmdk.c.o libblock.a.p/block_vpc.c.o libblock.a.p/block_cloop.c.o libblock.a.p/block_bochs.c.o libblock.a.p/block_vvfat.c.o libblock.a.p/block_dmg.c.o libblock.a.p/block_qed-check.c.o libblock.a.p/block_qed-cluster.c.o libblock.a.p/block_qed-l2-cache.c.o libblock.a.p/block_qed-table.c.o libblock.a.p/block_qed.c.o libblock.a.p/block_parallels.c.o libblock.a.p/block_parallels-ext.c.o libblock.a.p/block_file-posix.c.o libblock.a.p/block_iscsi-opts.c.o libblock.a.p/block_nvme.c.o libblock.a.p/block_replication.c.o libblock.a.p/block_linux-aio.c.o libblock.a.p/block_io_uring.c.o libblock.a.p/block_stream.c.o libblock.a.p/block_monitor_bitmap-qmp-cmds.c.o libblock.a.p/block_blkio.c.o libblock.a.p/block_curl.c.o libblock.a.p/block_gluster.c.o libblock.a.p/block_iscsi.c.o libblock.a.p/block_nfs.c.o libblock.a.p/block_ssh.c.o libblock.a.p/block_dmg-bz2.c.o 
libblock.a.p/meson-generated_.._block_block-gen.c.o libcrypto.a.p/crypto_afsplit.c.o libcrypto.a.p/crypto_akcipher.c.o libcrypto.a.p/crypto_block-luks.c.o libcrypto.a.p/crypto_block-qcow.c.o libcrypto.a.p/crypto_block.c.o libcrypto.a.p/crypto_cipher.c.o libcrypto.a.p/crypto_der.c.o libcrypto.a.p/crypto_hash.c.o libcrypto.a.p/crypto_hmac.c.o libcrypto.a.p/crypto_ivgen-essiv.c.o libcrypto.a.p/crypto_ivgen-plain.c.o libcrypto.a.p/crypto_ivgen-plain64.c.o libcrypto.a.p/crypto_ivgen.c.o libcrypto.a.p/crypto_pbkdf.c.o libcrypto.a.p/crypto_secret_common.c.o libcrypto.a.p/crypto_secret.c.o libcrypto.a.p/crypto_tlscreds.c.o libcrypto.a.p/crypto_tlscredsanon.c.o libcrypto.a.p/crypto_tlscredspsk.c.o libcrypto.a.p/crypto_tlscredsx509.c.o libcrypto.a.p/crypto_tlssession.c.o libcrypto.a.p/crypto_rsakey.c.o libcrypto.a.p/crypto_hash-gnutls.c.o libcrypto.a.p/crypto_hmac-gnutls.c.o libcrypto.a.p/crypto_pbkdf-gnutls.c.o libcrypto.a.p/crypto_secret_keyring.c.o libauthz.a.p/authz_base.c.o libauthz.a.p/authz_list.c.o libauthz.a.p/authz_listfile.c.o libauthz.a.p/authz_simple.c.o libauthz.a.p/authz_pamacct.c.o libio.a.p/io_channel-buffer.c.o libio.a.p/io_channel-command.c.o libio.a.p/io_channel-file.c.o libio.a.p/io_channel-null.c.o libio.a.p/io_channel-socket.c.o libio.a.p/io_channel-tls.c.o libio.a.p/io_channel-util.c.o libio.a.p/io_channel-watch.c.o libio.a.p/io_channel-websock.c.o libio.a.p/io_channel.c.o libio.a.p/io_dns-resolver.c.o libio.a.p/io_net-listener.c.o libio.a.p/io_task.c.o tests/unit/test-nested-aio-poll.p/test-nested-aio-poll.c.o tests/unit/test-nested-aio-poll.p/iothread.c.o -Werror -flto -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -pie -Wl,-z,relro -Wl,-z,now -march=native -fno-omit-frame-pointer -Wl,-rpath,/usr/lib64/iscsi -Wl,-rpath-link,/usr/lib64/iscsi -Wl,--start-group libqemuutil.a subprojects/libvhost-user/libvhost-user-glib.a subprojects/libvhost-user/libvhost-user.a /usr/lib64/libzstd.so /usr/lib64/libz.so /usr/lib64/iscsi/libiscsi.so -laio 
/usr/lib64/liburing.so -lblkio /usr/lib64/libcurl.so /usr/lib64/libacl.so /usr/lib64/libgfapi.so /usr/lib64/libglusterfs.so /usr/lib64/libgfrpc.so /usr/lib64/libgfxdr.so /usr/lib64/libuuid.so /usr/lib64/libnfs.so /usr/lib64/libssh.so /usr/lib64/libglib-2.0.so /usr/lib64/libgmodule-2.0.so -pthread -lbz2 /usr/lib64/libgnutls.so -lpam -lnuma /usr/lib64/libgio-2.0.so /usr/lib64/libgobject-2.0.so -lm -Wl,--end-group +In function ‘aio_notify’, + inlined from ‘aio_bh_enqueue’ at ../util/async.c:96:5, + inlined from ‘aio_bh_schedule_oneshot_full’ at ../util/async.c:139:5, + inlined from ‘aio_wait_kick.part.0’ at ../util/aio-wait.c:54:9: +../util/async.c:494:5: error: ‘__atomic_store_1’ writing 1 byte into a region of size 0 overflows the destination [-Werror=stringop-overflow=] + 494 | qatomic_set(&ctx->notified, true); + | ^ +In function ‘aio_wait_kick.part.0’: +lto1: note: destination object is likely at address zero +In function ‘aio_notify’, + inlined from ‘aio_bh_enqueue’ at ../util/async.c:96:5, + inlined from ‘aio_bh_schedule_oneshot_full’ at ../util/async.c:139:5, + inlined from ‘aio_wait_kick.part.0’ at ../util/aio-wait.c:54:9: +../util/async.c:501:9: error: ‘__atomic_load_4’ writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] + 501 | if (qatomic_read(&ctx->notify_me)) { + | ^ +In function ‘aio_wait_kick.part.0’: +lto1: note: destination object is likely at address zero +lto1: all warnings being treated as errors +``` +Steps to reproduce: +1. Build qemu from source, probably with LTO enabled and recent GCC. 
diff --git a/results/classifier/zero-shot/105/semantic/2449 b/results/classifier/zero-shot/105/semantic/2449 new file mode 100644 index 000000000..1f75beadd --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2449 @@ -0,0 +1,14 @@ +semantic: 0.515 +mistranslation: 0.228 +graphic: 0.147 +device: 0.100 +boot: 0.037 +vnc: 0.030 +other: 0.025 +assembly: 0.019 +socket: 0.015 +network: 0.014 +instruction: 0.014 +KVM: 0.006 + +How to extract FIS (personal question) diff --git a/results/classifier/zero-shot/105/semantic/2457 b/results/classifier/zero-shot/105/semantic/2457 new file mode 100644 index 000000000..13d6c5d66 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2457 @@ -0,0 +1,14 @@ +semantic: 0.632 +other: 0.510 +mistranslation: 0.497 +device: 0.452 +network: 0.288 +KVM: 0.272 +vnc: 0.259 +socket: 0.166 +graphic: 0.165 +boot: 0.084 +instruction: 0.054 +assembly: 0.010 + +Building plugin sources doesn't produce any output to 'make' diff --git a/results/classifier/zero-shot/105/semantic/2460 b/results/classifier/zero-shot/105/semantic/2460 new file mode 100644 index 000000000..f73329524 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2460 @@ -0,0 +1,21 @@ +semantic: 0.903 +mistranslation: 0.895 +graphic: 0.880 +instruction: 0.799 +device: 0.695 +other: 0.671 +network: 0.551 +vnc: 0.543 +socket: 0.528 +boot: 0.359 +assembly: 0.305 +KVM: 0.082 + +Significant performance degradation of qemu-x86_64 starting from version 3 on aarch64 +Description of problem: +When I ran CoreMark with different qemu user-mode versions (guest x86-64 -> host arm64), I found that the performance was highest with QEMU 2.x versions, and there was a significant performance degradation starting from QEMU version 3. What is the reason? 
+ +| | | | | | | | | | | | | +|------------------------------------------|-------------|-------------|-------------|-------------|-------------|-------------|------------|-------------|-------------|-------------|-------------| +| qemu version | 2.5.1 | 2.8.0 | 2.9.0 | 2.9.1 | 3.0.0 | 4.0.0 | 5.2.0 | 6.2.0 | 7.2.13 | 8.2.6 | 9.0.1 | +| coremark score | 3905.995703 | 4465.947153 | 4534.119247 | 4538.577912 | 1167.337886 | 1163.399453 | 928.348384 | 1327.051954 | 1301.659616 | 1034.714677 | 1085.304971 | diff --git a/results/classifier/zero-shot/105/semantic/2562 b/results/classifier/zero-shot/105/semantic/2562 new file mode 100644 index 000000000..402ac627d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2562 @@ -0,0 +1,65 @@ +semantic: 0.856 +graphic: 0.851 +other: 0.842 +boot: 0.841 +assembly: 0.839 +instruction: 0.831 +device: 0.792 +mistranslation: 0.748 +socket: 0.695 +network: 0.682 +KVM: 0.634 +vnc: 0.582 + +Booting EFI shell from GRUB using "chainloader" in Qemu with UEFI boot shows video artifacts if we have all_video, gfxterm +Steps to reproduce: +- Start Qemu in UEFI mode, i. e. `qemu-system-x86_64 -bios OVMF.fd ...` +- Qemu should load GRUB from the disk as the first thing after firmware +- GRUB should run commands `loadfont unicode; insmod all_video; terminal_output gfxterm` (note: this is perfectly ordinary sequence executed by Debian's default configuration) +- Then GRUB should execute EFI shell using `chainloader` command + +If we do all this, then instead of EFI shell we will see broken image. I. e. video output will be completely broken/mangled/damaged. But EFI shell will still respond to commands. If we type "exit", then we will exit from EFI shell back to GRUB. + +I will repeat: my configuration is not special at all. `loadfont unicode; insmod all_video; terminal_output gfxterm` are absolutely ordinary commands executed by Debian's GRUB default setup. 
So, essentially this bug means this: if I add EFI shell to GRUB menu in Debian, then this new menu entry will not work properly if I try to boot in Qemu in UEFI mode. + +Okay, now let me give you more detailed steps to reproduce. + +- Execute the following script on Linux x86_64 host: +```bash +#!/bin/bash +# This script was tested on Debian trixie (as on 2024-09-07) with the following packages installed: +# dosfstools grub-efi-amd64-bin qemu-system-x86 ovmf efi-shell-x64 +set -e +DIR="$(mktemp -d /tmp/qemu-bug-XXXXXX)" +truncate --size=100M "$DIR/disk" +echo ',+,' | sfdisk --label gpt "$DIR/disk" +LOOP="$(losetup --find --show --partscan --nooverlap "$DIR/disk")" +sleep 1 +mkfs.vfat "${LOOP}p1" +mkdir "$DIR/root" +mount "${LOOP}p1" "$DIR/root" +losetup --detach "$LOOP" +mkdir -p "$DIR/root/EFI/boot" "$DIR/root/boot/grub/fonts" +grub-mkimage --format=x86_64-efi --output="$DIR/root/EFI/boot/bootx64.efi" --prefix=/boot/grub part_gpt fat +cp -r /usr/lib/grub/x86_64-efi "$DIR/root/boot/grub" +cp /usr/share/efi-shell-x64/shellx64.efi "$DIR/root/boot" +cp /usr/share/grub/unicode.pf2 "$DIR/root/boot/grub/fonts" +cat << "EOF" > "$DIR/root/boot/grub/grub.cfg" +loadfont unicode +insmod all_video +terminal_output gfxterm +menuentry "EFI shell" { + chainloader /boot/shellx64.efi +} +EOF +umount "$DIR/root" +qemu-system-x86_64 -m 2048 -bios OVMF.fd -drive file="$DIR/disk",format=raw +``` +- When you see Qemu window, choose "EFI shell" menu entry in GRUB menu +- You will immediately see damaged video output instead of proper EFI shell + +This bug doesn't reproduce on real hardware, i. e. without Qemu!!! I. e. this is Qemu bug. Qemu task is to duplicate real hardware behaviour. On real hardware there is no this bug, so Qemu should not have it, either. + +Note: if I remove `loadfont unicode; insmod all_video; terminal_output gfxterm`, then the bug disappears. + +Also note: if I replace `all_video` with `efi_gop`, then the bug disappears, too. 
So, the workaround is to use `efi_gop` instead of `all_video` in UEFI mode. But I still believe the bug is in Qemu, because `all_video` doesn't cause any problems on real hardware, so Qemu should work, too. diff --git a/results/classifier/zero-shot/105/semantic/2582 b/results/classifier/zero-shot/105/semantic/2582 new file mode 100644 index 000000000..e452cee5f --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2582 @@ -0,0 +1,36 @@ +semantic: 0.962 +instruction: 0.883 +device: 0.784 +graphic: 0.771 +vnc: 0.674 +KVM: 0.625 +network: 0.506 +socket: 0.498 +boot: 0.327 +assembly: 0.274 +mistranslation: 0.252 +other: 0.076 + +CR4.VMX leaks from L1 into L2 on Intel VMX +Description of problem: +In a nested virtualization setting, `savevm` can cause CR4 bits to leak from L1 into L2. This causes general-protection faults in certain guests. + +The L2 guest executes this code: + +``` +mov rax, cr4 ; Get CR4 +mov rcx, rax ; Remember the old value +btc rax, 7 ; Toggle CR4.PGE +mov cr4, rax ; #GP! <- Shouldn't happen! +mov cr4, rcx ; Restore old value +``` + +If the guest code is interrupted at the right time (e.g. via `savevm`), Qemu marks CR4 dirty while the guest executes L2 code. Due to really complicated KVM semantics, this will result in L1 CR4 bits (VMXE) leaking into the L2 guest and the L2 will die with a GP: + +Instead of the expected CR4 value, the L2 guest reads a value with VMXE set. When it tries to write this back into CR4, this triggers the general protection fault. +Steps to reproduce: +This is only an issue on **Intel** systems. 
+ +# +Additional information: +See also this discussion where we discussed a (flawed) approach to fixing this in KVM: https://lore.kernel.org/lkml/Zh6WlOB8CS-By3DQ@google.com/t/ diff --git a/results/classifier/zero-shot/105/semantic/2649 b/results/classifier/zero-shot/105/semantic/2649 new file mode 100644 index 000000000..4e0dd6883 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2649 @@ -0,0 +1,53 @@ +semantic: 0.085 +mistranslation: 0.075 +device: 0.074 +graphic: 0.059 +instruction: 0.057 +network: 0.045 +other: 0.044 +assembly: 0.043 +boot: 0.033 +vnc: 0.029 +socket: 0.025 +KVM: 0.008 + +Data corruption with qcow2 images +Steps to reproduce: +``` +# Create an example file with old version of qemu-img and fill it with random data. +$ qemu-img-8.2.2 create -f qcow2 file.qcow2 600000000000 +$ qemu-nbd-8.2.2 -c /dev/nbd0 file.qcow2 +$ dd if=/dev/random of=/dev/nbd0 bs=1000000 count=600000 +$ qemu-nbd-8.2.2 -d /dev/nbd0 +/dev/nbd0 disconnected + +# Get the correct checksum of both qcow2 file and its contents +$ sha256sum -b file.qcow2 +ca471f6822af4fcf3c81bc5cc671493be06a837b71b43c1f747042759da587b9 *file.qcow2 +$ qemu-nbd-8.2.2 -r -c /dev/nbd0 file.qcow2 +$ sha256sum -b /dev/nbd0 +5dac11e88f891740da3b655588b2e62037962d1ba6377efce30124d6224dd0d1 */dev/nbd0 +$ qemu-nbd-8.2.2 -d /dev/nbd0 +/dev/nbd0 disconnected + +# Use the qcow2 file with new version. +# We're using qemu-nbd here, but the same happens when qcow2 is attached to a guest +# running in the new version qemu-system-86_64-9.1.1 and can be seen through guest's +# /dev/vda. +# Note that the checksum is different than before, and also non-deterministic +# (running sha256sum twice produces different results even though the file is +# read-only and hasn't changed). 
+$ sha256sum -b file.qcow2 +ca471f6822af4fcf3c81bc5cc671493be06a837b71b43c1f747042759da587b9 *file.qcow2 +$ qemu-nbd-9.1.1 -r -c /dev/nbd0 file.qcow2 +$ sha256sum -b /dev/nbd0 +1793a38b9b964d3fc643629284722373e9d5dedea68e35900ace777b57688926 */dev/nbd0 +$ sha256sum -b /dev/nbd0 +98f900f9cd174493d0bfcf06e2bc86f5ee99dfa04c90d6832fa941e384b62d49 */dev/nbd0 +$ qemu-nbd-9.1.1 -d /dev/nbd0 +/dev/nbd0 disconnected +$ sha256sum -b file.qcow2 +ca471f6822af4fcf3c81bc5cc671493be06a837b71b43c1f747042759da587b9 *file.qcow2 +``` +Additional information: +No errors in either host or guest logs. When using a qcow2 with an actual filesystem, you may see reports of corruption from the filesystem driver. diff --git a/results/classifier/zero-shot/105/semantic/2704 b/results/classifier/zero-shot/105/semantic/2704 new file mode 100644 index 000000000..8ddf52059 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2704 @@ -0,0 +1,315 @@ +semantic: 0.940 +network: 0.917 +assembly: 0.916 +mistranslation: 0.906 +graphic: 0.905 +boot: 0.901 +instruction: 0.890 +device: 0.881 +KVM: 0.877 +other: 0.870 +socket: 0.784 +vnc: 0.740 + +Error when migrating s390x VM from QEMU 9.0 to 9.1: Unknown savevm section or instance 's390_css' +Description of problem: +I have been working on merging QEMU 9.1.1 (directly from Debian unstable), and I'm seeing this problem when trying to migrate an s390x VM from an Oracular host (which runs QEMU 9.0.2) to a Plucky host (which runs QEMU 9.1.1). + +The problem only happens on s390x (host and guest), and only when attempting to migrate from Oracular to Plucky. Migrations between Oracular guests work fine, as well as migrations between Plucky guests. + +This is the error I see after invoking `virsh migrate`: + +``` +error: internal error: QEMU unexpectedly closed the monitor (vm='kvmguest-jammy-normal'): +2024-11-27T21:13:43.745625Z qemu-system-s390x: Unknown savevm section or instance 's390_css' 0. 
Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices +2024-11-27T21:13:43.746914Z qemu-system-s390x: load of migration failed: Invalid argument +``` +Steps to reproduce: +I only have one s390x machine available, so I am resorting to creating two LXD containers that are KVM-capable. One of the containers runs Oracular, the other runs Plucky. Please let me know if you would like instructions on how to create such containers. + +Inside the Oracular container, using `uvt-kvm` to simplify the process of creating the VM: + +``` +# uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=s390x label=daily release=oracular +# cat > guesttemplate.xml << _EOF_ +<domain type='kvm'> + <os> + <type>hvm</type> + <boot dev='hd'/> + </os> + <devices> + <interface type='network'> + <source network='default'/> + <model type='virtio'/> + </interface> + <console type='pty' tty='/dev/pts/3'> + <source path='/dev/pts/3'/> + <target type='sclp' port='0'/> + <alias name='console0'/> + </console> + <channel type='unix'> + <target type='virtio' name='org.qemu.guest_agent.0'/> + </channel> + </devices> +</domain> +_EOF_ +# uvt-kvm create --template /root/guesttemplate.xml --machine-type s390-ccw-virtio-9.0 --password=ubuntu --ssh-public-key-file /home/ubuntu/.ssh/authorized_keys kvmguest-oracular-upstream-cpu release=oracular arch=s390x label=daily +``` + +Wait a moment for the VM to boot, then use `virsh list` to make sure it's running. Note that we force the machine type to be `s390-ccw-virtio-9.0`; this is necessary because Ubuntu overrides the default machine type with its own definition, and we want to make sure to use upstream's type here. + +Make sure you're running at least QEMU 9.1.1 on the Plucky container. Plucky currently ships with QEMU 9.0.2, which doesn't have the problem. If needed, my QEMU 9.1.1 build can be found at https://launchpad.net/~sergiodj/+archive/ubuntu/qemu. 
+ +After everything is in place, try to migrate the machine: + +``` +# virsh migrate --unsafe --live kvmguest-oracular-upstream-cpu qemu+ssh://plucky-container-IP-here/system +error: internal error: QEMU unexpectedly closed the monitor (vm='kvmguest-oracular-upstream-cpu'): 2024-11-29T22:28:21.417201Z qemu-system-s390x: Unknown savevm section or instance 's390_css' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices +2024-11-29T22:28:21.417496Z qemu-system-s390x: load of migration failed: Invalid argument +``` +Additional information: +libvirt log from Oracular (QEMU 9.0.2): + +``` +LC_ALL=C \ [2/1817] +PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/snap/bin \ +USER=root \ +HOME=/var/lib/libvirt/qemu/domain-3-kvmguest-oracular-up \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-3-kvmguest-oracular-up/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-3-kvmguest-oracular-up/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-3-kvmguest-oracular-up/.config \ +/usr/bin/qemu-system-s390x \ +-name guest=kvmguest-oracular-upstream-cpu,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-3-kvmguest-oracular-up/master-key.aes"}' \ +-machine s390-ccw-virtio-9.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \ +-accel kvm \ +-cpu z13.2-base,aen=on,aefsi=on,diag318=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,edat2=on,vx=on,ipter=on,cei=on,ap=on,gpereh=on,esop=on,ib=on,siif=on,ibs=on,apqi=on,apft=on,els=on,sief2=on,apqci=on,cte=on,ais=on,bpb=on,64bscao=on,ctop=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on,gsls=on \ +-m size=524288k \ +-object '{"qom-type":"memory-backend-ram","id":"s390.ram","size":536870912}' \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid fa8bcf1a-8982-47ab-9766-ebbb695008e3 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev 
socket,id=charmonitor,fd=38,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0003"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MjQuMTA6czM5MHggMjAyNDExMjY=","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-3-format","read-only":true,"driver":"qcow2","file":"libvirt-3-storage","backing":null}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu.qcow","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-3-format"}' \ +-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-2-format","id":"virtio-disk0","bootindex":1}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu-ds.qcow","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \ +-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0001","drive":"libvirt-1-format","id":"virtio-disk1"}' \ +-netdev '{"type":"tap","fd":"39","id":"hostnet0"}' \ +-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:d8:f0:5c","devno":"fe.0.0002"}' \ +-chardev socket,id=charchannel0,fd=36,server=on,wait=off \ +-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \ +-chardev pty,id=charconsole0 \ +-device '{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \ +-audiodev '{"id":"audio1","driver":"none"}' \ 
+-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0004"}' \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on +char device redirected to /dev/pts/3 (label charconsole0) +2024-11-28 20:56:00.522+0000: initiating migration +2024-11-28T20:56:01.114894Z qemu-system-s390x: Sibling indicated error 1 +warning: old compression is deprecated; use multifd compression methods instead +warning: old compression is deprecated; use multifd compression methods instead +warning: old compression is deprecated; use multifd compression methods instead +warning: block migration is deprecated; use blockdev-mirror with NBD instead +``` + +libvirt log from Plucky (QEMU 9.1.1): + +``` +LC_ALL=C \ +PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/snap/bin \ +USER=root \ +HOME=/var/lib/libvirt/qemu/domain-4-kvmguest-oracular-up \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-4-kvmguest-oracular-up/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-4-kvmguest-oracular-up/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-4-kvmguest-oracular-up/.config \ +/usr/bin/qemu-system-s390x \ +-name guest=kvmguest-oracular-upstream-cpu,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-kvmguest-oracular-up/master-key.aes"}' \ +-machine s390-ccw-virtio-9.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \ +-accel kvm \ +-cpu z13.2-base,aen=on,aefsi=on,diag318=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,edat2=on,vx=on,ipter=on,cei=on,ap=on,gpereh=on,esop=on,ib=on,siif=on,ibs=on,apqi=on,apft=on,els=on,sief2=on,apqci=on,cte=on,ais=on,bpb=on,64bscao=on,ctop=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on,gsls=on \ +-m size=524288k \ +-object '{"qom-type":"memory-backend-ram","id":"s390.ram","size":536870912}' \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid 
fa8bcf1a-8982-47ab-9766-ebbb695008e3 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=35,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0003"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MjQuMTA6czM5MHggMjAyNDExMjY=","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-3-format","read-only":true,"driver":"qcow2","file":"libvirt-3-storage","backing":null}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu.qcow","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-3-format"}' \ +-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-2-format","id":"virtio-disk0","bootindex":1}' \ +-blockdev '{"driver":"file","filename":"/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu-ds.qcow","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \ +-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0001","drive":"libvirt-1-format","id":"virtio-disk1"}' \ +-netdev '{"type":"tap","fd":"36","id":"hostnet0"}' \ +-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:d8:f0:5c","devno":"fe.0.0002"}' \ +-chardev socket,id=charchannel0,fd=34,server=on,wait=off \ +-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \ +-chardev pty,id=charconsole0 \ +-device 
'{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \ +-audiodev '{"id":"audio1","driver":"none"}' \ +-incoming defer \ +-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0004"}' \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on +char device redirected to /dev/pts/3 (label charconsole0) +2024-11-29T22:28:21.417201Z qemu-system-s390x: Unknown savevm section or instance 's390_css' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices +2024-11-29T22:28:21.417496Z qemu-system-s390x: load of migration failed: Invalid argument +``` + +Domain XML: + +```xml +<domain type='kvm' id='3'> + <name>kvmguest-oracular-upstream-cpu</name> + <uuid>fa8bcf1a-8982-47ab-9766-ebbb695008e3</uuid> + <metadata> + <uvt:ssh_known_hosts xmlns:uvt="https://launchpad.net/uvtool/libvirt/1">ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDhWPh2Wfm2Ouh/W+H9IXJGFHfH4UVCB6+EBI0PwuDXR2Ocl4hTTSNPSX2LVS4MfVn9pgl5BK9MUVsMPfFjhmEhpNNt+rmaCelrDT8A7v/RoBY4IGEBFMhAkiwlI7pk3BrFoHEKtiijNLEWczdjMigZvhTs2amn8cUotFIsQSTpM7+7IX+m7clxfe6p59mVPjfMzBhwDG0GyV7CXdMpvsGlE2mPSacWWZ/baWIoFjKcmyQtTjSQleH1qSthI8rD5F7EyYd1Oa8Bo7vZ9j1/DPeGQRJPkebO81hPjm/1x1H5pTITIzARdNuBkM0yuDyqMQLP/u65WGinvXJYm20gEvMbiHGaT3il1QKKNEGmNGtY/SedRE8XQ58n090IBLz/3WJtjgQCY/SRgHUv7nMYYenmshvBfdue9kExJTjwWTRtT2R2UdkxS5UVye4vvDAY0DFuqX13wyvIeCU28MU+HpmnE31m9uXlVXXZxDuqGUBJ1PrDc4a40bvj9yTZTn9NEOs= root@localhost +ssh-dss 
AAAAB3NzaC1kc3MAAACBAKqzgDKUGk6P/h6N5X4nJoHPr+MQzzXkotN8XEihvtWwvV1KYK+ioT69nA7ThEAZ6rPEjWPt7X4Sy6BcNd4j3kzlaXYLkrMJm3nohqbqQBDxCv8bhozy6HS/VDu95vrpnNFSiMRCfbBye0zyKfZsuRaPaKfHQ+8MnsBqSPxKajFrAAAAFQDuG3ZoanC+oZwMRYZ/am0vhfD+EwAAAIEAixSzoZr03kYZE+LcusyrasvZIqKF3P4M2vtzvFBPpPccFB5XoaqhWI4PvSxGYxZxlj9vRmSc8Yv56jdn8oDPIhFfgZVbDIkvpB2jQdb5VaRVWj7XwUcHB7B117Dr9qA6+6HJtBLRTDdTXzMQ+NhdFp42XCF+1qRefH9VPL9FoxwAAACAJa+u/YvaiGwT0DXBtTz4PgyFYmNHPvXBOVhDAw0likajBiuOdn8oL4KuWTafCq5ReDxXFaMML0OuT86+lSVt2naX0idyHjuSPkgmatozlpcz0kWYhuBl1B1sa3kr8xjDOUJlxkybqpdGJ5aoW+kRO+bpJLEzuXtu6Xshw5fOBZw= root@localhost +ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBHI8u/wAvZLJqIpAd5YSpu9VEaRQOxy0FKzyryeb3kjahkryKPhSX65miZ9Lx7oz5nORFsdeS2xR56ZQj+8HpqM= root@localhost +ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDXY+MW1SikusLdkhPrni76LlaZB042p/DVItVeHRCCa root@localhost +</uvt:ssh_known_hosts> + </metadata> + <memory unit='KiB'>524288</memory> + <currentMemory unit='KiB'>524288</currentMemory> + <vcpu placement='static'>1</vcpu> + <resource> + <partition>/machine</partition> + </resource> + <os> + <type arch='s390x' machine='s390-ccw-virtio-9.0'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='custom' match='exact' check='partial'> + <model fallback='forbid'>z13.2-base</model> + <feature policy='require' name='aen'/> + <feature policy='require' name='aefsi'/> + <feature policy='require' name='diag318'/> + <feature policy='require' name='msa5'/> + <feature policy='require' name='msa4'/> + <feature policy='require' name='msa3'/> + <feature policy='require' name='msa2'/> + <feature policy='require' name='msa1'/> + <feature policy='require' name='sthyi'/> + <feature policy='require' name='edat'/> + <feature policy='require' name='ri'/> + <feature policy='require' name='edat2'/> + <feature policy='require' name='vx'/> + <feature policy='require' name='ipter'/> + <feature policy='require' name='cei'/> + <feature policy='require' name='ap'/> + <feature policy='require' 
name='gpereh'/> + <feature policy='require' name='esop'/> + <feature policy='require' name='ib'/> + <feature policy='require' name='siif'/> + <feature policy='require' name='ibs'/> + <feature policy='require' name='apqi'/> + <feature policy='require' name='apft'/> + <feature policy='require' name='els'/> + <feature policy='require' name='sief2'/> + <feature policy='require' name='apqci'/> + <feature policy='require' name='cte'/> + <feature policy='require' name='ais'/> + <feature policy='require' name='bpb'/> + <feature policy='require' name='64bscao'/> + <feature policy='require' name='ctop'/> + <feature policy='require' name='ppa15'/> + <feature policy='require' name='zpci'/> + <feature policy='require' name='sea_esop2'/> + <feature policy='require' name='te'/> + <feature policy='require' name='cmm'/> + <feature policy='require' name='gsls'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-s390x</emulator> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu.qcow' index='2'/> + <backingStore type='file' index='3'> + <format type='qcow2'/> + <source file='/var/lib/uvtool/libvirt/images/x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MjQuMTA6czM5MHggMjAyNDExMjY='/> + <backingStore/> + </backingStore> + <target dev='vda' bus='virtio'/> + <alias name='virtio-disk0'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/var/lib/uvtool/libvirt/images/kvmguest-oracular-upstream-cpu-ds.qcow' index='1'/> + <backingStore/> + <target dev='vdb' bus='virtio'/> + <alias name='virtio-disk1'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/> + </disk> + <controller type='pci' index='0' model='pci-root'> + <alias name='pci.0'/> + 
</controller> + <controller type='virtio-serial' index='0'> + <alias name='virtio-serial0'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0003'/> + </controller> + <interface type='network'> + <mac address='52:54:00:d8:f0:5c'/> + <source network='default' portid='8b9c05f0-9534-4e05-afff-ec73e4a55b9c' bridge='virbr0'/> + <target dev='vnet1'/> + <model type='virtio'/> + <alias name='net0'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0002'/> + </interface> + <console type='pty' tty='/dev/pts/3'> + <source path='/dev/pts/3'/> + <target type='sclp' port='0'/> + <alias name='console0'/> + </console> + <channel type='unix'> + <source mode='bind' path='/run/libvirt/qemu/channel/3-kvmguest-oracular-up/org.qemu.guest_agent.0'/> + <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/> + <alias name='channel0'/> + <address type='virtio-serial' controller='0' bus='0' port='1'/> + </channel> + <audio id='1' type='none'/> + <memballoon model='virtio'> + <alias name='balloon0'/> + <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0004'/> + </memballoon> + <panic model='s390'/> + </devices> + <seclabel type='dynamic' model='apparmor' relabel='yes'> + <label>libvirt-fa8bcf1a-8982-47ab-9766-ebbb695008e3</label> + <imagelabel>libvirt-fa8bcf1a-8982-47ab-9766-ebbb695008e3</imagelabel> + </seclabel> + <seclabel type='dynamic' model='dac' relabel='yes'> + <label>+64055:+993</label> + <imagelabel>+64055:+993</imagelabel> + </seclabel> +</domain> +``` diff --git a/results/classifier/zero-shot/105/semantic/2911 b/results/classifier/zero-shot/105/semantic/2911 new file mode 100644 index 000000000..be1a56050 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2911 @@ -0,0 +1,78 @@ +semantic: 0.839 +graphic: 0.815 +mistranslation: 0.815 +other: 0.783 +assembly: 0.754 +instruction: 0.734 +device: 0.727 +boot: 0.677 +network: 0.654 +vnc: 0.640 +KVM: 0.623 +socket: 0.609 + +G5/970 emulation not complete enough for OSX ? 
+Description of problem: +Leopard image that boots on G4 does not boot on G5 +Steps to reproduce: +1. Find preinstalled hdd image on Archive.org: MacOSLeopard.img +2. Try to boot it like above with -cpu 970, or 970FX +3. Observe early hang +Additional information: +``` +cpus[0] = 0x7f794b3e3040 0x7f794b3e5bc0 +cpus[1] = 0x7f794afe3ec0 0x7f794afe6a40 +Trying to write invalid spr 276 (0x114) at 00000000000b6634 +Trying to read invalid spr 277 (0x115) at 00000000000b6638 +Trying to read invalid spr 276 (0x114) at 00000000000b663c +Trying to write invalid spr 277 (0x115) at 00000000000b6658 +Trying to write invalid spr 276 (0x114) at 00000000000b665c +Trying to read invalid spr 276 (0x114) at 00000000000b6660 +Trying to write invalid spr 277 (0x115) at 00000000000b670c +Trying to write invalid spr 276 (0x114) at 00000000000b6710 +Trying to read invalid spr 276 (0x114) at 00000000000b6714 +invalid/unsupported opcode: 00 - 00 - 00 - 00 (00000000) 0000000000000000 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to write invalid spr 304 (0x130) at 0000000000003e14 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +Trying to read invalid spr 304 (0x130) at 0000000000003e38 +invalid/unsupported opcode: 00 - 00 - 00 - 00 (00000000) 
0000000000000008 + +last line repeats infinitely. +``` + +from my email to qemu-ppc list: + +SPR 304 was already in target/ppc/cpu_init.c + +but sadly after adding those it still dies early :( + +I looked at + +IBM PowerPC 970FX RISC Microprocessor 11.6 SCOM Facility + +but the whole thing is a bit more complex than a pair of regs. + +==== + +11.6.1 Processor Core SCOM SPR Access Each processor (core) has two special purpose registers (SPRs) used to access the SCOM interface: SCOMC and SCOMD. SCOMC and SCOMD are both 64-bit read/write SPRs and are used for SCOM Control and SCOM Data respectively. The interface is implemented as a direct connection to the parallel-to-serial converter, which handles the arbitration between the core and service processor. + +11.6.2 Operating System Protocol to Access SCOM SPRs In the PowerPC 970FX, SCOMC and SCOMD are complete operations. They do not require a software protocol in order to function properly except to disable external (asynchronous) interrupts. Software must check the error bits after performing an SCOMC to ensure that the command successfully completed. Table 11-14 Operating System Code to Access SCOM outlines a general software protocol for using these registers. 
+ +==== + +Low level asm init for OSX XNU kernel seems to live at + +https://github.com/apple-oss-distributions/xnu/blob/xnu-1228/osfmk/ppc/start.s diff --git a/results/classifier/zero-shot/105/semantic/2953 b/results/classifier/zero-shot/105/semantic/2953 new file mode 100644 index 000000000..4c1ae6fb4 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/2953 @@ -0,0 +1,79 @@ +semantic: 0.851 +graphic: 0.837 +KVM: 0.826 +device: 0.821 +vnc: 0.794 +other: 0.784 +assembly: 0.770 +instruction: 0.743 +boot: 0.693 +mistranslation: 0.682 +network: 0.670 +socket: 0.480 + +"DMAR: DRHD: handling fault status reg 2" with vfio on kernel 6.13.11-200.fc41.x86_64, works with 6.13.9-200.fc41.x86_64 +Description of problem: +Since kernel 6.13.11-200.fc41.x86_64, I cannot use VFIO to pass an NVIDIA GeForce GTX 1070 card to a Windows guest. The same setup works just fine in 6.13.9-200.fc41.x86_64. The issue symptoms are the same regardless if I use kernel command line arguments to isolate cpus or not. + +Symptoms: +- qemu logs show: +``` +2025-05-07T09:59:49.957891Z qemu-system-x86_64: vfio: Cannot reset device 0000:36:00.1, no available reset mechanism. +2025-05-07T09:59:49.958444Z qemu-system-x86_64: vfio: Cannot reset device 0000:36:00.0, no available reset mechanism. +2025-05-07T09:59:49.959119Z qemu-system-x86_64: vfio: Cannot reset device 0000:36:00.1, no available reset mechanism. +2025-05-07T09:59:49.959635Z qemu-system-x86_64: vfio: Cannot reset device 0000:36:00.0, no available reset mechanism. +``` +- in dmesg I see: +``` +kernel: DMAR: DRHD: handling fault status reg 2 +kernel: DMAR: [INTR-REMAP] Request device [36:00.0] fault index 0x50 [fault reason 0x22] Present field in the IRTE entry is clear +``` +- the VM hangs at boot (please see the notes below (*)). +Steps to reproduce: +Boot the same libvirt domain in kernel 6.13.9-200.fc41.x86_64 (works) and any other more recent kernel (>= 6.13.11-200.fc41.x86_64). 
+Additional information: +(*) Note that in a working kernel, the boot process is in any case finicky, and it shows these phases: +1. tianocore logo shows, and one single cpu is fully utilized by the guest +2. slowly, the loader find the Windows bootloader, and prints a message that it is loading and running it +3. some time passes, while cpus seem idle +4. finally the spinning wheel of the Windows bootloader appears + +Phase 1-3 can take anywhere from 0 to 60 seconds, in an apparently random manner. + +When running on the faulty kernels, it seems that the virtual machine gets stuck in phase 1, and I must use `virsh destroy` to interrupt it. + +lspci output: +``` +-[0000:00]-+-00.0 Intel Corporation Tiger Lake-UP3/H35 4 cores Host Bridge/DRAM Registers + +-02.0 Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] + +-04.0 Intel Corporation TigerLake-LP Dynamic Tuning Processor Participant + +-06.0-[01]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 + +-07.0-[02-33]-- + +-0a.0 Intel Corporation Tigerlake Telemetry Aggregator Driver + +-0d.0 Intel Corporation Tiger Lake-LP Thunderbolt 4 USB Controller + +-0d.2 Intel Corporation Tiger Lake-LP Thunderbolt 4 NHI #0 + +-14.0 Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller + +-14.2 Intel Corporation Tiger Lake-LP Shared SRAM + +-15.0 Intel Corporation Tiger Lake-LP Serial IO I2C Controller #0 + +-15.1 Intel Corporation Tiger Lake-LP Serial IO I2C Controller #1 + +-15.2 Intel Corporation Tiger Lake-LP Serial IO I2C Controller #2 + +-16.0 Intel Corporation Tiger Lake-LP Management Engine Interface + +-1c.0-[34]----00.0 Intel Corporation Wi-Fi 6 AX200 + +-1c.5-[35]----00.0 Realtek Semiconductor Co., Ltd. 
RTS522A PCI Express Card Reader + +-1d.0-[36]--+-00.0 NVIDIA Corporation GP104 [GeForce GTX 1070] + | \-00.1 NVIDIA Corporation GP104 High Definition Audio Controller + +-1f.0 Intel Corporation Tiger Lake-LP LPC Controller + +-1f.3 Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller + +-1f.4 Intel Corporation Tiger Lake-LP SMBus Controller + \-1f.5 Intel Corporation Tiger Lake-LP SPI Controller +``` + +kernel command line arguments (optimized with cpu isolation): +``` +intel_pstate=per_cpu_perf_limits rd.driver.blacklist=nouveau modprobe.blacklist=nouveau module_blacklist=nouveau default_hugepagesz=1G hugepagesz=1G hugepages=13 i2c_i801.disable_features=0x10 rd.driver.pre=vfio_pci,vfio,vfio_iommu_type1 vfio-pci.ids=10de:1b81,10de:10f0 modprobe.blacklist=xpad systemd.unit=multi-user.target systemd.wants=bluetooth.service isolcpus=domain,managed_irq,1-3,5-7 rcu_nocbs=1-3,5-7 irqaffinity=0,4 nospectre_v2 +``` + +kernel command line arguments (without cpu isolation, same symptoms): +``` +intel_pstate=per_cpu_perf_limits rd.driver.blacklist=nouveau modprobe.blacklist=nouveau module_blacklist=nouveau default_hugepagesz=1G hugepagesz=1G hugepages=13 rd.driver.pre=vfio_pci,vfio,vfio_iommu_type1 vfio-pci.ids=10de:1b81,10de:10f0 modprobe.blacklist=xpad systemd.unit=multi-user.target systemd.wants=bluetooth.service +``` diff --git a/results/classifier/zero-shot/105/semantic/304636 b/results/classifier/zero-shot/105/semantic/304636 new file mode 100644 index 000000000..8659929aa --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/304636 @@ -0,0 +1,103 @@ +semantic: 0.870 +other: 0.853 +assembly: 0.839 +graphic: 0.837 +network: 0.833 +device: 0.831 +instruction: 0.826 +mistranslation: 0.801 +socket: 0.759 +vnc: 0.743 +KVM: 0.731 +boot: 0.695 + +-hda FAT:. 
limited to 504MBytes + +Binary package hint: qemu + +The size of the virtual FAT file system (for sharing a particular directory with Guest OS) is hard-coded to be limited to 504MBytes, in block-vvfat.c +-- +/* 504MB disk*/ +bs->cyls=1024; bs->heads=16; bs->secs=63; +-- + +If the directory contents exceeds this is stops with an assert +-- +qemu: block-vvfat.c:97: array_get: Assertion `index < array->next' failed. +Aborted +-- + +Also the FAT16 mode (default) only uses 8KByte cluster sizes which prevents the above being increased. 16KByte and 32KByte sectors can be selected with the following patch +-- +--- block-vvfat.c_orig 2008-12-02 12:37:28.000000000 -0700 ++++ block-vvfat.c 2008-12-02 19:50:35.000000000 -0700 +@@ -1042,6 +1042,12 @@ + s->fat_type = 32; + } else if (strstr(dirname, ":16:")) { + s->fat_type = 16; ++ } else if (strstr(dirname, ":16-16K:")) { ++ s->fat_type = 16; ++ s->sectors_per_cluster=0x20; ++ } else if (strstr(dirname, ":16-32K:")) { ++ s->fat_type = 16; ++ s->sectors_per_cluster=0x40; + } else if (strstr(dirname, ":12:")) { + s->fat_type = 12; + s->sector_count=2880; +-- + +Cheers, +Mungewell + +please send your patch to upstream at <email address hidden>, the vvfat code in qemu is fragile and should be carefully reviewed. + +the path ever came? i'm still having this problem, i can't run qemu on windows because in the fat: partition there are files bigger then 500 mb + +Please submit the patch upstream + +Linked against upstream, confirmed and wishlist. Ubuntu should get this when upstream takes the patch. + +Thanks! +:-Dustin + +Thank YOU, dustin! +What's next? I don't understand, should i do something else or i can just wait for the fix? somebody has to send the patch ? +Another thing: why wishlist? this is clearly a bug, and a quite serious one: you just can't give a fat bigger than 500 mb to qemu. i can't use qemu since years, because of this... +Thanks + +Hello. 
+ +I am using qemu-5.2.0 in Windows with operating system Minix 3.1.2a and vfat fails to write files larger than 4096 bytes. Reads work well. This is not a problem in Minix 3.1.2a, because in the Bochs emulator reads and writes of files of any size work well. + +I also consider this to be a major bug as it prevents communication of information between the guest OS and the host. I have over 300 students complaining about this bug present in qemu. + +Thanks. + + + +Hi Pedro, + +Sorry to hear of your difficulty, but given the age of this bug report, I'd strongly urge you to file a new bug report. Since this was last looked at over 10 years ago, it's extremely likely your issue is completely unrelated to the originally reported one. + +Here are a couple pages on how to write effective bug reports, that I'd encourage reading to ensure your report is actionable and can (hopefully) get resolved expediently: + + * https://help.ubuntu.com/community/ReportingBugs + * https://ubuntu.com/server/docs/reporting-bugs + +A few other tips specific to qemu (per the upstream bug tracker): + + * Include the QEMU release version or the git commit hash into the description, so that it is later still clear in which version you have found the bug. Reports against the latest release or even the latest development tree are usually acted upon faster. + * Include the full command line used to launch the QEMU guest. + * Reproduce the problem directly with a QEMU command-line. Avoid frontends and management stacks, to ensure that the bug is in QEMU itself and not in a frontend. + * Include information about the host and guest (operating system, version, 32/64-bit). + + +Pedro, +please also note that the vvfat driver is in general in a bad state and more or less completely unmaintained. I can only strongly recommend to *not* use it in production. If you have to share files between guest and host, please use something more modern like virtio-fs (or maybe virtio-9p) instead. 
+ +If you need OS portability then the "usb-mtp" device is also an option for adhoc file sharing. + +This bug report has been moved to QEMU's new bug tracker on gitlab.com and thus gets now closed in Launchpad. Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/66 + diff --git a/results/classifier/zero-shot/105/semantic/369 b/results/classifier/zero-shot/105/semantic/369 new file mode 100644 index 000000000..9e83203eb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/369 @@ -0,0 +1,14 @@ +semantic: 0.420 +mistranslation: 0.362 +other: 0.270 +graphic: 0.176 +KVM: 0.159 +vnc: 0.127 +boot: 0.099 +socket: 0.097 +device: 0.085 +assembly: 0.058 +network: 0.054 +instruction: 0.029 + +Remove leading underscores from #defines diff --git a/results/classifier/zero-shot/105/semantic/490484 b/results/classifier/zero-shot/105/semantic/490484 new file mode 100644 index 000000000..4538fe5eb --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/490484 @@ -0,0 +1,76 @@ +semantic: 0.914 +graphic: 0.910 +device: 0.886 +other: 0.886 +KVM: 0.868 +boot: 0.859 +mistranslation: 0.854 +instruction: 0.850 +vnc: 0.833 +assembly: 0.812 +network: 0.789 +socket: 0.715 + +running 64bit client in 64bit host with intel crashes + +Binary package hint: qemu-kvm + +running windows 7 VM halts on early boot with + +kvm: unhandled exit 80000021 +kvm_run returned -22 + +ProblemType: Bug +Architecture: amd64 +Date: Mon Nov 30 21:28:54 2009 +DistroRelease: Ubuntu 9.10 +KvmCmdLine: Error: command ['ps', '-C', 'kvm', '-F'] failed with exit code 1: UID PID PPID C SZ RSS PSR STIME TTY TIME CMD +MachineType: System manufacturer P5Q-PRO +NonfreeKernelModules: fglrx +Package: kvm (not installed) +ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.31-14-generic root=UUID=17a8e181-fac7-461e-8cad-8aea97be2536 ro quiet splash +ProcEnviron: + LANGUAGE=en_US:en + PATH=(custom, user) + LANG=en_US.UTF-8 + SHELL=/bin/bash +ProcVersionSignature: Ubuntu 2.6.31-14.48-generic 
+SourcePackage: qemu-kvm +Uname: Linux 2.6.31-14-generic x86_64 +dmi.bios.date: 07/10/2008 +dmi.bios.vendor: American Megatrends Inc. +dmi.bios.version: 1004 +dmi.board.asset.tag: To Be Filled By O.E.M. +dmi.board.name: P5Q-PRO +dmi.board.vendor: ASUSTeK Computer INC. +dmi.board.version: Rev 1.xx +dmi.chassis.asset.tag: Asset-1234567890 +dmi.chassis.type: 3 +dmi.chassis.vendor: Chassis Manufacture +dmi.chassis.version: Chassis Version +dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1004:bd07/10/2008:svnSystemmanufacturer:pnP5Q-PRO:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5Q-PRO:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion: +dmi.product.name: P5Q-PRO +dmi.product.version: System Version +dmi.sys.vendor: System manufacturer + + + +Thanks for the information. + +regards +chuck + +Hey Chuck- + +You marked this confirmed... Are you able to reproduce this? + +Hi Sarunas- + +Were you able to install windows7 and just the reboot failed? Or are you using a windows7 image that was installed elsewhere (or otherwise)? + +Anthony, any idea of the state of 64bit Windows7 on a 64bit QEMU host? + +I was able to install windows7 and just the reboot failed. It all works in VirtualBox OSE though. + +Looks like the install failed to succeed and there was not an MBR written. 
+ diff --git a/results/classifier/zero-shot/105/semantic/526653 b/results/classifier/zero-shot/105/semantic/526653 new file mode 100644 index 000000000..6b88698f1 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/526653 @@ -0,0 +1,57 @@ +semantic: 0.282 +instruction: 0.259 +device: 0.239 +graphic: 0.177 +mistranslation: 0.134 +assembly: 0.093 +socket: 0.089 +other: 0.088 +network: 0.077 +vnc: 0.074 +boot: 0.067 +KVM: 0.066 + +Breakpoint on Memory address fails with KVM + +Using QEMU version 0.12.50 under Ubuntu Karmic x64 + +To reproduce the error using a floppy with a bootloader: +qemu-system-x86_64 -s -S -fda floppy.img -boot a -enable-kvm + +connect with gdb: +(gdb) set arch i8086 +The target architecture is assumed to be i8086 +(gdb) target remote localhost:1234 +Remote debugging using localhost:1234 +0x0000fff0 in ?? () +(gdb) break *0x7c00 +Breakpoint 1 at 0x7c00 +(gdb) continue +Continuing. + +The breakpoint is not hit. + +If you close qemu and start it without kvm support: + +qemu-system-x86_64 -s -S -fda floppy.img -boot a + +(gdb) set arch i8086 +The target architecture is assumed to be i8086 +(gdb) target remote localhost:1234 +Remote debugging using localhost:1234 +0x0000fff0 in ?? () +(gdb) break *0x7c00 +Breakpoint 1 at 0x7c00 +(gdb) continue +Continuing. + +Breakpoint 1, 0x00007c00 in ?? () +(gdb) + +The breakpoint is hit. If you wait until after the bootloader has been loaded into memory, you can properly set breakpoints with or without kvm enabled. + +Triaging old bug tickets ... can you still reproduce this issue with the +latest version of QEMU (currently version 2.8)? + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/568614 b/results/classifier/zero-shot/105/semantic/568614 new file mode 100644 index 000000000..b56341c17 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/568614 @@ -0,0 +1,100 @@ +semantic: 0.910 +other: 0.906 +instruction: 0.877 +mistranslation: 0.859 +socket: 0.853 +device: 0.847 +assembly: 0.840 +KVM: 0.840 +graphic: 0.827 +vnc: 0.799 +network: 0.732 +boot: 0.728 + +x86_64 host curses interface: spacing/garbling + +Environment: +Arch Linux x86_64, kernel 2.6.33, qemu 0.12.3 + +Steps to reproduce: +1. Have a host system running 64-bit Linux. +2. Start a qemu VM with the -curses flag. + +Expected results: +Text displayed looks as it would on a real text-mode display, and VM is therefore usable. + +Actual results: +Text displayed contains an extra space between characters, causing text to flow off the right and bottom sides of the screen. This makes the curses interface unintelligible. + +The attached patch fixes this problem on 0.12.3 on my installation without changing behavior on a 32-bit machine. I don't know enough of the semantics of console_ch_t to know if this is the "correct" fix or if there should be, say, an extra cast somewhere instead. + + + +Just compiled qemu from git to check for bug, and it is still present. + +This patch should address the problem. I've submitted it to qemu-devel. + +Perhaps you have found an endianness bug as well, but my issue is with word size (my host is a 64-bit Intel). I manually applied the patch to a 0.12.3 build, and I am still seeing the problem with the curses console. + +It seems that the console_ch_t type is used in a number of different contexts. The console.c and console.h files use console_ch_t fairly uniformly. However, the `screen' array in curses.c is cast to both uint8_t* and chtype* (unsigned int* according to my curses library) and passed as console_ch_t* (unsigned long*) to vga_hw_text_update. 
That's three different bit-sizes on my machine... is it possible that the problem lies here? + +Hi Devin, + +Can you test if this bug still exists with 0.14 (or better still, a git build)? + +Brad + + +I just compiled from git and the problem persists. + +I will reiterate that the issue appears to be with the word size of the types used, not with endianness; see comment 4. I have not dug any further into the QEMU code to see if I find a more "correct-looking" solution than the quick patch I have attached to this bug. + +any news on that bug? + +If I remember correctly, the type is unsigned long because it needs to match "chtype" as declared in curses.h. On some implementations of curses it may be declared differently, we really should use the "chtype" type directly but console.h is also used when the use of curses was disabled in qemu config. + +I'm pretty sure the curses support has been tested on machines of both endianness at one point, but mostly with ncurses only. + +I just checked out the ncurses source - it looks like the type of "chtype" actually depends on how ncurses is configured at build time: + +curses.h.in: +#if @cf_cv_enable_lp64@ && defined(_LP64) +typedef unsigned chtype; +typedef unsigned mmask_t; +#else +typedef unsigned @cf_cv_typeof_chtype@ chtype; +typedef unsigned @cf_cv_typeof_mmask_t@ mmask_t; +#endif + +So even if Qemu targets only ncurses, we can't assume that chtype is anything in particular. + +In light of that, I guess this bug boils down to "let's use chtype directly," which (naively) seems like it could be #ifdef'd pretty easily. + +This is probably the source of the problem. As you say it'd be best to use chtype directly if it can be done cleanly, unfortunately it looks like it'll add a curses specific snippet in console.h, but so be it. The only other option is to add a conversion step in curses.c and give up zero-copy passing screen data to curses, which I'd like to avoid. + +How about using CONFIG_CURSES? 
If we --enable-curses, then we know we have curses.h and chtype. If we --disable-curses, then we don't care. + +Patch attached. It's a no-op if curses is disabled, and it fixes the issue on my machine with curses enabled. I had to undef the color constants from curses.h so we didn't conflict with enum color_names in console.c. + +for which version is the last patch? +also 12.3? + +Pretty sure it was the latest Arch package: 0.14.1. Did you have trouble applying it? + +I can run back and make a patch against git if you can't use this. + +no I don't believe it was your fault +I couldn't get the code compile even without your patch... + +man this sucks... i had hoped it would be upstream with 0.15 but I might have to try to compile it by myself again + +thanks ;) + +Alright, I've sent a patch to qemu-devel. Let's see what happens now... + +ahhhh it worked, just tried it with latest stable 0.15 git !!! finally you are my hero! =) + +Fix had been included here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=df00bed0fa30a6f5712 +... so closing this bug ticket now. 
+ diff --git a/results/classifier/zero-shot/105/semantic/600 b/results/classifier/zero-shot/105/semantic/600 new file mode 100644 index 000000000..88b161b92 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/600 @@ -0,0 +1,14 @@ +semantic: 0.598 +instruction: 0.555 +other: 0.486 +device: 0.483 +network: 0.415 +boot: 0.332 +graphic: 0.236 +mistranslation: 0.233 +vnc: 0.159 +socket: 0.115 +assembly: 0.113 +KVM: 0.111 + +Have 'info mtree' accept an (optional) 'name' parameter to pick a specific address space diff --git a/results/classifier/zero-shot/105/semantic/639651 b/results/classifier/zero-shot/105/semantic/639651 new file mode 100644 index 000000000..f3ca85a61 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/639651 @@ -0,0 +1,256 @@ +semantic: 0.904 +device: 0.897 +assembly: 0.896 +other: 0.895 +instruction: 0.873 +graphic: 0.870 +boot: 0.868 +vnc: 0.857 +socket: 0.849 +KVM: 0.816 +mistranslation: 0.808 +network: 0.804 + +DRIVER_IRQL_NOT_LESS_OR_EQUAL booting WIndows XP with Synaptics driver installed + +Posting the issue here since I did not get any reply on the ML. + +I was trying to update some Windows XP (SP3) images in kvm. + +It worked fine several times but last time I added mass storage +drivers to sysprep and now on the second boot after reseal (the first +is mini-setup) I get a BSOD with message +DRIVER_IRQL_NOT_LESS_OR_EQUAL. I can post the screenshot if somebody +thinks it is interesting enough. + +The same image works on hardware (which has controllers different from +the qemu PIIX3) and in VirtualBox (with the default PIIX4 as well as +PIIX3, so long as IO APIC is enabled). + +I am not sure if this is an error with the MS drivers or how they are +used in sysprep in this particular case or if this is some strange +error in qemu emulation in the PIIX3 controller or elsewhere. + +The image is originally created on hardware with MP acpi (not virtualization). 
+ + + +Note that most of the section is generated by sysprep (the drivers referencing C:\windows), only small part at the end is generated by my script for additional drivers (references C:\drivers). + +The previous images that did not have this section worked only on controllers compatible with the MS generic PCI IDE driver (and also on KVM). + +Just to be sure, you are not using the virtio-blk driver for Windows here? + +I have seen similar crashes with the older version of virtio-blk when used on +recent versions of KVM. + + +On 7 October 2010 11:05, Jes Sorensen <email address hidden> wrote: +> Just to be sure, you are not using the virtio-blk driver for Windows +> here? +> +> I have seen similar crashes with the older version of virtio-blk when used on +> recent versions of KVM. +> +> -- +> DRIVER_IRQL_NOT_LESS_OR_EQUAL booting WIndows XP with Synaptics driver installed +> https://bugs.launchpad.net/bugs/639651 +> You received this bug notification because you are a member of qemu- +> devel-ml, which is subscribed to QEMU. +> +> Status in QEMU: New +> Status in Debian GNU/Linux: New +> +> Bug description: +> Positng the issue here since I did not get any reply on the ML. +> +> I was trying to update some windows XP (SP3) images in kvm. +> +> It worked fine several times but last time I added mass storage +> drivers to sysprep and now on the second boot after reseal (the first +> is mini-setup) I get a BSOD with message +> DRIVER_IRQL_NOT_LESS_OR_EQUAL. +> +> It turns out that the error is unrelated to storage drivers. It is triggered by Synaptics driver installing for the PS2 mouse in kvm (which does not happen in VirtualBox or on real hardware). +> +> The image is originally created on hardware with MP acpi (not virtualization). +> +> qemu-kvm 0.12.5+dfsg-2 +> +Actually the issue is caused by the Synaptics touchpad driver binding +to the PS/2 mouse device in qemu. 
+ +I have no idea how PS/2 devices are detected but the one present in +qemu is misdetected as a synaptics touchapd by the Synaptics driver +for Windows. + +As a workaround I have patched my qemu to not enable the PS/2 mouse device. + +Thanks + +Michal + +On 10/07/10 11:51, Michal Suchanek wrote: +> Actually the issue is caused by the Synaptics touchpad driver binding +> to the PS/2 mouse device in qemu. +> +> I have no idea how PS/2 devices are detected but the one present in +> qemu is misdetected as a synaptics touchapd by the Synaptics driver +> for Windows. +> +> As a workaround I have patched my qemu to not enable the PS/2 mouse device. + +Hi Michal, + +If you have the time to look, it would be interesting to see what +actually goes over the wire to/from the PS/2 driver in QEMU just prior +to the crash. It would be good to get this fixed. + +Cheers, +Jes + +I have no idea how to log the data. + +I looked at the qemu man page but it does not even mention the PS/2 +mouse as a chardev nor offers an option to log traffic of chardevs +without attaching them to a file and thus detaching them from the +emulated device. + +Thanks + +Michal + +On 10/07/10 12:17, Michal Suchanek wrote: +> I have no idea how to log the data. +> +> I looked at the qemu man page but it does not even mention the PS/2 +> mouse as a chardev nor offers an option to log traffic of chardevs +> without attaching them to a file and thus detaching them from the +> emulated device. + +I would attack it by hacking hw/ps.c a bit to monitor the transactions +in ps2_queue() and ps2_read_data(). + +Cheers, +Jes + + +Attaching logged PS/2 port communication. + +For reference attaching the patch used for obtaining the log. + +I guess the problem here is that qemu emulates some very basic mouse (reported as PS/2 compatible mouse in Windows) whereas real hardware usually has a fancy mouse (reported as MS Explorer compatible mouse in Windows). 
+ +I don't know how PS/2 mice are detected but due to different mice being detected differently there is obviously some detection mechanism. + +The Windos device IDs in a Touchpad driver that does not exhibit the problem (synaptics v8) + +[SynMfg] +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010D +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010E +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010F +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0110 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0111 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0112 +%PS2.SynDeviceDesc% = HP__0100__PS2_Inst,*SYN0113 +%PS2.SynDeviceDesc% = HP__0100__PS2_Inst,*SYN0114 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0115 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0116 +%PS2.SynDeviceDesc% = HP_GROUP1_PS2_Inst,*SYN0117 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0118 +%PS2.SynDeviceDesc% = HP_GROUP5_PS2_Inst,*SYN0119 +%PS2.SynDeviceDesc% = HP_GROUP6_PS2_Inst,*SYN011A +%PS2.SynDeviceDesc% = HP_GROUP4_PS2_Inst,*SYN011B +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN011C +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN011D +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN011E +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN011F +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0120 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0121 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0122 +%PS2.SynDeviceDesc% = HP_GROUP7_PS2_Inst,*SYN0123 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0124 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0125 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0126 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0127 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0128 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0129 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN012A +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN012B +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN012C + +and in drivers that do exhibit the problem (note the first device): + +Synaptics v9 +[SynMfg] +%PS2.SynDeviceDesc% = 
HP__0100__PS2_Inst,*PNP0F13 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010D +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010E +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN010F +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0110 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0111 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0112 +%PS2.SynDeviceDesc% = HP__0100__PS2_Inst,*SYN0113 +%PS2.SynDeviceDesc% = HP__0100__PS2_Inst,*SYN0114 +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0115 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0116 +%PS2.SynDeviceDesc% = HP_GROUP1_PS2_Inst,*SYN0117 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0118 +%PS2.SynDeviceDesc% = HP_GROUP5_PS2_Inst,*SYN0119 +%PS2.SynDeviceDesc% = HP_GROUP6_PS2_Inst,*SYN011A +%PS2.SynDeviceDesc% = HP_GROUP4_PS2_Inst,*SYN011B +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN011C +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN011D +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN011E +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN011F +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0120 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0121 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0122 +%PS2.SynDeviceDesc% = HP_GROUP7_PS2_Inst,*SYN0123 +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0124 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0125 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0126 +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0127 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0128 +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0129 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN012A +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN012B +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN012C +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN012D +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN012E +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN012F +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0130 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0131 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0132 +%PS2.SynDeviceDesc% = 
HP_GROUP2_PS2_Inst,*SYN0133 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0134 +%PS2.SynDeviceDesc% = HP_GROUP10_PS2_Inst,*SYN0135 +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN0136 +%PS2.SynDeviceDesc% = HP_GROUP3_PS2_Inst,*SYN0137 +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN0138 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN0139 +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN013A +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN013B +%PS2.SynDeviceDesc% = HP_GROUP9_PS2_Inst,*SYN013C +%PS2.SynDeviceDesc% = HP_GROUP8_PS2_Inst,*SYN013D +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN013E +%PS2.SynDeviceDesc% = HP_GROUP2_PS2_Inst,*SYN013F + +Syanptics v14 +[SynMfg] +%PS2.DeviceDesc% = DEFAULT__0000__PS2_Inst,*PNP0F13,*PNP0F0E,*PNP0F03,*PNP0F12 ; Std PS/2 mouse +%PS2.SynDeviceDesc% = DEFAULT__0000__PS2_Inst,*SYN0002 ; Synaptics PS2 TouchPad + +Either of the synaptics drivers v9 and v14 would bind to the qemu mouse and crash the system when installed. + +QEMU 0.12 is pretty much outdated nowadays ... can you still reproduce this problem with the latest version of QEMU, or can we close this ticket nowadays? + +Does qemu allow to disable the PS/2 port now? + +If so then there is easy workaround in case something similar ever happens. + + +Hmm, I'm not aware of a way to disable the PS2 mouse in QEMU yet. Looking at your other related bug (https://bugs.launchpad.net/qemu/+bug/813546), I think you might need to discuss that patch on the qemu-devel mailing list. + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/zero-shot/105/semantic/645662 b/results/classifier/zero-shot/105/semantic/645662 new file mode 100644 index 000000000..838de0f0d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/645662 @@ -0,0 +1,620 @@ +semantic: 0.692 +graphic: 0.690 +assembly: 0.683 +other: 0.659 +network: 0.626 +instruction: 0.621 +device: 0.589 +vnc: 0.488 +socket: 0.486 +boot: 0.428 +mistranslation: 0.416 +KVM: 0.319 + +QEMU x87 emulation of trig and other complex ops is only at 64-bit precision, not 80-bit + +When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. + +Regression testing errors: + +test_cmath +test test_cmath failed -- Traceback (most recent call last): + File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in + self.fail(error_message) +AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +Expected: complex(3.141592653589793, -2.1073424255447014e-08) +Received: complex(3.141592653589793, -2.1073424338879928e-08) +Received value insufficiently close to expected value. + + +test_float +test test_float failed -- Traceback (most recent call last): + File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in + self.assertEqual(s, repr(float(s))) +AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' + + +test_math +test test_math failed -- multiple errors occurred; run in verbose mode for deta + +=> + +runtests.sh -v test_math + +le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +test_math BAD + 1 BAD + 0 GOOD + 0 SKIPPED + 1 total +le01:~/tools/python3/Python-3.1.2# + +Just found time to look a bit deeper into this. 
The problem is in the (real) asinh() function. +I think it is possible that somewhere a calculation was done in float instead of double or a double value was shortened to float. +Please note that until this is fixed, qemu i386 is not IEEE754 conform. + +Minimal test code in c: +--- +#include <stdio.h> +#include <math.h> + +int main(){ /* compile with gcc -lm test.c */ + double x, y; + x = -2.1073424255447017e-08; + y = asinh(x); + printf("y = %20.16e\n",y); + printf("should be -2.1073424255447017e-08\n"); +} + +--- + + +Forgot so say: This is still present in 0.13.0. + +And I unexpectedly had more time to dig. + +Findings: + + - asinh() is not in IEEE754, please ignore my comment about + non-conformity, sorry. + - The calculation for asinh() is pretty badly conditioned, i.e. + it blows up errors in the basic calculations. + - asinh() is implemented in glibc in assembler on the FPU stack. + This would mean 80 bits float representation. + +Error observations: + + Calculations for asinh( -2.1073424255447017e-08), derived from + the Python 3.1.2 selftest. Formula is asinh(x) = log(x+sqrt(x*x+1)) + Reference is a calculation with long double (128 bit) + + - qemu: 6 correct digits behind the dot + - C with double: 7 correct digits behind the dot + - i387: 10 correct digits behind the dot + +Possible root cause: + + The observed error may be due to qemu using 64 bit double math in + the FPU implementation, instead of the 80 bit internal precision + a real i387 uses. + +Comparison with kernel i387 emulation + + As an additional verification, I tried the calculations using the + Linux kernel coprocessor emulation. This basically failed. With + kernel 2_6_26 and 2_6_27 and the ''no387'' flag, the kernel locked + up without output early during boot. With 2.6.34, I could not do + an ssh login. I may try this again later, and especially try + to find out what is wrong with qemu and 2.6.34. 
+ +Consequences: + + 1) qemu is significantly less accurate for some mathematics than a + genuine i387. This may cause problesm with sensitive mathematics + developed on a real i387. + 2) qemu can be fingerprinted easily using this problem. This may + have security implications. + +Fix Options: + + 1) - Moving the FPU internal representation to long double. It will + still not be the same results as on an i387, but it will be + _more_ accurate instead of less, which generally is far less + of a problem. + + Pro: Higher accuracy + Cons: Slower + + - Use code derived from the Linux kernel i387 FPU emulator. + The emulator is not bit-exact, but relatively close and it + is used at least on some small x86 architectures. + + Pro: Higher accuracy + Cons: Need to integrate foreign code + + - Leave it as it is but add a clear warning to the "Known qemu + Problems" Section in the manual and Wiki (I did not find one, I + think it should be added in an easily findable place), that + i387 FPU emulation is only 64 bits internally and may be less + exact for combined calculations done on the FPU stack, such + as asinh() in glibc(), than a real i387. + + You may recommend to use kernel FPU emulation, but + this may not work or be a pain to get working. + + Pro: Least work + Cons: Not really a solution + +2) Fixing this is difficult and there are other ways to find + out that you are running in qemu anyways. I think the security + implications are insiginficant in most cases. + + + + + + + +Can you still reproduce this problem with the latest version of QEMU? As far as I know, the FPU emulation has been completely switched to softfloat with the latest versions, so I assume your problem might be fixed in recent releases... + +Hi, + +I just tried for about 2 hours without success or getting any useful +diagnostic to compile and run current qemu 2.8. No console, +no window opens, -curses does not work in .configure, despite +ncurses-dev being installed, etc. 
+ +Quite frankly, as I have not used qemu for several years now, +I cannot be bothered to fight through an apparently mostly +broken build and install process. I did include two diagnostic +options in my bug-report, please try them out yourself. + +Regards, +Arno + +On Thu, Jan 19, 2017 at 09:34:56 CET, Thomas Huth wrote: +> Can you still reproduce this problem with the latest version of QEMU? As +> far as I know, the FPU emulation has been completely switched to +> softfloat with the latest versions, so I assume your problem might be +> fixed in recent releases... +> +> ** Changed in: qemu +> Status: New => Incomplete +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/645662 +> +> Title: +> Python 3.1.2 math errors with Qemu 0.12.5 +> +> Status in QEMU: +> Incomplete +> +> Bug description: +> When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +> gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +> 3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +> on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. +> +> Regression testing errors: +> +> test_cmath +> test test_cmath failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in +> self.fail(error_message) +> AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +> Expected: complex(3.141592653589793, -2.1073424255447014e-08) +> Received: complex(3.141592653589793, -2.1073424338879928e-08) +> Received value insufficiently close to expected value. 
+> +> +> test_float +> test test_float failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in +> self.assertEqual(s, repr(float(s))) +> AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' +> +> +> test_math +> test test_math failed -- multiple errors occurred; run in verbose mode for deta +> +> => +> +> runtests.sh -v test_math +> +> le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +> test_math BAD +> 1 BAD +> 0 GOOD +> 0 SKIPPED +> 1 total +> le01:~/tools/python3/Python-3.1.2# +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/645662/+subscriptions + +-- +Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: <email address hidden> +GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718 +---- +A good decision is based on knowledge and not on numbers. -- Plato + +If it's in the news, don't worry about it. The very definition of +"news" is "something that hardly ever happens." -- Bruce Schneier + + +Looks like your test code from comment #1 still prints out a wrong value, so the bug has apparently not been fixed by the FPU updates... + +I had a brief look at softfloat. In principle, it should fix the +issue, but only if the FPU uses 80-bit double-extended-precision +internally. I guess the qemu FPU is still stuck at 64 bit double +internally and that does not cut it for some calculations. + +Just to be sure, I re-tested the code from comment 1 and it does +work as expected with a real FPU. + +Regards, +Arno + + +On Mon, Jan 23, 2017 at 22:20:18 CET, Thomas Huth wrote: +> Looks like your test code from comment #1 still prints out a wrong +> value, so the bug has apparently not been fixed by the FPU updates... +> +> ** Changed in: qemu +> Status: Incomplete => Triaged +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. 
+> https://bugs.launchpad.net/bugs/645662 +> +> Title: +> Python 3.1.2 math errors with Qemu 0.12.5 +> +> Status in QEMU: +> Triaged +> +> Bug description: +> When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +> gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +> 3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +> on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. +> +> Regression testing errors: +> +> test_cmath +> test test_cmath failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in +> self.fail(error_message) +> AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +> Expected: complex(3.141592653589793, -2.1073424255447014e-08) +> Received: complex(3.141592653589793, -2.1073424338879928e-08) +> Received value insufficiently close to expected value. +> +> +> test_float +> test test_float failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in +> self.assertEqual(s, repr(float(s))) +> AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' +> +> +> test_math +> test test_math failed -- multiple errors occurred; run in verbose mode for deta +> +> => +> +> runtests.sh -v test_math +> +> le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +> test_math BAD +> 1 BAD +> 0 GOOD +> 0 SKIPPED +> 1 total +> le01:~/tools/python3/Python-3.1.2# +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/645662/+subscriptions + +-- +Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: <email address hidden> +GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718 +---- +A good decision is based on knowledge and not on numbers. 
-- Plato + +If it's in the news, don't worry about it. The very definition of +"news" is "something that hardly ever happens." -- Bruce Schneier + + +softfloat deals with doing the basic IEEE ops (add, subtract, multiply, etc) at the correct 80 bit precision, but it doesn't provide implementations of the x87's more complex operations (sin, cos, log, etc), so QEMU is still doing those by converting from 80 bit to host double and using the host C math library routines. Fixing this would require writing hand-coded routines for all these operations, which is quite a tricky bit of work. (Sadly we can't just use the bochs implementations, because they're under the softfloat2b license which isn't GPL2 compatible.) + + +Laurent Vivier recently started to adapt some of function from the "Previous" NeXT emulator for m68k, see e.g. this patch series here: + + https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg04454.html + +Not sure, but I think that could help to fix at least some of the missing functions. + +That explains it. For most operations that approach works well as basically nobody uses the 80 bit formats directly anyways. Unfortunately asinh() is very badly conditioned in the region tested and it is not enough. + +A possible approach to fix this would be to use long double (128 bit) were available instead of just double and to document this limitation for double. + +The test code from comment #1 now prints out the correct value with QEMU v4.1, so I think this has been fixed with the softfloat work that has been done within the last year. + +To be sure, you can also run my original C test code from +2010. If that produces a bit-identtical result, then this +has indeed been fixed. If there are deviations in the last +digits, then the fingerprinting issues is still there, but +at least Python has stopped complaining. 
+ +Regards, +Arno + +On Fri, Aug 16, 2019 at 07:47:59 CEST, Thomas Huth wrote: +> The test code from comment #1 now prints out the correct value with QEMU +> v4.1, so I think this has been fixed with the softfloat work that has +> been done within the last year. +> +> ** Changed in: qemu +> Status: Confirmed => Fix Released +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/645662 +> +> Title: +> QEMU x87 emulation of trig and other complex ops is only at 64-bit +> precision, not 80-bit +> +> Status in QEMU: +> Fix Released +> +> Bug description: +> When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +> gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +> 3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +> on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. +> +> Regression testing errors: +> +> test_cmath +> test test_cmath failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in +> self.fail(error_message) +> AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +> Expected: complex(3.141592653589793, -2.1073424255447014e-08) +> Received: complex(3.141592653589793, -2.1073424338879928e-08) +> Received value insufficiently close to expected value. 
+> +> +> test_float +> test test_float failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in +> self.assertEqual(s, repr(float(s))) +> AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' +> +> +> test_math +> test test_math failed -- multiple errors occurred; run in verbose mode for deta +> +> => +> +> runtests.sh -v test_math +> +> le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +> test_math BAD +> 1 BAD +> 0 GOOD +> 0 SKIPPED +> 1 total +> le01:~/tools/python3/Python-3.1.2# +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/645662/+subscriptions + +-- +Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: <email address hidden> +GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718 +---- +A good decision is based on knowledge and not on numbers. -- Plato + +If it's in the news, don't worry about it. The very definition of +"news" is "something that hardly ever happens." -- Bruce Schneier + + +Looking at our code we're still implementing the x87 insns FSIN, FCOS, FSINCOS, FPTAN, FPATAN, F2XM1, FYL2X, FYL2XP1 by "convert the floatx80 to a host double and use the host C library functions", so I think this bug is still unfixed. If the C program in comment 1 and/or the Python code has stopped reporting failures it's probably just because the guest C library routines have stopped using the x87 80-bit FPU instructions internally. + + +Fine by me. I suggest to keep tracking this though, if necessary +in another bug item. + +Regards, +Arno + + +On Fri, Aug 16, 2019 at 16:06:29 CEST, Peter Maydell wrote: +> Looking at our code we're still implementing the x87 insns FSIN, FCOS, +> FSINCOS, FPTAN, FPATAN, F2XM1, FYL2X, FYL2XP1 by "convert the floatx80 +> to a host double and use the host C library functions", so I think this +> bug is still unfixed. 
If the C program in comment 1 and/or the Python +> code has stopped reporting failures it's probably just because the guest +> C library routines have stopped using the x87 80-bit FPU instructions +> internally. +> +> +> ** Changed in: qemu +> Status: Fix Released => Confirmed +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/645662 +> +> Title: +> QEMU x87 emulation of trig and other complex ops is only at 64-bit +> precision, not 80-bit +> +> Status in QEMU: +> Confirmed +> +> Bug description: +> When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +> gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +> 3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +> on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. +> +> Regression testing errors: +> +> test_cmath +> test test_cmath failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in +> self.fail(error_message) +> AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +> Expected: complex(3.141592653589793, -2.1073424255447014e-08) +> Received: complex(3.141592653589793, -2.1073424338879928e-08) +> Received value insufficiently close to expected value. 
+> +> +> test_float +> test test_float failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in +> self.assertEqual(s, repr(float(s))) +> AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' +> +> +> test_math +> test test_math failed -- multiple errors occurred; run in verbose mode for deta +> +> => +> +> runtests.sh -v test_math +> +> le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +> test_math BAD +> 1 BAD +> 0 GOOD +> 0 SKIPPED +> 1 total +> le01:~/tools/python3/Python-3.1.2# +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/645662/+subscriptions + +-- +Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: <email address hidden> +GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718 +---- +A good decision is based on knowledge and not on numbers. -- Plato + +If it's in the news, don't worry about it. The very definition of +"news" is "something that hardly ever happens." -- Bruce Schneier + + +For 5.1 (commits 1f18a1e6ab8368a4, 5eebc49d2d0aa5fc7e, 5ef396e2ba865f34a, eca30647fc078f4) we reimplemented FPATAN, FYL2X, FYL2XP1, FPREM, FPREM1, F2XM1 to do proper 80-bit precision operations. However the trig operations FPTAN, FSINCOS, FSIN, FCOS are still implemented as naive "convert to host double and use host C library functions". + + +Talk about a "blast form the past!" This Bug is now over 10 years old. +But at least somebody is still working on it and it was not +just quietly dropped. I can respect that. + +My original recommendation stands: At least use long double for the +calcuations where available. + +Regards, +Arno + +On Sun, Nov 22, 2020 at 00:21:50 CET, Peter Maydell wrote: +> For 5.1 (commits 1f18a1e6ab8368a4, 5eebc49d2d0aa5fc7e, +> 5ef396e2ba865f34a, eca30647fc078f4) we reimplemented FPATAN, FYL2X, +> FYL2XP1, FPREM, FPREM1, F2XM1 to do proper 80-bit precision operations. 
+> However the trig operations FPTAN, FSINCOS, FSIN, FCOS are still +> implemented as naive "convert to host double and use host C library +> functions". +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/645662 +> +> Title: +> QEMU x87 emulation of trig and other complex ops is only at 64-bit +> precision, not 80-bit +> +> Status in QEMU: +> Confirmed +> +> Bug description: +> When doing the regression tests for Python 3.1.2 with Qemu 0.12.5, (Linux version 2.6.26-2-686 (Debian 2.6.26-25lenny1)), +> gcc (Debian 4.3.2-1.1) 4.3.2, Python compiled from sources within qemu, +> 3 math tests fail, apparently because the floating point unit is buggy. Qmeu was compiled from original sources +> on Debian Lenny with kernel 2.6.34.6 from kernel.org, gcc (Debian 4.3.2-1.1) 4.3. +> +> Regression testing errors: +> +> test_cmath +> test test_cmath failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_cmath.py", line 364, in +> self.fail(error_message) +> AssertionError: acos0034: acos(complex(-1.0000000000000002, 0.0)) +> Expected: complex(3.141592653589793, -2.1073424255447014e-08) +> Received: complex(3.141592653589793, -2.1073424338879928e-08) +> Received value insufficiently close to expected value. 
+> +> +> test_float +> test test_float failed -- Traceback (most recent call last): +> File "/root/tools/python3/Python-3.1.2/Lib/test/test_float.py", line 479, in +> self.assertEqual(s, repr(float(s))) +> AssertionError: '8.72293771110361e+25' != '8.722937711103609e+25' +> +> +> test_math +> test test_math failed -- multiple errors occurred; run in verbose mode for deta +> +> => +> +> runtests.sh -v test_math +> +> le01:~/tools/python3/Python-3.1.2# ./runtests.sh -v test_math +> test_math BAD +> 1 BAD +> 0 GOOD +> 0 SKIPPED +> 1 total +> le01:~/tools/python3/Python-3.1.2# +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/645662/+subscriptions + +-- +Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: <email address hidden> +GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718 +---- +A good decision is based on knowledge and not on numbers. -- Plato + +If it's in the news, don't worry about it. The very definition of +"news" is "something that hardly ever happens." -- Bruce Schneier + + +In the meantime, target/m68k has implemented these trig +functions for its (only slightly different) 96-bit +extended-float format. + +With a minor amount of work this code could be shared. + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. 
+Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/83 + + diff --git a/results/classifier/zero-shot/105/semantic/691424 b/results/classifier/zero-shot/105/semantic/691424 new file mode 100644 index 000000000..f6f70bbe5 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/691424 @@ -0,0 +1,101 @@ +semantic: 0.906 +boot: 0.871 +network: 0.870 +assembly: 0.863 +device: 0.859 +other: 0.858 +mistranslation: 0.854 +graphic: 0.853 +instruction: 0.843 +socket: 0.836 +vnc: 0.768 +KVM: 0.750 + +qemu/kvm SDL over ssh -X broken + +qemu/kvm by default uses SDL to render the output of its emulated VGA graphics. +This is broken over ssh -X since quite a while. +The only workaround I know, is to use qemu -vnc :0 +and connect using vncviewer + + +How To Reproduce: +1. zypper in qemu +2. ssh -X localhost qemu -cdrom ANYISOFILE + +Actual Results: +qemu hangs in an endless loop on the BIOS display screen + +Expected Results: +should boot up the iso as 0.10 versions did + +Reproducible: Always + + +this is what broke it: +$ git bisect bad +c18a2c360e3100bbd71162cf922dcd8c429a8b71 is first bad commit +commit c18a2c360e3100bbd71162cf922dcd8c429a8b71 +Author: Stefano Stabellini <email address hidden> +Date: Wed Jun 24 11:58:25 2009 +0100 + + sdl zooming + + Hi all, + this patch implements zooming capabilities for the sdl interface. + A new sdl_zoom_blit function is added that is able to scale and blit a + portion of a surface into another. + This way we can enable SDL_RESIZABLE and have a real_screen surface with + a different size than the guest surface and let sdl_zoom_blit take care + of the problem. 
+ + Signed-off-by: Stefano Stabellini <email address hidden> + Signed-off-by: Anthony Liguori <email address hidden> + +:100644 100644 a06c9bfc22cc6de1c6e5e9068d6bf59d89613767 f8dc5065dd27010bfdbb6bcfb0c6e3af25024cdb M Makefile +:100644 100644 417217582363a87ee67e746ba798e285a64b6cdc 35183399f65de6f50f3baa4767ab7d4d11d45bca M console.h +:100644 100644 178b5532b8d9dd2194a8662fbfdcd49b4bc04222 d81399e51276e1c97fa1f7272ef16ea4c312b51b M sdl.c +:000000 100644 0000000000000000000000000000000000000000 56d3604fc3d79e4cc4622be8437c78bf70075da3 A sdl_zoom.c +:000000 100644 0000000000000000000000000000000000000000 33dc63408b43a37fd6b1acde3fa62b1a51315e75 A sdl_zoom.h +:000000 100644 0000000000000000000000000000000000000000 64bbca849bd3af678c2259b4d8cc0e48c6a6b43c A sdl_zoom_template.h + + +This problem occurs on both Debian and openSUSE. + +One possible way to get X11-forwarding back on qemu master is to disable zoom by this patch. + +But I do not know why the do_sdl_resize function should be problematic. +There is probably a better solution. + +Hi, + +I tried this with a current (git) build, and I'm not able to reproduce it. + +I do see a problem with a bad initial SDL window size (its much too small) on a remote machine over a moderate-level network (wireless LAN). I don't see that when ssh-ing to localhost (even though both hosts are basically the same). + +I do see differences between current (git) qemu and the 0.12.5 version. Current git boots the ISO, but doesn't appear to get to the login screen. + +I'm not sure what the differences between our configurations are. I have SDL 1.2.14-6ubuntu3 + +Still investigation. + +Brad + + + +I now found that it depends on my client side. The bug happens when I ssh -XC from my netbook with 1024x600(intel) to a server, but when I ssh -XC to the same server from my laptop with 1024x768(fbdev), then it works. +So might be that the scaling code that made the difference in my bisecting, is only used for small screens. 
+ +I can confirm this: + +My client: + +Ubuntu 12.04 LTS via ssh -X with Gnome - Terminal 3.4.1.1 and compiz enabled +3.2.0-33-generic #52-Ubuntu SMP Thu Oct 18 16:29:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux + +Can you still reproduce this issue with the latest version of QEMU, with the latest version of SDL? + +It seems to be working now with current versions, so has probably been fixed somewhere. + +OK, thanks a lot for testing it again! + diff --git a/results/classifier/zero-shot/105/semantic/714 b/results/classifier/zero-shot/105/semantic/714 new file mode 100644 index 000000000..d13c241d9 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/714 @@ -0,0 +1,56 @@ +semantic: 0.914 +graphic: 0.775 +device: 0.697 +instruction: 0.693 +vnc: 0.667 +assembly: 0.487 +mistranslation: 0.482 +socket: 0.476 +network: 0.387 +boot: 0.277 +other: 0.232 +KVM: 0.188 + +Command line arguments are not passed correctly with user-space semihosting +Description of problem: +The emulated process always receives a value of 1 for `argc`, with `argv[0]` returning seemingly random characters (in Ubuntu packaged qemu 5.2), but correlating with command-line input (output below from master built qemu 6.1): +``` +$ qemu-arm -cpu cortex-m7 ./a.out 123 test +argc: 1 +argv: + - @@@ + +$ qemu-arm -cpu cortex-m7 ./a.out +argc: 1 +argv: + [0] @ +``` +Steps to reproduce: +1. Compile the following program with [ARM embedded toolchain](https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads): +```cpp +#include <iostream> + +int main(int argc, char* argv[]) { + std::cout << "argc: " << argc << "\n"; + std::cout << "argv: \n"; + + for (int i = 0; i < argc; i++) + std::cout << " [" << i << "] " << argv[i] << "\n"; + return 0; +} +``` + +``` +$ $CXX --version +arm-none-eabi-g++ (GNU Arm Embedded Toolchain 10-2020-q4-major) 10.2.1 20201103 (release) +Copyright (C) 2020 Free Software Foundation, Inc. 
+This is free software; see the source for copying conditions. There is NO +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + +$ $CXX main.cpp --specs=rdimon.specs -mcpu=cortex-m7 +``` + +2. Run in user-space (semihosted): +``` +$ qemu-arm -cpu cortex-m7 ./a.out +``` diff --git a/results/classifier/zero-shot/105/semantic/757702 b/results/classifier/zero-shot/105/semantic/757702 new file mode 100644 index 000000000..66e69998c --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/757702 @@ -0,0 +1,854 @@ +semantic: 0.809 +instruction: 0.784 +socket: 0.773 +mistranslation: 0.767 +vnc: 0.731 +network: 0.717 +device: 0.711 +KVM: 0.703 +assembly: 0.701 +graphic: 0.695 +boot: 0.649 +other: 0.582 + +ARM: singlestepping insn which UNDEFs should stop at UNDEF vector insn, not after it + +ARMv7-A has many undefined instructions in its opcode space. These undefined instructions are very useful for replacing sensitive non-privileged instructions of guest operating systems (virtualization). The undefined instruction exception vector is at <exception_base> + 0x4, where <exception_base> can be 0x0 or 0xfff00000. Currently, in qemu 0.14.0, the undefined instruction exception is entered at offset 0x8 instead of 0x4. This was not a problem with qemu 0.13.0, so this seems to be a new bug. As an example, if we try to execute the value "0xec019800" in qemu 0.14.0, it should cause an undefined instruction exception at <exception_base>+0x4, since "0xec019800" is an undefined instruction. + +I can't reproduce this (either with current trunk or with the qemu 0.14.0 release version). Also, if we were directing UNDEF exceptions to the SVC entry point, I think it would cause fairly obvious breakage of Linux guests. 
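For readers following the 0x4 versus 0x8 question, the two offsets being compared can be written out as a small lookup. This is only an illustrative sketch of the standard A32 exception vector layout (one 4-byte slot per exception); the names `VECTOR_OFFSETS`, `vector_address` and `exception_at` are hypothetical and not part of QEMU.

```python
# Illustrative sketch only: standard A32 exception vector offsets.
# All names here are hypothetical helpers, not QEMU code.
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined_instruction": 0x04,
    "supervisor_call": 0x08,   # SVC, formerly SWI
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "not_used": 0x14,
    "irq": 0x18,
    "fiq": 0x1C,
}


def vector_address(exception, base=0x0):
    """Return the address of the vector slot for `exception`."""
    return base + VECTOR_OFFSETS[exception]


def exception_at(address, base=0x0):
    """Return the exception whose vector slot is at `address`, or None."""
    offset = address - base
    for name, slot in VECTOR_OFFSETS.items():
        if slot == offset:
            return name
    return None


if __name__ == "__main__":
    print(hex(vector_address("undefined_instruction")))
    print(exception_at(0x8))
```

With a base of 0x0, an UNDEF should land on the slot at 0x4, while 0x8 is the SVC slot, which is why a stop at 0x8 looked like a misdirected exception.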
+ +I'm going to attach the test program I used to confirm that we are correctly directing the exception to the 0x4 vector: + +./arm-softmmu/qemu-system-arm -kernel ~/linaro/qemu-misc-tests/undef-exc.axf -semihosting +Starting test +In undef vector + +I'll also attach the binary, since it's only 2K and the source needs armcc to build. + +If you can provide a simple test program and qemu command line which demonstrates the behaviour you think is incorrect I can investigate further. + + + + + + +> ARMv7a has lot of undefined instruction from its instruction opcode space. This undefined instructions +>are very useful for replacing sensitive non-priviledged instructions of guest operating systems (virtualization). + +PS: please don't use arbitrary UNDEF instruction patterns for this (the one you quoted is in the STC instruction space for example). There's an officially-defined "permanently UNDEF" space: + cond 0111 1111 xxxx xxxx xxxx 1111 xxxx +available for this purpose, which will mean you don't have to worry about newer versions of the architecture allocating the UNDEF patterns you were using. + + +Hi, + +You are right, I have deliberately used an instruction from a "permanently +UNDEF" space. I have used this instruction because thats this are the only +UNDEF instructions with maximum payload of 20 bits. + +Also, the instruction "0xec019800" does not belong to STC instruction space. +GNU object dump does not display it as undefined instruction due to internal +bug, but it is definitely an undefined instruction. +May be the undefined instructions from "permanently UNDEF" space are only +executing from offset 0x8 in QEMU 0.14.0. It used to work fine with QEMU +0.13.0. + +PFA, my test elf file. The UNDEF instruction that i have reported is +at location 0x100058 in this elf file. The execution of elf file starts from +0x100000. 
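The encoding question in this exchange can be checked mechanically. The sketch below is a minimal Python illustration, assuming the "permanently UNDEF" bit pattern quoted above (cond 0111 1111 xxxx xxxx xxxx 1111 xxxx) and the usual A32 coprocessor load/store field layout; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.

def in_permanently_undef_space(insn):
    """True if insn matches cond 0111 1111 xxxx xxxx xxxx 1111 xxxx."""
    return ((insn >> 20) & 0xFF) == 0x7F and ((insn >> 4) & 0xF) == 0xF


def decode_coproc_ldst(insn):
    """Split a 32-bit A32 word into coprocessor load/store fields."""
    return {
        "cond": (insn >> 28) & 0xF,
        "op": (insn >> 25) & 0x7,      # 0b110 for the LDC/STC class
        "P": (insn >> 24) & 1,
        "U": (insn >> 23) & 1,
        "D": (insn >> 22) & 1,
        "W": (insn >> 21) & 1,
        "L": (insn >> 20) & 1,         # 0 = store (STC), 1 = load (LDC)
        "Rn": (insn >> 16) & 0xF,
        "CRd": (insn >> 12) & 0xF,
        "coproc": (insn >> 8) & 0xF,
        "imm8": insn & 0xFF,
    }


if __name__ == "__main__":
    word = 0xEC019800
    print(in_permanently_undef_space(word))
    print(decode_coproc_ldst(word))
```

For 0xec019800 the fields come out as op=0b110 with P, U, D, W and L all zero, Rn=1, CRd=9, coproc=8, and the word does not match the permanently UNDEF pattern.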
+ +I have launched qemu with command: ./qemu-system-arm -s -S -M realview-pb-a8 +-serial stdio -kernel ../../../xvisor/tests/armv7/pb-a8/arm_test.elf +I am debugging using gdb command: arm-none-eabi-gdb arm_test.elf +--eval-command="target remote localhost:1234" + +Please let me know if you are not able to reproduce the bug. + +--Anup + +On Tue, Apr 12, 2011 at 3:13 PM, Peter Maydell <email address hidden>wrote: + +> > ARMv7a has lot of undefined instruction from its instruction opcode +> space. This undefined instructions +> >are very useful for replacing sensitive non-priviledged instructions of +> guest operating systems (virtualization). +> +> PS: please don't use arbitrary UNDEF instruction patterns for this (the one +> you quoted is in the STC instruction space for example). There's an +> officially-defined "permanently UNDEF" space: +> cond 0111 1111 xxxx xxxx xxxx 1111 xxxx +> available for this purpose, which will mean you don't have to worry about +> newer versions of the architecture allocating the UNDEF patterns you were +> using. +> +> -- +> You received this bug notification because you are a direct subscriber +> of the bug. +> https://bugs.launchpad.net/bugs/757702 +> +> Title: +> Undefined instruction exception starts at offset 0x8 instead of 0x4 +> +> Status in QEMU: +> New +> +> Bug description: +> ARMv7a has lot of undefined instruction from its instruction opcode +> space. This undefined instructions are very useful for replacing +> sensitive non-priviledged instructions of guest operating systems +> (virtualization). The undefined instruction exception executes at +> <exception_base> + 0x4, where <exception_base> can be 0x0 or +> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +> seems like this is a new bug. 
As as example, if we try to execute +> value "0xec019800" in qemu 0.14.0 then it should cause undefined +> exception at <exception_base>+0x4 since "0xec019800" is an undefined +> instruction. +> +> To unsubscribe from this bug, go to: +> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +> + + +Hi + +The correct command to launch qemu will be: ./qemu-system-arm -s -S -M +realview-pb-a8 -serial stdio -kernel arm_test.elf +Sorry, for mistake in previous mail. + +--Anup + +On Tue, Apr 12, 2011 at 3:48 PM, Anup Patel +<email address hidden>wrote: + +> Hi, +> +> You are right, I have deliberately used an instruction from a "permanently +> UNDEF" space. I have used this instruction because thats this are the only +> UNDEF instructions with maximum payload of 20 bits. +> +> Also, the instruction "0xec019800" does not belong to STC instruction +> space. GNU object dump does not display it as undefined instruction due to +> internal bug, but it is definitely an undefined instruction. +> May be the undefined instructions from "permanently UNDEF" space are only +> executing from offset 0x8 in QEMU 0.14.0. It used to work fine with QEMU +> 0.13.0. +> +> PFA, my test elf file. The UNDEF instruction that i have reported is +> at location 0x100058 in this elf file. The execution of elf file starts from +> 0x100000. +> +> I have launched qemu with command: ./qemu-system-arm -s -S -M +> realview-pb-a8 -serial stdio -kernel +> ../../../xvisor/tests/armv7/pb-a8/arm_test.elf +> I am debugging using gdb command: arm-none-eabi-gdb arm_test.elf +> --eval-command="target remote localhost:1234" +> +> Please let me know if you are not able to reproduce the bug. +> +> --Anup +> +> On Tue, Apr 12, 2011 at 3:13 PM, Peter Maydell <email address hidden>wrote: +> +>> > ARMv7a has lot of undefined instruction from its instruction opcode +>> space. 
This undefined instructions +>> >are very useful for replacing sensitive non-priviledged instructions of +>> guest operating systems (virtualization). +>> +>> PS: please don't use arbitrary UNDEF instruction patterns for this (the +>> one you quoted is in the STC instruction space for example). There's an +>> officially-defined "permanently UNDEF" space: +>> cond 0111 1111 xxxx xxxx xxxx 1111 xxxx +>> available for this purpose, which will mean you don't have to worry about +>> newer versions of the architecture allocating the UNDEF patterns you were +>> using. +>> +>> -- +>> You received this bug notification because you are a direct subscriber +>> of the bug. +>> https://bugs.launchpad.net/bugs/757702 +>> +>> Title: +>> Undefined instruction exception starts at offset 0x8 instead of 0x4 +>> +>> Status in QEMU: +>> New +>> +>> Bug description: +>> ARMv7a has lot of undefined instruction from its instruction opcode +>> space. This undefined instructions are very useful for replacing +>> sensitive non-priviledged instructions of guest operating systems +>> (virtualization). The undefined instruction exception executes at +>> <exception_base> + 0x4, where <exception_base> can be 0x0 or +>> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +>> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +>> seems like this is a new bug. As as example, if we try to execute +>> value "0xec019800" in qemu 0.14.0 then it should cause undefined +>> exception at <exception_base>+0x4 since "0xec019800" is an undefined +>> instruction. +>> +>> To unsubscribe from this bug, go to: +>> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +>> +> +> + + +> Also, the instruction "0xec019800" does not belong to STC instruction space. + +Yes it does. STC encoding A1 is: cond:4 110 p u d w 0 rn:4 crd:4 coproc:4 imm:8 +For STC the combination of P=0 U=0 D=0 W=0 is UNDEFINED, but it's still in STC space. 
This is not "permanently UNDEF", it might be allocated to do something in future. + +> PFA, my test elf file. + +Thanks. Your test case appears to be broken in that it doesn't actually set up the vector table at address 0: +cam-vm-266:karmic:qemu-misc-tests$ objdump --disassemble ~/Desktop/arm_test.elf |less + +[...] +Disassembly of section .text: + +00100000 <_start_vect>: + 100000: e59ff018 ldr pc, [pc, #24] ; 100020 <__reset> + 100004: e59ff018 ldr pc, [pc, #24] ; 100024 <__undefined_instruction> + 100008: e59ff018 ldr pc, [pc, #24] ; 100028 <__software_interrupt> + 10000c: e59ff018 ldr pc, [pc, #24] ; 10002c <__prefetch_abort> + 100010: e59ff018 ldr pc, [pc, #24] ; 100030 <__data_abort> + 100014: e59ff018 ldr pc, [pc, #24] ; 100034 <__not_used> + 100018: e59ff018 ldr pc, [pc, #24] ; 100038 <__irq> + 10001c: e59ff018 ldr pc, [pc, #24] ; 10003c <__fiq> + +So what happens is: +0x00100000: e59ff018 ldr pc, [pc, #24] # qemu starts us at the ELF entry point +0x00100054: e3a08000 mov r8, #0 ; 0x0 +0x00100054: e3a08000 mov r8, #0 ; 0x0 +0x00100058: ec019800 stc 8, cr9, [r1], {0} # here's our UNDEF +0x00000004: 00000000 andeq r0, r0, r0 # jump to UNDEF vector at 0x4 as expected +...but since nothing was loaded at address 0 the code is all NOPs and we just execute through it... +0x00000008: 00000000 andeq r0, r0, r0 +0x0000000c: 00000000 andeq r0, r0, r0 +0x00000010: 00000000 andeq r0, r0, r0 +...etc... + +and eventually we fall into the actual image start at 0x100000, and we go round in a big loop. + +You can tell we're going to the correct vector if you ask gdb to put a breakpoint there with "break *0x4" -- we hit it after executing the undef. + + +Actually, the undefined instruction that I have used is documented as +undefined at two places in "ARM Instruction Set Encoding" section of ARMv7a +reference manual: +1. Refer "Table A5-22 Supervisor Call, and coprocessor instructions" +2. 
Refer "A8.6.188 STC, STC2" + +So you see one can easily get confused that this instruction belongs to STC +space. Actually speaking this UNDEF instruction spans not only in STC space +but also in LDC space. + +--Anup + +On Tue, Apr 12, 2011 at 4:19 PM, Peter Maydell <email address hidden>wrote: + +> > Also, the instruction "0xec019800" does not belong to STC instruction +> space. +> +> Yes it does. STC encoding A1 is: cond:4 110 p u d w 0 rn:4 crd:4 coproc:4 +> imm:8 +> For STC the combination of P=0 U=0 D=0 W=0 is UNDEFINED, but it's still in +> STC space. This is not "permanently UNDEF", it might be allocated to do +> something in future. +> +> > PFA, my test elf file. +> +> Thanks. Your test case appears to be broken in that it doesn't actually set +> up the vector table at address 0: +> cam-vm-266:karmic:qemu-misc-tests$ objdump --disassemble +> ~/Desktop/arm_test.elf |less +> +> [...] +> Disassembly of section .text: +> +> 00100000 <_start_vect>: +> 100000: e59ff018 ldr pc, [pc, #24] ; 100020 <__reset> +> 100004: e59ff018 ldr pc, [pc, #24] ; 100024 +> <__undefined_instruction> +> 100008: e59ff018 ldr pc, [pc, #24] ; 100028 +> <__software_interrupt> +> 10000c: e59ff018 ldr pc, [pc, #24] ; 10002c +> <__prefetch_abort> +> 100010: e59ff018 ldr pc, [pc, #24] ; 100030 +> <__data_abort> +> 100014: e59ff018 ldr pc, [pc, #24] ; 100034 +> <__not_used> +> 100018: e59ff018 ldr pc, [pc, #24] ; 100038 <__irq> +> 10001c: e59ff018 ldr pc, [pc, #24] ; 10003c <__fiq> +> +> So what happens is: +> 0x00100000: e59ff018 ldr pc, [pc, #24] # qemu starts us at the ELF +> entry point +> 0x00100054: e3a08000 mov r8, #0 ; 0x0 +> 0x00100054: e3a08000 mov r8, #0 ; 0x0 +> 0x00100058: ec019800 stc 8, cr9, [r1], {0} # here's our UNDEF +> 0x00000004: 00000000 andeq r0, r0, r0 # jump to UNDEF vector +> at 0x4 as expected +> ...but since nothing was loaded at address 0 the code is all NOPs and we +> just execute through it... 
+> 0x00000008: 00000000 andeq r0, r0, r0 +> 0x0000000c: 00000000 andeq r0, r0, r0 +> 0x00000010: 00000000 andeq r0, r0, r0 +> ...etc... +> +> and eventually we fall into the actual image start at 0x100000, and we +> go round in a big loop. +> +> You can tell we're going to the correct vector if you ask gdb to put a +> breakpoint there with "break *0x4" -- we hit it after executing the +> undef. +> +> -- +> You received this bug notification because you are a direct subscriber +> of the bug. +> https://bugs.launchpad.net/bugs/757702 +> +> Title: +> Undefined instruction exception starts at offset 0x8 instead of 0x4 +> +> Status in QEMU: +> New +> +> Bug description: +> ARMv7a has lot of undefined instruction from its instruction opcode +> space. This undefined instructions are very useful for replacing +> sensitive non-priviledged instructions of guest operating systems +> (virtualization). The undefined instruction exception executes at +> <exception_base> + 0x4, where <exception_base> can be 0x0 or +> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +> seems like this is a new bug. As as example, if we try to execute +> value "0xec019800" in qemu 0.14.0 then it should cause undefined +> exception at <exception_base>+0x4 since "0xec019800" is an undefined +> instruction. +> +> To unsubscribe from this bug, go to: +> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +> + + +Also, in the test case hits 0x8 after encountering UNDEF instruction at +0x100058. +The test case is not broken it failed in initialization sequence itself. + +PS: I had download most recent version of QEMU 0.14.0 and build it my self. + +On Tue, Apr 12, 2011 at 4:33 PM, Anup Patel +<email address hidden>wrote: + +> Actually, the undefined instruction that I have used is documented as +> undefined at two places in "ARM Instruction Set Encoding" section of ARMv7a +> reference manual: +> 1. 
Refer "Table A5-22 Supervisor Call, and coprocessor instructions" +> 2. Refer "A8.6.188 STC, STC2" +> +> So you see one can easily get confused that this instruction belongs to STC +> space. Actually speaking this UNDEF instruction spans not only in STC space +> but also in LDC space. +> +> --Anup +> +> On Tue, Apr 12, 2011 at 4:19 PM, Peter Maydell <email address hidden>wrote: +> +>> > Also, the instruction "0xec019800" does not belong to STC instruction +>> space. +>> +>> Yes it does. STC encoding A1 is: cond:4 110 p u d w 0 rn:4 crd:4 coproc:4 +>> imm:8 +>> For STC the combination of P=0 U=0 D=0 W=0 is UNDEFINED, but it's still in +>> STC space. This is not "permanently UNDEF", it might be allocated to do +>> something in future. +>> +>> > PFA, my test elf file. +>> +>> Thanks. Your test case appears to be broken in that it doesn't actually +>> set up the vector table at address 0: +>> cam-vm-266:karmic:qemu-misc-tests$ objdump --disassemble +>> ~/Desktop/arm_test.elf |less +>> +>> [...] 
+>> Disassembly of section .text: +>> +>> 00100000 <_start_vect>: +>> 100000: e59ff018 ldr pc, [pc, #24] ; 100020 <__reset> +>> 100004: e59ff018 ldr pc, [pc, #24] ; 100024 +>> <__undefined_instruction> +>> 100008: e59ff018 ldr pc, [pc, #24] ; 100028 +>> <__software_interrupt> +>> 10000c: e59ff018 ldr pc, [pc, #24] ; 10002c +>> <__prefetch_abort> +>> 100010: e59ff018 ldr pc, [pc, #24] ; 100030 +>> <__data_abort> +>> 100014: e59ff018 ldr pc, [pc, #24] ; 100034 +>> <__not_used> +>> 100018: e59ff018 ldr pc, [pc, #24] ; 100038 <__irq> +>> 10001c: e59ff018 ldr pc, [pc, #24] ; 10003c <__fiq> +>> +>> So what happens is: +>> 0x00100000: e59ff018 ldr pc, [pc, #24] # qemu starts us at the +>> ELF entry point +>> 0x00100054: e3a08000 mov r8, #0 ; 0x0 +>> 0x00100054: e3a08000 mov r8, #0 ; 0x0 +>> 0x00100058: ec019800 stc 8, cr9, [r1], {0} # here's our UNDEF +>> 0x00000004: 00000000 andeq r0, r0, r0 # jump to UNDEF +>> vector at 0x4 as expected +>> ...but since nothing was loaded at address 0 the code is all NOPs and we +>> just execute through it... +>> 0x00000008: 00000000 andeq r0, r0, r0 +>> 0x0000000c: 00000000 andeq r0, r0, r0 +>> 0x00000010: 00000000 andeq r0, r0, r0 +>> ...etc... +>> +>> and eventually we fall into the actual image start at 0x100000, and we +>> go round in a big loop. +>> +>> You can tell we're going to the correct vector if you ask gdb to put a +>> breakpoint there with "break *0x4" -- we hit it after executing the +>> undef. +>> +>> -- +>> You received this bug notification because you are a direct subscriber +>> of the bug. +>> https://bugs.launchpad.net/bugs/757702 +>> +>> Title: +>> Undefined instruction exception starts at offset 0x8 instead of 0x4 +>> +>> Status in QEMU: +>> New +>> +>> Bug description: +>> ARMv7a has lot of undefined instruction from its instruction opcode +>> space. This undefined instructions are very useful for replacing +>> sensitive non-priviledged instructions of guest operating systems +>> (virtualization). 
The undefined instruction exception executes at +>> <exception_base> + 0x4, where <exception_base> can be 0x0 or +>> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +>> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +>> seems like this is a new bug. As as example, if we try to execute +>> value "0xec019800" in qemu 0.14.0 then it should cause undefined +>> exception at <exception_base>+0x4 since "0xec019800" is an undefined +>> instruction. +>> +>> To unsubscribe from this bug, go to: +>> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +>> +> +> + + +Sorry, I didn't notice the footnote in table A5-22; I see what you mean now. It's still not permanently-UNDEF space though and you'd really be better off using that instead. In any case, qemu does properly UNDEF the instruction so this is a bit of a diversion. + + +> Also, in the test case hits 0x8 after encountering UNDEF instruction at 0x100058. + +So if you run qemu like this: +qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s -S + +and run arm-none-gnueabi-gdb with no arguments and in gdb type these commands: + +(gdb) target remote :1234 +Remote debugging using :1234 +0x00100000 in ?? () +(gdb) break *0x4 +Breakpoint 1 at 0x4 +(gdb) break *0x8 +Breakpoint 2 at 0x8 +(gdb) c +Continuing. + +...what does gdb do? +(For me it says "Breakpoint 1, 0x00000004 in ?? ()" which is what I expect.) + + +I see 0x00000008 (). + +I am using qemu-0.14.0.tar.gz available for QEMU Downloads. + +--Anup + +On Tue, Apr 12, 2011 at 5:12 PM, Peter Maydell <email address hidden>wrote: + +> > Also, in the test case hits 0x8 after encountering UNDEF instruction +> at 0x100058. +> +> So if you run qemu like this: +> qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s -S +> +> and run arm-none-gnueabi-gdb with no arguments and in gdb type these +> commands: +> +> (gdb) target remote :1234 +> Remote debugging using :1234 +> 0x00100000 in ?? 
() +> (gdb) break *0x4 +> Breakpoint 1 at 0x4 +> (gdb) break *0x8 +> Breakpoint 2 at 0x8 +> (gdb) c +> Continuing. +> +> ...what does gdb do? +> (For me it says "Breakpoint 1, 0x00000004 in ?? ()" which is what I +> expect.) +> +> -- +> You received this bug notification because you are a direct subscriber +> of the bug. +> https://bugs.launchpad.net/bugs/757702 +> +> Title: +> Undefined instruction exception starts at offset 0x8 instead of 0x4 +> +> Status in QEMU: +> New +> +> Bug description: +> ARMv7a has lot of undefined instruction from its instruction opcode +> space. This undefined instructions are very useful for replacing +> sensitive non-priviledged instructions of guest operating systems +> (virtualization). The undefined instruction exception executes at +> <exception_base> + 0x4, where <exception_base> can be 0x0 or +> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +> seems like this is a new bug. As as example, if we try to execute +> value "0xec019800" in qemu 0.14.0 then it should cause undefined +> exception at <exception_base>+0x4 since "0xec019800" is an undefined +> instruction. +> +> To unsubscribe from this bug, go to: +> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +> + + +Try this out one last time. I am sure you will be able to replicate the +problem. + +Run qemu like this: +qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s -S + +and run arm-none-gnueabi-gdb with no arguments and in gdb type these +commands: + +(gdb) target remote :1234 +Remote debugging using :1234 +0x00100000 in ?? () +(gdb) si +0x00100054 in ?? () +(gdb) si +0x00100054 in ?? () +(gdb) si +0x00000008 in ?? () + +(I expect it to jump to 0x00000004 after 0x00100054) + +--Anup + +On Tue, Apr 12, 2011 at 5:40 PM, Anup Patel +<email address hidden>wrote: + +> I see 0x00000008 (). +> +> I am using qemu-0.14.0.tar.gz available for QEMU Downloads. 
+> +> --Anup +> +> +> On Tue, Apr 12, 2011 at 5:12 PM, Peter Maydell <email address hidden>wrote: +> +>> > Also, in the test case hits 0x8 after encountering UNDEF instruction +>> at 0x100058. +>> +>> So if you run qemu like this: +>> qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s -S +>> +>> and run arm-none-gnueabi-gdb with no arguments and in gdb type these +>> commands: +>> +>> (gdb) target remote :1234 +>> Remote debugging using :1234 +>> 0x00100000 in ?? () +>> (gdb) break *0x4 +>> Breakpoint 1 at 0x4 +>> (gdb) break *0x8 +>> Breakpoint 2 at 0x8 +>> (gdb) c +>> Continuing. +>> +>> ...what does gdb do? +>> (For me it says "Breakpoint 1, 0x00000004 in ?? ()" which is what I +>> expect.) +>> +>> -- +>> You received this bug notification because you are a direct subscriber +>> of the bug. +>> https://bugs.launchpad.net/bugs/757702 +>> +>> Title: +>> Undefined instruction exception starts at offset 0x8 instead of 0x4 +>> +>> Status in QEMU: +>> New +>> +>> Bug description: +>> ARMv7a has lot of undefined instruction from its instruction opcode +>> space. This undefined instructions are very useful for replacing +>> sensitive non-priviledged instructions of guest operating systems +>> (virtualization). The undefined instruction exception executes at +>> <exception_base> + 0x4, where <exception_base> can be 0x0 or +>> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +>> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +>> seems like this is a new bug. As as example, if we try to execute +>> value "0xec019800" in qemu 0.14.0 then it should cause undefined +>> exception at <exception_base>+0x4 since "0xec019800" is an undefined +>> instruction. +>> +>> To unsubscribe from this bug, go to: +>> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +>> +> +> + + +Hi, + +Were you able to replicate the problem with the steps that I had mentioned ? 
+The key thing is is if you don't set breakpoint at 0x4 or 0x8 just follow +the execution flow using "si" command of GDB. +You will definitely hit the problem. + +--Anup + +On Tue, Apr 12, 2011 at 5:57 PM, Anup Patel +<email address hidden>wrote: + +> Try this out one last time. I am sure you will be able to replicate the +> problem. +> +> Run qemu like this: +> qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s -S +> +> and run arm-none-gnueabi-gdb with no arguments and in gdb type these +> commands: +> +> (gdb) target remote :1234 +> Remote debugging using :1234 +> 0x00100000 in ?? () +> (gdb) si +> 0x00100054 in ?? () +> (gdb) si +> 0x00100054 in ?? () +> (gdb) si +> 0x00000008 in ?? () +> +> (I expect it to jump to 0x00000004 after 0x00100054) +> +> --Anup +> +> On Tue, Apr 12, 2011 at 5:40 PM, Anup Patel <<email address hidden> +> > wrote: +> +>> I see 0x00000008 (). +>> +>> I am using qemu-0.14.0.tar.gz available for QEMU Downloads. +>> +>> --Anup +>> +>> +>> On Tue, Apr 12, 2011 at 5:12 PM, Peter Maydell <email address hidden>wrote: +>> +>>> > Also, in the test case hits 0x8 after encountering UNDEF instruction +>>> at 0x100058. +>>> +>>> So if you run qemu like this: +>>> qemu-system-arm -M realview-pb-a8 -serial stdio -kernel arm_test.elf -s +>>> -S +>>> +>>> and run arm-none-gnueabi-gdb with no arguments and in gdb type these +>>> commands: +>>> +>>> (gdb) target remote :1234 +>>> Remote debugging using :1234 +>>> 0x00100000 in ?? () +>>> (gdb) break *0x4 +>>> Breakpoint 1 at 0x4 +>>> (gdb) break *0x8 +>>> Breakpoint 2 at 0x8 +>>> (gdb) c +>>> Continuing. +>>> +>>> ...what does gdb do? +>>> (For me it says "Breakpoint 1, 0x00000004 in ?? ()" which is what I +>>> expect.) +>>> +>>> -- +>>> You received this bug notification because you are a direct subscriber +>>> of the bug. 
+>>> https://bugs.launchpad.net/bugs/757702 +>>> +>>> Title: +>>> Undefined instruction exception starts at offset 0x8 instead of 0x4 +>>> +>>> Status in QEMU: +>>> New +>>> +>>> Bug description: +>>> ARMv7a has lot of undefined instruction from its instruction opcode +>>> space. This undefined instructions are very useful for replacing +>>> sensitive non-priviledged instructions of guest operating systems +>>> (virtualization). The undefined instruction exception executes at +>>> <exception_base> + 0x4, where <exception_base> can be 0x0 or +>>> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +>>> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +>>> seems like this is a new bug. As as example, if we try to execute +>>> value "0xec019800" in qemu 0.14.0 then it should cause undefined +>>> exception at <exception_base>+0x4 since "0xec019800" is an undefined +>>> instruction. +>>> +>>> To unsubscribe from this bug, go to: +>>> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +>>> +>> +>> +> + + +> Were you able to replicate the problem with the steps that I had mentioned ? +> The key thing is is if you don't set breakpoint at 0x4 or 0x8 just follow +> the execution flow using "si" command of GDB. +> You will definitely hit the problem. + +Ah, I had to find a different gdb to reproduce this with (arm-none-eabi-gdb from the 2010.09 codesourcery toolchain). That gdb does single-step-insn by asking the target to single step, and you get the behaviour above. The gdb I was using does single-step-insn by setting breakpoints at where it thinks the next insn will be, which means that "si" on the UNDEF never returns because gdb has set a bp at 0x10005c which we of course never hit. With the codesourcery gdb I see 'si' having the behaviour you describe above. + +However: + +(gdb) target remote :1234 +Remote debugging using :1234 +0x00100000 in ?? () +(gdb) break *0x4 +Breakpoint 1 at 0x4 +(gdb) si +0x00100054 in ?? 
() +(gdb) si +0x00100058 in ?? () +(gdb) si + +Breakpoint 1, 0x00000004 in ?? () + +ie if we set an explicit breakpoint at 0x4 we do hit it. I think it's just that the singlestep doesn't do what you expect: instead of stopping before we execute the instruction at the UNDEF vector, we first execute it and then stop afterwards [this sort of makes a twisted kind of sense if you think about it -- we never actually executed the UNDEF insn because we took the exception first, so single-step executes exactly one instruction which is the one at 0x4. However it's hopelessly confusing for the user so I'd consider it a bug.] + +Can you confirm that if you set the breakpoint as I do in the transcript above you see the same output? + +So I think that what is happening here is that misbehaviour by qemu's gdb interface is confusing you, rather than the actual qemu ARM implementation being wrong. + +If you revise your test program so that it installs some actual code into the vectors rather than leaving them all as NOPs I think this will be more obvious. + + +I think you are right. This seems to be more of a GDB issue. + +Any ways thanks for your support. + +--Anup + +On Wed, Apr 13, 2011 at 5:24 PM, Peter Maydell <email address hidden>wrote: + +> > Were you able to replicate the problem with the steps that I had +> mentioned ? +> > The key thing is is if you don't set breakpoint at 0x4 or 0x8 just follow +> > the execution flow using "si" command of GDB. +> > You will definitely hit the problem. +> +> Ah, I had to find a different gdb to reproduce this with (arm-none-eabi- +> gdb from the 2010.09 codesourcery toolchain). That gdb does single-step- +> insn by asking the target to single step, and you get the behaviour +> above. The gdb I was using does single-step-insn by setting breakpoints +> at where it thinks the next insn will be, which means that "si" on the +> UNDEF never returns because gdb has set a bp at 0x10005c which we of +> course never hit. 
With the codesourcery gdb I see 'si' having the +> behaviour you describe above. +> +> However: +> +> (gdb) target remote :1234 +> Remote debugging using :1234 +> 0x00100000 in ?? () +> (gdb) break *0x4 +> Breakpoint 1 at 0x4 +> (gdb) si +> 0x00100054 in ?? () +> (gdb) si +> 0x00100058 in ?? () +> (gdb) si +> +> Breakpoint 1, 0x00000004 in ?? () +> +> ie if we set an explicit breakpoint at 0x4 we do hit it. I think it's +> just that the singlestep doesn't do what you expect: instead of stopping +> before we execute the instruction at the UNDEF vector, we first execute +> it and then stop afterwards [this sort of makes a twisted kind of sense +> if you think about it -- we never actually executed the UNDEF insn +> because we took the exception first, so single-step executes exactly one +> instruction which is the one at 0x4. However it's hopelessly confusing +> for the user so I'd consider it a bug.] +> +> Can you confirm that if you set the breakpoint as I do in the transcript +> above you see the same output? +> +> So I think that what is happening here is that misbehaviour by qemu's +> gdb interface is confusing you, rather than the actual qemu ARM +> implementation being wrong. +> +> If you revise your test program so that it installs some actual code +> into the vectors rather than leaving them all as NOPs I think this will +> be more obvious. +> +> -- +> You received this bug notification because you are a direct subscriber +> of the bug. +> https://bugs.launchpad.net/bugs/757702 +> +> Title: +> Undefined instruction exception starts at offset 0x8 instead of 0x4 +> +> Status in QEMU: +> New +> +> Bug description: +> ARMv7a has lot of undefined instruction from its instruction opcode +> space. This undefined instructions are very useful for replacing +> sensitive non-priviledged instructions of guest operating systems +> (virtualization). 
The undefined instruction exception executes at +> <exception_base> + 0x4, where <exception_base> can be 0x0 or +> 0xfff00000. Currently, in qemu 0.14.0 undefined instruction fault at +> 0x8 offset instead of 0x4. This was not a problem with qemu 0.13.0, +> seems like this is a new bug. As as example, if we try to execute +> value "0xec019800" in qemu 0.14.0 then it should cause undefined +> exception at <exception_base>+0x4 since "0xec019800" is an undefined +> instruction. +> +> To unsubscribe from this bug, go to: +> https://bugs.launchpad.net/qemu/+bug/757702/+subscribe +> + + +> I think you are right. This seems to be more of a GDB issue. + +The debug stub is still part of QEMU, so let's not close this bug just yet :-) + + +Triaging old bug tickets ... can you somehow still reproduce this problem with the latest version of QEMU (currently v2.9), or could we close this ticket nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + +This is still a bug, we shouldn't have let it expire. + + +Fix has been included here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=ba3c35d9c4026361fd3 + diff --git a/results/classifier/zero-shot/105/semantic/855630 b/results/classifier/zero-shot/105/semantic/855630 new file mode 100644 index 000000000..78f87a1de --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/855630 @@ -0,0 +1,44 @@ +semantic: 0.773 +other: 0.737 +graphic: 0.652 +mistranslation: 0.602 +instruction: 0.558 +device: 0.512 +vnc: 0.498 +socket: 0.418 +network: 0.337 +boot: 0.331 +assembly: 0.226 +KVM: 0.150 + +Cant Run Wine (posix not nptl) past 0.14.1 + +When trying to build qemu, I can build with ./configure --static --enable-sdl --target-list=i386-linux-user just fine with 0.12.5. + +But when I try to go to 0.13.0 or higher (tested on 0.15.0), it says it can't find libSDL. + +Tried with arm and x86 versions of Ubuntu 9.10 and 11.04. Same on all 4 tests. 
+ +I found I could run posix wine on 0.12.5, but I can't go higher for posix wine because of that libSDL. + +Oh, I forgot to mention: you can compile qemu 0.13.0 or higher without --static and with --enable-sdl just fine, all the way up to 0.15.5. + +But with --static, it can't find libSDL. + +0.12.5 can do this just fine. It can do --static --enable-sdl together, and find my libSDL, and run posix wine. + +SDL is only of any use for the system emulation targets. If you're just building a linux-user target there is no point passing --enable-sdl to configure. Just use "./configure --static --target-list=i386-linux-user". + + +Thanks! You're right: I disabled SDL and wine posix still worked on 0.12.5, but not on 0.13.0 or higher. I thought SDL was the reason why. ??? =) + +Wine posix isn't fun to get. http://www.onsitedentalsystems.com/wine.tar.gz + +The binary files in there run fine for qemu-i386 0.12.5 but nothing higher than that. I don't know why. + +Triaging old bug tickets... The problem with the SDL static linking has likely been fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=5f37e6d4a7b22ccf1bb8fa4 +Can you still reproduce the other issue with the latest version of QEMU? Or could we close this ticket nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/zero-shot/105/semantic/876 b/results/classifier/zero-shot/105/semantic/876 new file mode 100644 index 000000000..e2af2d0d5 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/876 @@ -0,0 +1,47 @@ +semantic: 0.884 +graphic: 0.841 +device: 0.838 +instruction: 0.814 +socket: 0.791 +vnc: 0.730 +boot: 0.708 +network: 0.535 +other: 0.471 +mistranslation: 0.392 +KVM: 0.209 +assembly: 0.197 + +snek-arm fails on s390x with qemu >6.1 +Description of problem: +snek is a language inspired by Python for embedded devices. The tests run snek code natively (in this case on s390x), in python3, and emulated for arm. +The latter is what fails... 
+ +The Ubuntu testing has spotted this in: + +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220211_065108_2144a@/log.gz +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220212_050524_3b7ee@/log.gz +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220214_080226_46968@/log.gz + +All tests in there pass except one, which fails reproducibly: `test/pass-slice.py` + +After eliminating the automation in the makefiles and all that, it comes down to: +``` +$ qemu-system-arm -chardev stdio,mux=on,id=stdio0 -serial none -monitor none -semihosting-config enable=on,chardev=stdio0,arg='snek',arg=test/pass-slice.py -machine mps2-an385,accel=tcg -cpu cortex-m3 -kernel /usr/share/snek/snek-qemu-arm-1.7.elf -nographic -bios none +fail: [::-5] (model 'o' impl '') +``` + +To be clear: +- the test for python3 works on all platforms +- the test for snek-native works on all platforms +- the test for snek-arm works on all platforms except s390x +- with qemu 6.0 this worked, but the more recent qemu 6.2 makes it fail +- only some subtests of pass-slice.py fail (see below) + +I've gone into some details for the snek side of things in [the bug report there](https://github.com/keith-packard/snek/issues/58). +Steps to reproduce: +1. get an s390x system +2. get the snek elf file for arm +3. run qemu-system-arm as shown above + +P.S. I tried this on latest head (building qemu in an F35 container) and it fails there as well, hence I'm listing commit 2d88a3a595 as affected as well. +We know 6.0 was ok, so likely 6.0->6.1 brought the issue; I have not yet checked if a bisect is feasible for this. 
diff --git a/results/classifier/zero-shot/105/semantic/908 b/results/classifier/zero-shot/105/semantic/908 new file mode 100644 index 000000000..55d480f7d --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/908 @@ -0,0 +1,14 @@ +semantic: 0.560 +mistranslation: 0.412 +device: 0.369 +graphic: 0.365 +instruction: 0.291 +network: 0.231 +other: 0.153 +vnc: 0.099 +KVM: 0.055 +assembly: 0.050 +boot: 0.035 +socket: 0.014 + +since when is qemu-guest-agent included in the qemu package ? diff --git a/results/classifier/zero-shot/105/semantic/935 b/results/classifier/zero-shot/105/semantic/935 new file mode 100644 index 000000000..ba94d3a1a --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/935 @@ -0,0 +1,72 @@ +semantic: 0.792 +graphic: 0.779 +other: 0.769 +network: 0.768 +instruction: 0.760 +device: 0.708 +mistranslation: 0.707 +assembly: 0.698 +socket: 0.677 +vnc: 0.637 +KVM: 0.608 +boot: 0.567 + +insert ivshmem device into pci-bridge, but vm network disconnects +Description of problem: +To extend the number of PCI slots in a Windows vm, a new pci-bridge is created in the Windows vm as bus.1. But when I insert an ivshmem file on the host into this pci-bridge (bus.1), the Windows vm disconnects (it loses its remote desktop connection). +Steps to reproduce: +1. add a new pci-bridge to the Windows vm by adding this to the Windows vm xml configuration: +```xml +<devices> + <controller type='pci' index='0' model='pci-root'/> + <controller type='pci' index='1' model='pci-bridge'> + <address type='pci' domain='0' bus='0' slot='0x0d' function='0' multifunction='off'/> + </controller> +</devices> +``` + +2. restart this Windows vm; the new pci-bridge has been created, its name is pci.1 and its bus is bus.1: +```sh +$ virsh qemu-monitor-command --hmp --domain 56 --cmd info pci + Bus 0, device 13, function 0: + PCI bridge: PCI device 1b36:0001 + IRQ 10. + BUS 0. + secondary bus 1. + subordinate bus 1. 
+ IO range [0xc000, 0xcfff] + memory range [0xfe000000, 0xfe1fffff] + prefetchable memory range [0xe4000000, 0xe41fffff] + BAR0: 64 bit memory at 0xfe422000 [0xfe4220ff]. + id "pci.1" +``` +3. create a shm file `/dev/shm/test1` on the host using `shm_open()`; its size is 32M + +4. create a new object: +```sh +virsh qemu-monitor-command --hmp --domain 56 --cmd object_add memory-backend-file,share=on,id=objtest1,size=32M,mem-path=/dev/shm/test1 +``` + +5. insert this ivshmem file into the new pci-bridge, using a bus.1 slot number (1:1.0): +```sh +virsh qemu-monitor-command --hmp --domain 56 --cmd device_add ivshmem-plain,memdev=objtest1,id=test1,bus=pci.1,addr=0x01.0x00 +``` + +6. After inserting this ivshmem file into the new pci-bridge, the remote desktop connection of this Windows vm disconnects. + +7. The new ivshmem device has been created: +``` +$ virsh qemu-monitor-command --hmp --domain 57 --cmd info pci + Bus 1, device 1, function 0: + RAM controller: PCI device 1af4:1110 + BAR0: 32 bit memory at 0xfe1fff00 [0xfe1fffff]. + BAR2: 64 bit prefetchable memory at 0x4bc000000 [0x4bfffffff]. + id "test1" + +``` +Additional information: +When inserting the ivshmem file into bus.1 (pci-bridge), the remote desktop connection of the Windows vm is sometimes disconnected, and sometimes it stays up. + +The newly added ivshmem device can usually be found in the device manager of the Windows vm, but sometimes it cannot be found. + +Thanks for your help! diff --git a/results/classifier/zero-shot/105/semantic/969 b/results/classifier/zero-shot/105/semantic/969 new file mode 100644 index 000000000..418944175 --- /dev/null +++ b/results/classifier/zero-shot/105/semantic/969 @@ -0,0 +1,14 @@ +semantic: 0.594 +graphic: 0.212 +assembly: 0.185 +mistranslation: 0.152 +instruction: 0.143 +other: 0.051 +device: 0.044 +KVM: 0.031 +vnc: 0.029 +network: 0.021 +boot: 0.006 +socket: 0.003 + +qemu: Georgian translation |