Diffstat (limited to 'results/classifier/105/vnc')
96 files changed, 13118 insertions, 0 deletions
diff --git a/results/classifier/105/vnc/1004050 b/results/classifier/105/vnc/1004050 new file mode 100644 index 000000000..4d3073328 --- /dev/null +++ b/results/classifier/105/vnc/1004050 @@ -0,0 +1,54 @@ +vnc: 0.832 +boot: 0.810 +instruction: 0.773 +device: 0.722 +graphic: 0.691 +semantic: 0.527 +socket: 0.387 +mistranslation: 0.313 +network: 0.256 +other: 0.238 +assembly: 0.154 +KVM: 0.130 + +qemu-system-ppc64 by default has non-working keyboard + +Compile qemu from git and do: + + ./ppc64-softmmu/qemu-system-ppc64 + +(ie. no parameters). It boots to an OpenBIOS prompt. However the keyboard doesn't work. After ~10 keypresses, qemu just says: + +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full + +There is no indication inside the guest that OpenBIOS is seeing keyboard events. + +Also there's no indication of what type of keyboard devices are available, nor what we should use. + +I have also experienced the same issue with qemu-system-ppc64. It appears that ppc64 is not able to communicate with the USB controller. This issue is not seen with with qemu-system-ppc. + +tboyes@tboyes-dev:~/qemu$ qemu-system-ppc64 -serial stdio -m 1024 -net nic -net user debian-ppc.qcow2 -cdrom debian-6.0.5-powerpc-netinst.iso -boot d +VNC server running on `127.0.0.1:5901' +C>> annot manage 'OHCI USB controller' PCI device type 'usb': +>> 106b 3f (c 3 10) + +>> ============================================================= +>> OpenBIOS 1.0 [May 30 2012 16:55] +>> Configuration device id QEMU version 1 machine id 3 +>> CPUs: 1 +>> Memory: 1024M +>> UUID: 00000000-0000-0000-0000-000000000000 +>> CPU type PowerPC,970FX +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full +usb-kbd: warning: key event queue full + + + + +AFAIK an OHCI driver has been added to OpenBIOS in 2014, so marking this bug as fixed now. 
If you still have issues with OpenBIOS, please report them to the OpenBIOS project instead of the QEMU bug tracker, thanks! + diff --git a/results/classifier/105/vnc/1047576 b/results/classifier/105/vnc/1047576 new file mode 100644 index 000000000..83965d49b --- /dev/null +++ b/results/classifier/105/vnc/1047576 @@ -0,0 +1,82 @@ +vnc: 0.880 +KVM: 0.853 +other: 0.847 +graphic: 0.844 +mistranslation: 0.822 +device: 0.820 +semantic: 0.811 +instruction: 0.800 +boot: 0.782 +assembly: 0.756 +socket: 0.751 +network: 0.728 + +qemu unittest emulator failure on latest git master + +Running the emulator unittest, using the cmdline: + +16:01:30 INFO | Running emulator +16:01:30 INFO | Running qemu command (reformatted): +16:01:30 INFO | /home/lmr/Code/autotest.git/autotest/client/tests/virt/kvm/qemu +16:01:30 INFO | -S +16:01:30 INFO | -name 'unittest_vm' +16:01:30 INFO | -nodefaults +16:01:30 INFO | -chardev socket,id=hmp_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20120907-155940-WomlFZY3,server,nowait +16:01:30 INFO | -mon chardev=hmp_id_humanmonitor1,mode=readline +16:01:30 INFO | -chardev socket,id=serial_id_20120907-155940-WomlFZY3,path=/tmp/serial-20120907-155940-WomlFZY3,server,nowait +16:01:30 INFO | -device isa-serial,chardev=serial_id_20120907-155940-WomlFZY3 +16:01:30 INFO | -chardev socket,id=seabioslog_id_20120907-155940-WomlFZY3,path=/tmp/seabios-20120907-155940-WomlFZY3,server,nowait +16:01:30 INFO | -device isa-debugcon,chardev=seabioslog_id_20120907-155940-WomlFZY3,iobase=0x402 +16:01:30 INFO | -m 512 +16:01:30 INFO | -smp 2,cores=1,threads=1,sockets=2 +16:01:30 INFO | -kernel '/home/lmr/Code/autotest.git/autotest/client/tests/virt/kvm/unittests/emulator.flat' +16:01:30 INFO | -vnc :0 +16:01:30 INFO | -chardev file,id=testlog,path=/tmp/testlog-20120907-155940-WomlFZY3 +16:01:30 INFO | -device testdev,chardev=testlog +16:01:30 INFO | -rtc base=utc,clock=host,driftfix=none +16:01:30 INFO | -boot order=cdn,once=c,menu=off +16:01:30 INFO | -S +16:01:30 INFO 
| -enable-kvm + +We get + +16:01:32 INFO | Waiting for unittest emulator to complete, timeout 600, output in /tmp/testlog-20120907-155940-WomlFZY3 +16:01:32 INFO | [qemu output] KVM internal error. Suberror: 1 +16:01:32 INFO | [qemu output] emulation failure +16:01:32 INFO | [qemu output] RAX=ffffffffffffeff8 RBX=ffffffffffffe000 RCX=fffffffffffff000 RDX=000000000044d2b0 +16:01:32 INFO | [qemu output] RSI=000000000044c9fa RDI=000000000044e370 RBP=ffffffffffffeff8 RSP=000000000044d2b0 +16:01:32 INFO | [qemu output] R8 =000000000000000a R9 =00000000000003f8 R10=0000000000000000 R11=0000000000000000 +16:01:32 INFO | [qemu output] R12=ffffffffffffe000 R13=000000001fff6000 R14=000000001fff5000 R15=0000000000000000 +16:01:32 INFO | [qemu output] RIP=0000000000400a89 RFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 +16:01:32 INFO | [qemu output] ES =0010 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +16:01:32 INFO | [qemu output] CS =0008 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA] +16:01:32 INFO | [qemu output] SS =0000 0000000000000000 ffffffff 00000000 +16:01:32 INFO | [qemu output] DS =0010 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +16:01:32 INFO | [qemu output] FS =0010 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] +16:01:32 INFO | [qemu output] GS =0010 000000000044c370 ffffffff 00c09300 DPL=0 DS [-WA] +16:01:32 INFO | [qemu output] LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT +16:01:32 INFO | [qemu output] TR =0048 000000000040a452 0000ffff 00008b00 DPL=0 TSS64-busy +16:01:32 INFO | [qemu output] GDT= 000000000040a00a 00000447 +16:01:32 INFO | [qemu output] IDT= 0000000000000000 00000fff +16:01:32 INFO | [qemu output] CR0=80010011 CR2=0000000000000000 CR3=000000001ffff000 CR4=00000020 +16:01:32 INFO | [qemu output] DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 +16:01:32 INFO | [qemu output] DR6=00000000ffff0ff0 DR7=0000000000000400 +16:01:32 INFO | [qemu output] EFER=0000000000000500 
+16:01:32 INFO | [qemu output] Code=88 77 00 49 8d 84 24 f8 0f 00 00 48 89 e2 48 89 e9 48 89 c5 <c9> 48 87 e2 48 87 e9 48 81 f9 99 88 77 00 0f 94 c0 48 39 d5 40 0f 94 c6 40 0f b6 f6 21 c6 + +More logs will be attached to this bug report. + + + +Adding relevant qemu and unittest versions + +software_version_qemu_kvm=git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git:master:4c3e02beed9878a5f760eeceb6cd42c475cf0127 +software_version_kvm_unit_tests=git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git:master:09b657b6d3a80d0424b8b370462a77d284117926 + + +Triaging old bug tickets ... Can you still reproduce this problem with the latest version of QEMU? + +This doesn't reproduce with the latest version of QEMU, you may close it. + +Thanks for checking again! + diff --git a/results/classifier/105/vnc/11357571 b/results/classifier/105/vnc/11357571 new file mode 100644 index 000000000..3faedfcb4 --- /dev/null +++ b/results/classifier/105/vnc/11357571 @@ -0,0 +1,55 @@ +vnc: 0.950 +graphic: 0.915 +device: 0.763 +instruction: 0.758 +network: 0.705 +semantic: 0.694 +other: 0.687 +socket: 0.600 +boot: 0.571 +KVM: 0.516 +mistranslation: 0.516 +assembly: 0.389 + +[Qemu-devel] [BUG] VNC: client won't send FramebufferUpdateRequest if job in flight is aborted + +Hi Gerd, Daniel. + +We noticed that if VncSharePolicy was configured with +VNC_SHARE_POLICY_FORCE_SHARED mode and +multiple vnc clients opened vnc connections, some clients could go blank screen +at high probability. +This problem can be reproduced when we regularly reboot suse12sp3 in graphic +mode both +with RealVNC and noVNC client. + +Then we dig into it and find out that some clients go blank screen because they +don't +send FramebufferUpdateRequest any more. One step further, we notice that each +time +the job in flight is aborted one client go blank screen. + +The bug is triggered in the following procedure. 
+Guest reboot => graphic mode switch => graphic_hw_update => vga_update_display +=> vga_draw_graphic (full_update = 1) => dpy_gfx_replace_surface => +vnc_dpy_switch => +vnc_abort_display_jobs (client may have job in flight) => job removed from the +queue +If one client has vnc job in flight, *vnc_abort_display_jobs* will wait until +its job is abandoned. +This behavior is done in vnc_worker_thread_loop when 'if (job->vs->ioc == NULL +|| job->vs->abort == true)' +branch is taken. + +As we can see, *vnc_abort_display_jobs* is intended to do some optimization to +avoid unnecessary client update. +But if client sends FramebufferUpdateRequest for some graphic area and its +FramebufferUpdate response job +is abandoned, the client may wait for the response and never send new +FramebufferUpdateRequest, which may +case the client go blank screen forever. + +So I am wondering whether we should drop the *vnc_abort_display_jobs* +optimization or do some trick here +to push the client to send new FramebufferUpdateRequest. Do you have any idea ? + diff --git a/results/classifier/105/vnc/1136477 b/results/classifier/105/vnc/1136477 new file mode 100644 index 000000000..9673ee37b --- /dev/null +++ b/results/classifier/105/vnc/1136477 @@ -0,0 +1,30 @@ +vnc: 0.885 +graphic: 0.726 +device: 0.713 +semantic: 0.634 +other: 0.553 +network: 0.506 +instruction: 0.497 +socket: 0.490 +mistranslation: 0.489 +boot: 0.211 +assembly: 0.197 +KVM: 0.159 + +qemu doesn't sanitize command line options carrying plaintext passwords + +A slight security problem exists with qemu's lack of sanitization of argv[], for cases where the user may have specified a plaintext password for spice/vnc authorization. (Yes, it's not great to use this facility, but it's convenient and not grotesquely unsafe, were it not for this bug.) It would be nice if those plaintext passwords were nuked from the command line, so a subsequent "ps awux" didn't show them for all to see. 
+ +See also https://bugzilla.redhat.com/show_bug.cgi?id=916279 + +Hi, + +Thanks for submitting this report. I've removed the security label from the bug after reading through the comments and the referenced bug. Modifying argv is not terribly portable and I think a reasonable person would expect that a password specified on the command line would be visible through a ps. + +Patches would certainly be considered but I don't consider this a security issue. Just a request for an enhancement. + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1150 b/results/classifier/105/vnc/1150 new file mode 100644 index 000000000..3ecfed1aa --- /dev/null +++ b/results/classifier/105/vnc/1150 @@ -0,0 +1,97 @@ +vnc: 0.790 +mistranslation: 0.789 +KVM: 0.782 +semantic: 0.774 +graphic: 0.725 +other: 0.714 +instruction: 0.618 +device: 0.566 +boot: 0.554 +network: 0.516 +assembly: 0.510 +socket: 0.467 + +guest Linux Kernel hangs and reports CPU lockup/stuck (Qemu >= 6.0.1 regression) +Description of problem: +Since at least [qemu-6.0.1](https://download.qemu.org/qemu-6.0.1.tar.xz) my VM guest is having CPU problems. It looks like [qemu-6.0.0](https://download.qemu.org/qemu-6.0.0.tar.xz) is fine, but I can't confirm this 100 %. + +Problem: The guest hangs for about 30 seconds and dmesg reports errors. 
+ +<details> +<summary>dmesg</summary> + +``` +[ 310.791732] watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [swapper/1:0] +[ 310.791753] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter af_packet iscsi_ibft iscsi_boot_sysfs rfkill dm_crypt essiv authenc pktcdvd intel_rapl_msr intel_rapl_common kvm_intel kvm cirrus drm_kms_helper irqbypass cec pcspkr joydev rc_core syscopyarea sysfillrect sysimgblt virtio_balloon fb_sys_fops i2c_piix4 button nls_iso8859_1 nls_cp437 vfat fat drm fuse configfs ip_tables x_tables ext4 crc16 mbcache jbd2 hid_generic usbhid sd_mod t10_pi virtio_scsi virtio_net net_failover virtio_blk failover sr_mod cdrom ata_generic crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd xhci_pci xhci_pci_renesas xhci_hcd cryptd serio_raw ehci_pci uhci_hcd ehci_hcd usbcore ata_piix ahci libahci virtio_pci virtio_pci_modern_dev libata floppy qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod sg scsi_mod +[ 310.792102] Supported: Yes +[ 310.792108] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.14.21-150400.22-default #1 SLE15-SP4 0b6a6578ade2de5c4a0b916095dff44f76ef1704 +[ 310.792121] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 +[ 310.792127] RIP: 0010:__do_softirq+0x6e/0x2bc +[ 310.792146] Code: 8b 70 2c 81 60 2c ff f7 ff ff 89 74 24 14 c7 44 24 10 0a 00 00 00 48 c7 c0 c0 30 03 00 65 66 c7 00 00 00 fb 66 0f 1f 44 00 00 <bb> ff ff ff ff 41 0f bc de 83 c3 01 89 1c 24 0f 84 92 00 00 00 49 +[ 310.792154] RSP: 0018:ffffb9a8c00d0f98 EFLAGS: 00000206 +[ 310.792163] RAX: 00000000000330c0 RBX: ffffb9a8c0093e18 RCX: 0000000034b47837 +[ 310.792169] RDX: ffff9835c02dd100 RSI: 0000000004200042 RDI: 0000000000000040 +[ 310.792175] RBP: 0000000000000022 R08: ffffb9a8c0093e18 R09: 0000000000000001 +[ 310.792180] R10: 0000000000000002 R11: 0000000000000283 R12: 0000000000000001 +[ 
310.792185] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000 +[ 310.792191] FS: 0000000000000000(0000) GS:ffff9836f7d00000(0000) knlGS:0000000000000000 +[ 310.792197] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +[ 310.792203] CR2: 000055ed8cffbaf8 CR3: 00000001025c0001 CR4: 0000000000170ee0 +[ 310.792216] Call Trace: +[ 310.792247] <IRQ> +[ 310.792284] irq_exit_rcu+0x9c/0xc0 +[ 310.792305] common_interrupt+0x5d/0xa0 +[ 310.792331] </IRQ> +[ 310.792335] <TASK> +[ 310.792339] asm_common_interrupt+0x1e/0x40 +[ 310.792358] RIP: 0010:native_safe_halt+0xb/0x10 +[ 310.792368] Code: f0 80 48 02 20 48 8b 00 a8 08 74 82 eb c1 cc eb 07 0f 00 2d 89 f3 5f 00 f4 c3 0f 1f 44 00 00 eb 07 0f 00 2d 79 f3 5f 00 fb f4 <c3> cc cc cc cc 0f 1f 44 00 00 65 8b 15 14 ee 60 69 0f 1f 44 00 00 +[ 310.792375] RSP: 0018:ffffb9a8c0093ec8 EFLAGS: 00000212 +[ 310.792382] RAX: ffffffff96a0ca50 RBX: 0000000000000001 RCX: ffff9835c49c3700 +[ 310.792387] RDX: 00000000001df31e RSI: 0000000000000000 RDI: ffff9835c02a8000 +[ 310.792392] RBP: ffffffff97d47120 R08: 00000000001df31e R09: 0000000000029800 +[ 310.792397] R10: ffffb9a8c164bbe0 R11: 0000000000000198 R12: 0000000000000000 +[ 310.792402] R13: 0000000000000000 R14: ffffffffffffffff R15: ffff9835c02a8000 +[ 310.792409] ? __sched_text_end+0x5/0x5 +[ 310.792425] default_idle+0xa/0x10 +[ 310.792434] default_idle_call+0x2d/0xe0 +[ 310.792441] do_idle+0x1ec/0x2d0 +[ 310.792452] cpu_startup_entry+0x19/0x20 +[ 310.792460] start_secondary+0x11c/0x160 +[ 310.792475] secondary_startup_64_no_verify+0xc2/0xcb +[ 310.792501] </TASK> +``` + +``` +[ 435.511342] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 30s! 
+[ 435.511374] Showing busy workqueues and worker pools: +[ 435.511377] workqueue events: flags=0x0 +[ 435.511380] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 +[ 435.511385] pending: vmstat_shepherd +[ 435.511395] workqueue events_power_efficient: flags=0x80 +[ 435.511398] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 refcnt=3 +[ 435.511402] pending: neigh_periodic_work, neigh_periodic_work +[ 435.511411] workqueue events_freezable_power_: flags=0x84 +[ 435.511414] pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 +[ 435.511417] in-flight: 4783:disk_events_workfn +[ 435.511425] workqueue mm_percpu_wq: flags=0x8 +[ 435.511428] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 +[ 435.511431] pending: vmstat_update +[ 435.511440] workqueue writeback: flags=0x4a +[ 435.511443] pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256 refcnt=3 +[ 435.511447] pending: wb_workfn +[ 435.511453] workqueue kblockd: flags=0x18 +[ 435.511455] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=3/256 refcnt=4 +[ 435.511459] pending: blk_mq_timeout_work, blk_mq_timeout_work, blk_mq_timeout_work +[ 435.511475] workqueue ata_sff: flags=0x8 +[ 435.511479] pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/512 refcnt=2 +[ 435.511482] pending: ata_sff_pio_task [libata] +[ 435.511538] pool 2: cpus=1 node=0 flags=0x0 nice=0 hung=30s workers=3 idle: 349 51 +``` + +</details> + +It looks like the problem mostly appears if SSH is being used over a "user" network connection. A typical situation is when editing a file in Vim (compiled with X support) via SSH and using the X clipboard (`"+y"`). But the problem also happens in other situations with SSH, e. g. when using SSHFS. +The type of NIC doesn't seem to make a difference (tested `virtio` and `e1000`). But "tap" network connections don't show a problem. 
+ + diff --git a/results/classifier/105/vnc/1162644 b/results/classifier/105/vnc/1162644 new file mode 100644 index 000000000..b5c7eda84 --- /dev/null +++ b/results/classifier/105/vnc/1162644 @@ -0,0 +1,144 @@ +vnc: 0.888 +KVM: 0.850 +instruction: 0.843 +other: 0.842 +device: 0.840 +boot: 0.823 +assembly: 0.805 +graphic: 0.792 +socket: 0.787 +semantic: 0.782 +network: 0.777 +mistranslation: 0.730 + +qemu-system-x86_64 crashed with SIGABRT in __assert_fail_base() + +Description: +When QEMU tries to boot with a usb 3.0 tablet (xhci) on a Raring ringtail box (QEMU package1.4.0+dfsg-1expubuntu4) it will crash soon afterwards: + +qemu-system-x86_64: /build/buildd/qemu-1.4.0+dfsg/hw/usb/core.c:552: usb_packet_setup: Assertion `p->iov.iov != ((void *)0)' failed. + +Component: +qemu-system -> 1.4.0+dfsg-1expubuntu4 + +Ubuntu Version: + +Description: Ubuntu Raring Ringtail (development branch) +Release: 13.04 + +Steps to reproduce it: + +I met this bug while running the virt-test suite + +https://github.com/autotest/virt-test + +Instructions to install and run it can be seen on the README file + +https://github.com/autotest/virt-test#readme + +After the suite is set, it can be reproduced on a raring (13.04) simply by running: + +./run -t qemu --tests usb.usb_reboot.usb_tablet.xhci + +Command line: + +23:52:42 INFO | Running qemu command (reformatted): +/usr/bin/kvm \ + -S \ + -name 'virt-tests-vm1' \ + -nodefaults \ + -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20130331-233911-ndvUEvrV,server,nowait \ + -mon chardev=hmp_id_hmp1,mode=readline \ + -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20130331-233911-ndvUEvrV,server,nowait \ + -device isa-serial,chardev=serial_id_serial1 \ + -chardev socket,id=seabioslog_id_20130331-233911-ndvUEvrV,path=/tmp/seabios-20130331-233911-ndvUEvrV,server,nowait \ + -device isa-debugcon,chardev=seabioslog_id_20130331-233911-ndvUEvrV,iobase=0x402 \ + -device ich9-usb-uhci1,id=usb1 \ + -device nec-usb-xhci,id=usbtest \ 
+ -drive file='/home/lmr/Code/virt-test.git/shared/data/images/jeos-17-64.qcow2',if=none,id=virtio0 \ + -device virtio-blk-pci,drive=virtio0,bootindex=1 \ + -device virtio-net-pci,netdev=idumV1TE,mac='9a:c0:c1:c2:c3:c4',id='idmN7iHv' \ + -netdev user,id=idumV1TE,hostfwd=tcp::5000-:22 \ + -m 1024 \ + -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \ + -cpu 'SandyBridge' \ + -M pc \ + -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ + -device usb-tablet,id=usb-testdev,bus=usbtest.0,port=1 \ + -vnc :0 \ + -vga std \ + -rtc base=utc,clock=host,driftfix=none \ + -boot order=cdn,once=c,menu=off \ + -enable-kvm + +ProblemType: Crash +DistroRelease: Ubuntu 13.04 +Package: qemu-system-x86 1.4.0+dfsg-1expubuntu4 +ProcVersionSignature: Ubuntu 3.8.0-15.25-generic 3.8.4 +Uname: Linux 3.8.0-15-generic x86_64 +ApportVersion: 2.9.2-0ubuntu5 +Architecture: amd64 +Date: Sun Mar 31 23:52:46 2013 +EcryptfsInUse: Yes +ExecutablePath: /usr/bin/qemu-system-x86_64 +InstallationDate: Installed on 2013-03-31 (0 days ago) +InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017.5) +MarkForUpload: True +ProcEnviron: + TERM=dumb + PATH=(custom, no user) + XDG_RUNTIME_DIR=<set> + LANG=en_US.UTF-8 + SHELL=/bin/bash +Signal: 6 +SourcePackage: qemu +StacktraceTop: + raise () from /lib/x86_64-linux-gnu/libc.so.6 + abort () from /lib/x86_64-linux-gnu/libc.so.6 + ?? () from /lib/x86_64-linux-gnu/libc.so.6 + __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 + ?? () +Title: qemu-system-x86_64 crashed with SIGABRT in raise() +UpgradeStatus: Upgraded to raring on 2013-03-31 (0 days ago) +UserGroups: adm cdrom dip libvirtd lpadmin plugdev sambashare sudo + + + + + + + + + + + + + +Thanks for reporting this bug, I will try to reproduce it, and check whether upstream git head has the same bug. + +I can't reproduce this on a clean raring system, which has the same qemu version as your quantal system. + +Is it possible for you to test on a clean raring system? 
+ +What is your libvirt package version? + +It doesn't get any cleaner than this. I've installed the box with 12.10, immediately followed by upgrade to 13.04. What seems to be going on is that the issue is not 100% reproducible (I tried today with the same setup and could not reproduce it). + +Moreover, what really matters here is the qemu/kernel version, and nothing else. + +Libvirt version is 1.0.2-0ubuntu10. I did compile the latest git master and so far I could not reproduce it either. + +I could just reproduce it on Fedora 19 qemu-kvm version (which is 1.4.0) and qemu.git master. So the issue is not 100% reproducible, but it can be seen on qemu.git master and therefore, downstream packages such as the ones found on Ubuntu and Fedora, for example. + +Ok, thanks - i did run it 3 or 4 times. How often would you say it fails for you? + +I will mark this as affecting the upstream qemu project based on comment #10. + +On my F19 box, it's failing about 2/3 of the attempts. What is funny is that on the Ubuntu 13.04 box, I can't get the problem reproduced anymore, for some reason beyond me. + +Status changed to 'Confirmed' because the bug affects multiple users. + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +Sure, please close it. + diff --git a/results/classifier/105/vnc/1183 b/results/classifier/105/vnc/1183 new file mode 100644 index 000000000..9b35c8151 --- /dev/null +++ b/results/classifier/105/vnc/1183 @@ -0,0 +1,144 @@ +vnc: 0.737 +mistranslation: 0.737 +KVM: 0.737 +other: 0.734 +boot: 0.685 +graphic: 0.676 +instruction: 0.641 +socket: 0.641 +semantic: 0.640 +device: 0.629 +network: 0.628 +assembly: 0.622 + +KVM crash due to qcow2 out of space condition during virsh-snapshot creation +Description of problem: +virsh snapshot failed due to out of space condition (into the qcow2 image ?) 
+ +libvirt log: + +``` +2022-08-27T06:41:41.164368Z qemu-kvm-one: terminating on signal 15 from pid 1782 (/usr/sbin/libvirtd) +2022-08-27T06:41:41.172667Z qemu-kvm-one: Failed to flush the L2 table cache: Input/output error +2022-08-27T06:41:41.172692Z qemu-kvm-one: Failed to flush the refcount block cache: Input/output error +``` +Steps to reproduce: +1. not possible for that moment - i did resize/increase the qcow2 image - +now its running again. +Additional information: +as i saw - there was a very old qemu-snapshot, which was not properly deleted. +After removing this snapshot, i did reszie the image. +I do suppose, this could be one reason the image (qcow2) got full ? + +Because all is THIN i was not aware of it (fs level ok, storage layer ok). +Is there any tool, how free space in a thin qcow2 file can be monitored ? + + + +``` +PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ +HOME=/var/lib/libvirt/qemu/domain-13-one-89 \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-13-one-89/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-13-one-89/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-13-one-89/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-kvm-one \ +-name guest=one-89,debug-threads=on \ +-S \ +-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-one-89/master-key.aes \ +-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off \ +-cpu qemu64 \ +-m 8192 \ +-overcommit mem-lock=off \ +-smp 4,sockets=4,cores=1,threads=1 \ +-uuid 8c920c7f-f687-4c47-bfc7-671425c7436b \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=40,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ +-device virtio-scsi-pci,id=scsi0,num_queues=1,bus=pci.0,addr=0x4 \ +-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \ +-blockdev 
'{"driver":"file","filename":"/var/lib/one//xxxx/disk.0","aio":"threads","node-name":"libvirt-3-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-3-format","read-only":false,"discard":"unmap","cache":{"direct":false,"no-flush":false},"driver":"qcow2","file":"libvirt-3-storage","backing":null}' \ +-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,device_id=drive-scsi0-0-0-0,drive=libvirt-3-format,id=scsi0-0-0-0,bootindex=1,write-cache=off \ +-blockdev '{"driver":"file","filename":"/var/lib/one//xxxx/disk.1","aio":"threads","node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-2-format","read-only":false,"discard":"unmap","cache":{"direct":false,"no-flush":false},"driver":"qcow2","file":"libvirt-2-storage","backing":null}' \ +-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,device_id=drive-scsi0-0-1-0,drive=libvirt-2-format,id=scsi0-0-1-0,write-cache=off \ +-blockdev '{"driver":"file","filename":"/var/lib/one//xxxx/disk.2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"}' \ +-device ide-cd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0 \ +-netdev tap,fd=42,id=hostnet0 \ +-device e1000,netdev=hostnet0,id=net0,mac=02:00:c0:a8:02:17,bus=pci.0,addr=0x3 \ +-chardev socket,id=charchannel0,fd=43,server,nowait \ +-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \ +-vnc 0.0.0.0:89 \ +-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on +``` + +as the time of the crash the qcow2 status was: +(so i'm not sure the issue is about a space problem 
or a bug in qemu): + +``` +qemu-img info xxx/0/xxx +image: xxx/0/xxx +file format: qcow2 +virtual size: 1.46 TiB (1610612736000 bytes) +disk size: 988 GiB +cluster_size: 65536 +Snapshot list: +ID TAG VM SIZE DATE VM CLOCK ICOUNT +112 snap-111 0 B 2022-03-11 01:59:15 49:07:53.846 +282 snap-281 0 B 2022-08-20 01:59:17538:16:30.416 +283 snap-282 0 B 2022-08-21 01:59:16562:10:40.759 +284 snap-283 0 B 2022-08-22 01:59:16585:59:16.170 +285 snap-284 0 B 2022-08-23 01:59:16609:51:44.825 +286 snap-285 0 B 2022-08-24 01:59:16633:45:32.243 +287 snap-286 0 B 2022-08-25 01:59:16657:36:44.718 +288 snap-287 0 B 2022-08-26 01:59:16681:29:00.793 +Format specific information: + compat: 1.1 + compression type: zlib + lazy refcounts: false + refcount bits: 16 + corrupt: false + extended l2: false +root@proxpve1:~# qemu-img check xxxx/0/xxx +No errors were found on the image. +15252433/24576000 = 62.06% allocated, 6.32% fragmented, 0.00% compressed clusters +Image end offset: 1062936117248 + +1rst (OS) Disk on the VM: +------------------------------------------ +file format: qcow2 +virtual size: 100 GiB (107374182400 bytes) +disk size: 190 GiB +cluster_size: 65536 +Snapshot list: +ID TAG VM SIZE DATE VM CLOCK ICOUNT +282 snap-281 7.66 GiB 2022-08-20 01:59:17538:16:30.416 +283 snap-282 7.6 GiB 2022-08-21 01:59:16562:10:40.759 +284 snap-283 7.62 GiB 2022-08-22 01:59:16585:59:16.170 +285 snap-284 7.65 GiB 2022-08-23 01:59:16609:51:44.825 +286 snap-285 7.62 GiB 2022-08-24 01:59:16633:45:32.243 +287 snap-286 7.63 GiB 2022-08-25 01:59:16657:36:44.718 +288 snap-287 7.65 GiB 2022-08-26 01:59:16681:29:00.793 +Format specific information: + compat: 1.1 + compression type: zlib + lazy refcounts: false + refcount bits: 16 + corrupt: false + extended l2: false + + +No errors were found on the image. 
+782257/1638400 = 47.75% allocated, 22.16% fragmented, 0.00% compressed clusters +Image end offset: 315680292864 +``` diff --git a/results/classifier/105/vnc/1207686 b/results/classifier/105/vnc/1207686 new file mode 100644 index 000000000..64f75b022 --- /dev/null +++ b/results/classifier/105/vnc/1207686 @@ -0,0 +1,256 @@ +vnc: 0.959 +network: 0.949 +semantic: 0.932 +socket: 0.930 +device: 0.928 +other: 0.919 +KVM: 0.917 +boot: 0.916 +instruction: 0.906 +mistranslation: 0.892 +assembly: 0.889 +graphic: 0.870 + +qemu-1.4.0 and onwards, linux kernel 3.2.x, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process + +Hi, + +after some testing I tried to narrow down a problem, which was initially reported by some users. +Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as reported by now. + +All using some flavour of linux-3.2.x kernel. + +Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which solves the problem. +Problem could be triggert with some workload ala: + +spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat +and in parallel do some apt-get install/remove/whatever. + +That results in a somewhat stuck qemu-session with the bad "kernel_hung_task..." messages. 
+ +A typical command-line is as follows: + +/usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet -enable-kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run/qemu-server/760.vnc,password -qmp unix:/var/run/qemu-server/760.qmp,server,nowait -nodefaults -serial none -parallel none -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 -device virtio-blk-pci,drive=virtio0 -drive format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native -drive format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc + +no "system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring-session is accepted, need to hard-kill the process. + +Please give any advice on what to do for tracing/debugging, because the number of tickets here are raising, and noone knows, what users are doing inside their VM. + +Kind regards, + +Oliver Francke. + +On Fri, Aug 02, 2013 at 09:58:29AM -0000, Oliver Francke wrote: +> after some testing I tried to narrow down a problem, which was initially reported by some users. +> Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as reported by now. +> +> All using some flavour of linux-3.2.x kernel. +> +> Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which solves the problem. + +Is that a guest kernel upgrade? + +> Problem could be triggert with some workload ala: +> +> spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat +> and in parallel do some apt-get install/remove/whatever. 
+> +> That results in a somewhat stuck qemu-session with the bad +> "kernel_hung_task..." messages. +> +> A typical command-line is as follows: +> +> /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet -enable- +> kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor +> unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run/qemu- +> server/760.vnc,password -qmp unix:/var/run/qemu- +> server/760.qmp,server,nowait -nodefaults -serial none -parallel none +> -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev +> type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh +> -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 +> -device virtio-blk-pci,drive=virtio0 -drive +> format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native +> -drive +> format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native +> -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive +> if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc +> +> no "system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring- +> session is accepted, need to hard-kill the process. + +Yesterday I saw a possibly related report on IRC. It was a Windows +guest running under OpenStack with images on Ceph. + +They reported that the QEMU process would lock up - ping would not work +and their management tools showed 0 CPU activity for the guest. +However, they were able to "kick" the guest by taking a VNC screenshot +(I think). Then it would come back to life. + +If you have a Linux guest that is reporting kernel_hung_task, then it +could be a similar scenario. + +Please confirm that the hung task message is from inside the guest. + +If you are able to reproduce this and have an alternative non-Ceph +storage pool, please try that since Ceph is common to both these bug +reports. 
+ +Stefan + + +Hi Stefan, + +Am 02.08.2013 um 17:24 schrieb Stefan Hajnoczi <email address hidden>: + +> On Fri, Aug 02, 2013 at 09:58:29AM -0000, Oliver Francke wrote: +>> after some testing I tried to narrow down a problem, which was initially reported by some users. +>> Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as reported by now. +>> +>> All using some flavour of linux-3.2.x kernel. +>> +>> Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which solves the problem. +> +> Is that a guest kernel upgrade? + +yeah, sorry if that was not clear enough. + +> +>> Problem could be triggert with some workload ala: +>> +>> spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat +>> and in parallel do some apt-get install/remove/whatever. +>> +>> That results in a somewhat stuck qemu-session with the bad +>> "kernel_hung_task..." messages. +>> +>> A typical command-line is as follows: +>> +>> /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet -enable- +>> kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor +>> unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run/qemu- +>> server/760.vnc,password -qmp unix:/var/run/qemu- +>> server/760.qmp,server,nowait -nodefaults -serial none -parallel none +>> -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev +>> type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh +>> -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 +>> -device virtio-blk-pci,drive=virtio0 -drive +>> format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native +>> -drive +>> format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native +>> -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive +>> if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc +>> +>> no 
"system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring- +>> session is accepted, need to hard-kill the process. +> +> Yesterday I saw a possibly related report on IRC. It was a Windows +> guest running under OpenStack with images on Ceph. +> +> They reported that the QEMU process would lock up - ping would not work +> and their management tools showed 0 CPU activity for the guest. +> However, they were able to "kick" the guest by taking a VNC screenshot +> (I think). Then it would come back to life. +> +> If you have a Linux guest that is reporting kernel_hung_task, then it +> could be a similar scenario. +> +> Please confirm that the hung task message is from inside the guest. +> + +confirmed. + +> If you are able to reproduce this and have an alternative non-Ceph +> storage pool, please try that since Ceph is common to both these bug +> reports. +> + +I can reproduce it with: kernel 3.2.something + qemu-1.[456] ( never spent much time on 1.3) and high I/O. +I took this VM later this day and converted it to local-storage-qcow2, no prob with any kernel. I already asked on ceph-users-list for assistance, especially from Josh ( if he's not on summer holiday ;) ) + +What is strange, I have a session via VNC-console opened and have a loop ala: + +while true; do apt-get install -y ntp libopts25; apt-get remove -y ntp-libopts25; done +and and parallel spew as described, the apt-"session" dies and one can see the hung_task-thingy, but I still can restart the spew-test. +Just for completeness. + +Thnx for you attention, + +Oliver. + +> Stefan +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. 
+> https://bugs.launchpad.net/bugs/1207686 +> +> Title: +> qemu-1.4.0 and onwards, linux kernel 3.2.x, heavy I/O leads to +> kernel_hung_tasks_timout_secs message and unresponsive qemu-process +> +> Status in QEMU: +> New +> +> Bug description: +> Hi, +> +> after some testing I tried to narrow down a problem, which was initially reported by some users. +> Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as reported by now. +> +> All using some flavour of linux-3.2.x kernel. +> +> Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which solves the problem. +> Problem could be triggert with some workload ala: +> +> spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat +> and in parallel do some apt-get install/remove/whatever. +> +> That results in a somewhat stuck qemu-session with the bad +> "kernel_hung_task..." messages. +> +> A typical command-line is as follows: +> +> /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet +> -enable-kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor +> unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run +> /qemu-server/760.vnc,password -qmp unix:/var/run/qemu- +> server/760.qmp,server,nowait -nodefaults -serial none -parallel none +> -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev +> type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh +> -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 +> -device virtio-blk-pci,drive=virtio0 -drive +> format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native +> -drive +> format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native +> -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive +> if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc +> +> no "system_reset", "sendkey ctrl-alt-delete" 
or "q" in monitoring- +> session is accepted, need to hard-kill the process. +> +> Please give any advice on what to do for tracing/debugging, because +> the number of tickets here are raising, and noone knows, what users +> are doing inside their VM. +> +> Kind regards, +> +> Oliver Francke. +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1207686/+subscriptions + + + +Hi, + +opened a ticket with the ceph-guys, and it turned out to be a bug in "librados aio flush". + +With latest "wip-librados-aio-flush (bobtail)" I got no error even with _very_ high load. + +Thnx for the attention ;) + +Oliver. + + +Closing as "Invalid" since this was not a QEMU bug according to comment #3. + diff --git a/results/classifier/105/vnc/1246990 b/results/classifier/105/vnc/1246990 new file mode 100644 index 000000000..2f1ae9512 --- /dev/null +++ b/results/classifier/105/vnc/1246990 @@ -0,0 +1,66 @@ +mistranslation: 0.945 +vnc: 0.936 +semantic: 0.931 +instruction: 0.918 +graphic: 0.917 +other: 0.913 +network: 0.909 +device: 0.904 +assembly: 0.904 +socket: 0.902 +boot: 0.872 +KVM: 0.854 + +[qemu-x86-64-linux-user 1.6.1] qemu: uncaught target signal 11 (Segmentation fault) - core dumped + +Rjsupplicant is an authentication client of Campus Network in most universities in China. Its Linux version has only x86 and amd64 version. + +On linux: + +./qemu-x86_64 is compiled from latest qemu 1.6.1, with ./configure options: --enable-debug --target-list=x86_64-linux-user . 
Compiler is gcc version 4.7.3 (Debian 4.7.3-4) + +$ sudo ./qemu-x86_64 ./rjsupplicant -n eth0 -u USER -p PASS -d 1 -s internet +qemu: uncaught target signal 11 (Segmentation fault) - core dumped + +$ sudo gdb ./qemu-x86_64 +(gdb) r ./rjsupplicant -n eth0 -u USER -p PASS -d 1 -s internet +(gdb) where +#0 0x00005555559c21bd in static_code_gen_buffer () +#1 0x00005555555b74d5 in cpu_tb_exec (cpu=0x555557972580, tb_ptr=0x5555559c2190 <static_code_gen_buffer+819792> "A\213n\250\205\355\017\205\257") + at /home/USER/x/rjsupplicant/x64/qemu-1.6.1/cpu-exec.c:56 +#2 0x00005555555b817d in cpu_x86_exec (env=0x5555579726b0) at /home/USER/x/rjsupplicant/x64/qemu-1.6.1/cpu-exec.c:631 +#3 0x00005555555d997a in cpu_loop (env=0x5555579726b0) at /home/USER/x/rjsupplicant/x64/qemu-1.6.1/linux-user/main.c:283 +#4 0x00005555555eca6b in clone_func (arg=0x7fffffffc1d0) at /home/USER/x/rjsupplicant/x64/qemu-1.6.1/linux-user/syscall.c:4266 +#5 0x00007ffff71bfe0e in start_thread (arg=0x7ffff7f04700) at pthread_create.c:311 +#6 0x00007ffff6ef493d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 + +$ file rjsupplicant +rjsupplicant: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped + +$ uname -r +3.10-2-amd64 + + +And it can be run on Linux amd64 successfully. + +Though I don't have the source code of rjsupplicant, so I don't have further information. + +`qemu-x86_64 -strace ./rjsupplicant -n eth0 -u USER -p PASS -d 1 -s internet` is attached as strace_qemu.log + + +The binary is available to download at http://ge.tt/6pgG1tw/v/0 + + + +and, `strace ./rjsuuplicant -n eth0 -u USER -p PASS -d 1 -s internet` is attached as strace_native.log + +I'm not sure x86*-linux-user targets are being tested at all. Last time I checked, x86-64 variant crashed left and right to the point of being completely unusable... + +The backtrace indicates that this is a multithreaded application. 
These won't work reliably under qemu-user : they tend to crash, as you have found. + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +[Expired for qemu (Ubuntu) because there has been no activity for 60 days.] + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1321028 b/results/classifier/105/vnc/1321028 new file mode 100644 index 000000000..6cb2936ac --- /dev/null +++ b/results/classifier/105/vnc/1321028 @@ -0,0 +1,98 @@ +vnc: 0.899 +device: 0.894 +boot: 0.889 +graphic: 0.887 +semantic: 0.874 +other: 0.873 +socket: 0.868 +network: 0.866 +assembly: 0.852 +KVM: 0.841 +mistranslation: 0.827 +instruction: 0.811 + +qemu-system-ppc : file systems are not shutting down clean + + + + +Launching a VM that has been shutdown gracefully ( via init 0) typically requires fsck to run when it is started ; +This indications data integrity issues; + + + The symptom can be seen by observing the fsck running when the VM restarted. 
+ +Install 14-04-LTS to a VM and the issue can be seen ; + + + +(trusty)vnc@jade-rev4:/home2/qemu$ cat vm1/go.sh +mymac="52:54:00:12:34:10" +T=` echo $mymac | cut -d: -f5,6 | sed s/\://` +mytap="tap$T" + + +tunctl -d $mytap +tunctl -t $mytap + + /usr/bin/qemu-system-ppc -M ppce500 -nographic -kernel jade-kernel.bin \ + -initrd jade-initrd-2.0.bin -m 1G -enable-kvm \ + -drive file=jade-ubuntu-14.04.raw,if=virtio \ + -net nic,vlan=0,macaddr=$mymac \ + -net tap,vlan=0,ifname=$mytap,script=/usr/local/bin/qemu-ifup \ + -append "console=ttyS0 ssgyboot=break" \ + -no-shutdown -no-reboot -name `basename $fp` + +ProblemType: Bug +DistroRelease: Ubuntu 14.04 +Package: qemu-system-ppc 2.0.0~rc1+dfsg-0ubuntu3.1 +ProcVersionSignature: Ubuntu 3.13.0-24.46-powerpc-e500mc 3.13.9 +Uname: Linux 3.13.0-24-powerpc-e500mc ppc +ApportVersion: 2.14.1-0ubuntu3 +Architecture: powerpc +Date: Mon May 19 17:01:14 2014 +ProcEnviron: + TERM=xterm + PATH=(custom, no user) + LANG=en_US.UTF-8 + SHELL=/bin/bash +SourcePackage: qemu +UpgradeStatus: Upgraded to trusty on 2014-04-29 (20 days ago) + + + +Can you explain what "ssgyboot=break" tells the kernel to do? + +Could you upload the guest's /var/log/syslog after reboot? + +Please show + +echo $? + +immediately after the qemu-system-ppc command has exited? + +1. ssgyboot-break has no effect in the VM kernel; It is only used by jade-initrd-2.0.bin ; + + kernel cmdline: + +[ 0.000000] pcpu-alloc: s28800 r8192 d16256 u53248 alloc=13*4096 +[ 0.000000] pcpu-alloc: [0] 0 +[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pag +es: 260608 +[ 0.000000] Kernel command line: (host)/boot/vmlinux-3.13.0-24-powerpc-e500mc + root=/dev/vda1 ro quiet splash vt.handoff=7 + +2: see attached file . + +3. VM's do not exit from qemu-system-ppc ; see defect : 1317603 . 
+ + + +/var/log/syslog attached as syslog.onboot when file system is dirty ; + +Since the VM terminated ; the device should have been umounted ; + +I am thinking since the VM hasn't terminated due to defect 1317603 , qemu-system-ppc has to be terminated with kill <pid> which is not kill -9 ; I could see some inconsistencies with virtio device file not being sync'ed ; but the VM did unmount it; + + + diff --git a/results/classifier/105/vnc/1339 b/results/classifier/105/vnc/1339 new file mode 100644 index 000000000..ea1a27fd9 --- /dev/null +++ b/results/classifier/105/vnc/1339 @@ -0,0 +1,29 @@ +vnc: 0.948 +device: 0.931 +instruction: 0.897 +socket: 0.857 +graphic: 0.827 +network: 0.820 +mistranslation: 0.532 +boot: 0.487 +semantic: 0.476 +KVM: 0.423 +assembly: 0.275 +other: 0.273 + +RVV vfncvt.rtz.x.f.w Assertion failed +Description of problem: +when execute +``` +vsetvli t0, x0, e16, m1 +vfncvt.rtz.x.f.w v0, v4 +``` +report error: + +qemu-riscv64: ../target/riscv/translate.c:212: decode_save_opc: Assertion \`ctx->insn_start != NULL' failed. Segmentation fault (core dumped) +Steps to reproduce: +1. write the code +2. compile +3. excute +Additional information: + diff --git a/results/classifier/105/vnc/1354279 b/results/classifier/105/vnc/1354279 new file mode 100644 index 000000000..4641e9b4e --- /dev/null +++ b/results/classifier/105/vnc/1354279 @@ -0,0 +1,83 @@ +vnc: 0.938 +graphic: 0.874 +KVM: 0.867 +network: 0.863 +socket: 0.819 +device: 0.818 +semantic: 0.663 +instruction: 0.661 +mistranslation: 0.537 +boot: 0.439 +assembly: 0.341 +other: 0.194 + +The guest will be destroyed after hot remove the VF from the guest. 
+ +Environment: +------------ +Host OS (ia32/ia32e/IA64):ia32e +Guest OS (ia32/ia32e/IA64):ia32e +Guest OS Type (Linux/Windows):Linux +kvm.git Commit:9f6226a762c7ae02f6a23a3d4fc552dafa57ea23 +qemu.git Commit:5a7348045091a2bc15d85bb177e5956aa6114e5a +Host Kernel Version:3.16.0-rc1 +Hardware:Romley_EP, Ivytown_EP,Haswell_EP + + +Bug detailed description: +-------------------------- +hot add the VF to the guest, then hot remove the VF from the guest, the guest will be destroyed. + +note: +1. when hot add the VF with vfio, the hot remove the VF from the guest, the guest works fine. +2. this shoule be a qemu bug: +kvm + qemu = result +9f6226a7 + 5a734804 = bad +9f6226a7 + 9f862687 = good + + + +Reproduce steps: +---------------- +1. create guest +qemu-system-x86_64 --enable-kvm -m 1024 -smp 4 -net none rhel6u5.qcow -monitor pty +2. hot add the vf to guest +echo "device_add pci-assign,host=05:10.0,id=nic" >/dev/pts/x +cat /dev/pts/x +3. hot remove the VF froem guest +echo "device_del nic" >/dev/pts/x + +Current result: +---------------- +the guest willl be destroyed after hot remove the VF from the guest + +Expected result: +---------------- +the guest works fine after hot remove the VF from the guest + + +Basic root-causing log: +---------------------- +[root@vt-snb9 qemu.git]# qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none rhel6u5.qcow -monitor pty +VNC server running on `::1:5900' +** +ERROR:qom/object.c:725:object_unref: assertion failed: (obj->ref > 0) +Aborted (core dumped) + +the first bad commit is: +commit 22a893e4f55344f96e1aafc66f5fedc491a5ca97 +Author: Paolo Bonzini <email address hidden> +Date: Wed Jun 11 10:58:06 2014 +0200 + + memory: MemoryRegion: replace owner field with QOM parent + + The two are now the same. 
+ + Reviewed-by: Peter Crosthwaite <email address hidden> + Signed-off-by: Paolo Bonzini <email address hidden> + +test on Ivytown_EP +kvm.git + qemu.git: c77dcacb_0e4a7737 +kernel version: 3.16.0 +after hot remove the VF from the guest, the guest works fine. + diff --git a/results/classifier/105/vnc/1388735 b/results/classifier/105/vnc/1388735 new file mode 100644 index 000000000..b84e9a8bb --- /dev/null +++ b/results/classifier/105/vnc/1388735 @@ -0,0 +1,60 @@ +vnc: 0.822 +instruction: 0.818 +other: 0.751 +device: 0.686 +semantic: 0.681 +mistranslation: 0.670 +network: 0.666 +graphic: 0.582 +socket: 0.460 +assembly: 0.427 +boot: 0.378 +KVM: 0.297 + +QEMU no longer allows to use full TCP port range for VNC + +After upgrade to QEMU version 2.1.0 (Debian 2.1+dfsg-4ubuntu6), I am no longer able to use any TCP port for VNC display. +For example, if I need to assign VNC server a TCP port 443, I used to run: +# qemu-system-x86_64 -vnc :-5457 +qemu-system-x86_64: Failed to start VNC server on `:-1000': can't convert to a number:-5457 +expected behavior: as any VNC software, take port base of 5900, substract 5457 display number, and use TCP port 443 + +I ask to change vnc port conversion routine to allow input values in range of all TCP ports, from 1 to 65535. + +I really depend on ability to use full TCP range for VNC port numbers, and inablity to do so in new version of QEMU is very disappointing. + +I disagree. This is a vnc port number, and by definition it can't really be negative. The fact that some vnc software allows negative port like this, or that some software uses tcp port number in place of vnc port number, does not make it more valid. + +We're talking about an issue in original vnc specification, -- maybe they should have used tcp port in the first place. + +And yes, this way we can't specify tcp port less than 5900. + +In order to solve this issue for real, I think the best way is to allow specifying tcp port somehow. 
How does other vnc software deal with this? One example I can think of is to use double-semicolon syntax, like this: -vnc ::443. But we should just agree on some common way, already used by other vnc software. + +What is in use today? + +Thanks, + +/mjt + +Unfortunately, standard (eirther RFB Protocl V 3.X or RFC 6143) doesn't define bahavior with ports different from 5900: + Note that the only port number assigned by IANA for RFB is port 5900, + so RFB clients and servers should avoid using other port numbers + unless they are communicating with servers or clients known to use + the non-standard ports. + +So it is absolutely dependant on implementation, how to handle non-standard port numbers. +Both implementations from RealNetworks (authors of original VNC) and all other implementations (Tight, and numberous java applets) are allowing negative display numbers with one of two options: +* negative port number accepted, like :-5457 (like QEMU allowed to do before), +* ::<port number> allowed for direct port number instead of display number, like ::443. + +It will be best for me, to allow behavior of QEMU before 2.1 (with negative display numbers). +But notation of ::<tcp portnumber> would also solve my problem. + +Use for this is very simple - in hosting environment, I am not able to adjust firewall to allow inbound connections for not privileged ports. + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] 
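Editorial sketch of the port arithmetic discussed in this report, in Python. It assumes the usual convention that display N maps to TCP port 5900 + N, and it treats the `::<port>` form as the direct-port notation proposed in the comments. The function name `vnc_spec_to_port` is hypothetical; this is not QEMU's parser and says nothing about which forms a given QEMU version accepts.

```python
VNC_BASE_PORT = 5900  # RFB port assigned by IANA; display N conventionally maps to 5900 + N


def vnc_spec_to_port(spec: str) -> int:
    """Map ':N' (display number, possibly negative) or '::P' (direct port) to a TCP port.

    Hypothetical helper for illustration only.
    """
    if spec.startswith("::"):
        # Direct TCP port, as in the proposed '-vnc ::443' notation.
        port = int(spec[2:])
    elif spec.startswith(":"):
        # Display number relative to the 5900 base; ':-5457' gives 443.
        port = VNC_BASE_PORT + int(spec[1:])
    else:
        raise ValueError(f"expected ':N' or '::P', got {spec!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"{spec!r} maps to TCP port {port}, outside 1..65535")
    return port
```

Under these assumptions, `:-5457` and `::443` both resolve to port 443, which is the behaviour the reporter asks for.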
+ diff --git a/results/classifier/105/vnc/1391942 b/results/classifier/105/vnc/1391942 new file mode 100644 index 000000000..d1afc690d --- /dev/null +++ b/results/classifier/105/vnc/1391942 @@ -0,0 +1,70 @@ +vnc: 0.974 +instruction: 0.937 +network: 0.915 +mistranslation: 0.912 +semantic: 0.911 +device: 0.902 +other: 0.900 +graphic: 0.831 +socket: 0.805 +assembly: 0.794 +KVM: 0.762 +boot: 0.675 + +Unnecessary events option of the trace argument with UST backend + +When running configure with the --enable-trace-backends=ust option the user should not have to specify a the "events" and "file" options because they are not used with that tracing framework. + +Right now, in order the use this option the need to specify a dummy events file. + +This fails: +$> qemu-system-x86_64 -hda debian_wheezy_amd64_standard.qcow2 -trace -m 512 +qemu-system-x86_64: -trace -m: Invalid parameter '-m' + +This works: +$> qemu-system-x86_64 -hda debian_wheezy_amd64_standard.qcow2 -trace events=dummy-events.txt -m 512 +VNC server running on `127.0.0.1:5900' + +I am using version: +$> qemu-system-x86_64 --version +QEMU emulator version 2.1.90, Copyright (c) 2003-2008 Fabrice Bellard + +On Wed, Nov 12, 2014 at 04:01:38PM -0000, Francis Deslauriers wrote: +> When running configure with the --enable-trace-backends=ust option and compiling. +> The user should not have to specify a the "events" and "file" options because they are not used with that tracing framework. +> +> Right now, in order the use this option the need to specify a dummy +> events file. 
+> +> This fails: +> $> qemu-system-x86_64 -hda debian_wheezy_amd64_standard.qcow2 -trace -m 512 +> qemu-system-x86_64: -trace -m: Invalid parameter '-m' +> +> This works: +> $> qemu-system-x86_64 -hda debian_wheezy_amd64_standard.qcow2 -trace events=dummy-events.txt -m 512 +> VNC server running on `127.0.0.1:5900' +> +> I am using version: +> $> qemu-system-x86_64 --version +> QEMU emulator version 2.1.90, Copyright (c) 2003-2008 Fabrice Bellard + +What happens when you pass no -trace option? + +Stefan + + +It works without the -trace option. + +Want I meant with this post is that the "events" argument of the "-trace" option has no effect in the case of using LTTng UST as the tracing backend because the events are enabled from the LTTng tracer itself. + +Is there some way I can make an argument optional or conditional to a tracing framework? + +Thanks, + +Francis + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1393486 b/results/classifier/105/vnc/1393486 new file mode 100644 index 000000000..99f62ee0f --- /dev/null +++ b/results/classifier/105/vnc/1393486 @@ -0,0 +1,32 @@ +vnc: 0.774 +device: 0.719 +mistranslation: 0.671 +socket: 0.656 +network: 0.608 +instruction: 0.547 +graphic: 0.505 +semantic: 0.407 +boot: 0.320 +other: 0.318 +KVM: 0.313 +assembly: 0.068 + +hw/virtio/virtio-rng.c:150: bad test ? 
+ +hw/virtio/virtio-rng.c:150:31: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] + + if (!vrng->conf.period_ms > 0) { + error_setg(errp, "'period' parameter expects a positive integer"); + return; + } + +Maybe better code + + if (vrng->conf.period_ms <= 0) { + error_setg(errp, "'period' parameter expects a positive integer"); + return; + } + +Fixed here: +http://git.qemu.org/?p=qemu.git;a=commitdiff;h=a3a292c420d2fec3c07 + diff --git a/results/classifier/105/vnc/1453608 b/results/classifier/105/vnc/1453608 new file mode 100644 index 000000000..b1548dd1d --- /dev/null +++ b/results/classifier/105/vnc/1453608 @@ -0,0 +1,24 @@ +vnc: 0.795 +mistranslation: 0.751 +semantic: 0.700 +device: 0.626 +instruction: 0.544 +other: 0.543 +graphic: 0.520 +network: 0.492 +socket: 0.388 +boot: 0.174 +KVM: 0.159 +assembly: 0.077 + +explain what pcsys_monitor in manpage + +The specification of vnc passwords seems to have changed. `man qemu-system-x86_64` mentions `set_password` to be used in `pcsys_monitor`. Both are are not further mentioned in the man page and misteriously inexisting in both the web and the source root (as far as `grep -r -I 'pcsys_monitor' .` is concerned). That's too vage to be usable. + +experienced with 2.3.0 + +This should finally get fixed here: +https://gitlab.com/qemu-project/qemu/-/commit/923e931188dcbb7 + +Released with QEMU v5.2.0. 
+ diff --git a/results/classifier/105/vnc/1453612 b/results/classifier/105/vnc/1453612 new file mode 100644 index 000000000..785313570 --- /dev/null +++ b/results/classifier/105/vnc/1453612 @@ -0,0 +1,34 @@ +vnc: 0.963 +other: 0.708 +graphic: 0.690 +device: 0.683 +socket: 0.658 +instruction: 0.631 +network: 0.550 +mistranslation: 0.414 +semantic: 0.357 +boot: 0.279 +assembly: 0.145 +KVM: 0.141 + +set_password command of monitor has poor feedback on failure + +running `set_password vnc NkkmEz5icvTAGo6MECzBVEUxP` in qemu monitor started with `-monitor stdio` gives feedback `Could not set password` which is unhelpful because it doesn't specify the reason of the failure. + +experienced with 2.3.0 + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1453613 b/results/classifier/105/vnc/1453613 new file mode 100644 index 000000000..2a4bc1012 --- /dev/null +++ b/results/classifier/105/vnc/1453613 @@ -0,0 +1,30 @@ +vnc: 0.779 +device: 0.617 +network: 0.578 +mistranslation: 0.475 +socket: 0.431 +graphic: 0.420 +semantic: 0.414 +other: 0.331 +KVM: 0.189 +boot: 0.119 +instruction: 0.080 +assembly: 0.059 + +the help message of the set_password subcommand of the qemu monitor isn't usable + +`help set_password` in qemu monitor prints `set_password protocol password action-if-connected -- set spice/vnc password` which doesn't allow to figure out how to use this subcommand. 
+ +I think the 'help' text in the monitor is only really intended as a brief usage summary reminder (in particular "help" on its own prints the concatenation of all the "help foo" command help, so having "help foo" print a long bit of documentation makes "help" output look weird). The full documentation of each command is in the QEMU documentation proper, which is now at https://www.qemu.org/docs/master/system/monitor.html#commands and where the 'set_password' documentation describes the behaviour more fully. + +To make this work in general we'd have to somehow support rendering the rST-format documentation that ends up in the manual as a user response in the monitor, which feels like it would be tricky. + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/114 + + diff --git a/results/classifier/105/vnc/1454 b/results/classifier/105/vnc/1454 new file mode 100644 index 000000000..fadbba3c4 --- /dev/null +++ b/results/classifier/105/vnc/1454 @@ -0,0 +1,75 @@ +semantic: 0.994 +vnc: 0.993 +device: 0.993 +mistranslation: 0.992 +graphic: 0.992 +other: 0.991 +KVM: 0.991 +boot: 0.990 +assembly: 0.989 +socket: 0.988 +instruction: 0.988 +network: 0.983 + +QEMU TCG s390x fails an assertion while dispatching an FIXPT_DIVIDE exception on DR when compiled with LTO +Description of problem: +When running the attached minimal reproducer, with qemu-system-s390x version 7.2.0 compiled with LTO (`--enable-lto`) with GCC v12.2.1, QEMU fails an assertion and crashes: +``` +qemu-system-s390x: ../target/s390x/tcg/excp_helper.c:215: do_program_interrupt: Assertion `ilen == 2 || ilen == 4 || ilen == 6' failed. +Aborted (core dumped) +``` +Steps to reproduce: +1. Compile QEMU v7.2.0 for s390x with LTO enabled: + ``` + ../configure --target-list=s390x-softmmu --enable-lto + ``` +2. 
Compile the given reproducer assembler [lpswe-to-pgm.S](/uploads/200fb0e777ddd0ed26f51009e81c26ea/lpswe-to-pgm.S): + ``` + s390x-linux-gnu-gcc -march=z13 -m64 -nostdlib -nostartfiles -static -Wl,-Ttext=0 -Wl,--build-id=none lpswe-to-pgm.S -o lpswe-to-pgm + ``` +3. Execute QEMU on the reproducer: + ``` + ./qemu-system-s390x -kernel lpswe-to-pgm + ``` +Additional information: +I have debugged QEMU to try to find the root cause, and I believe I found it, but I'm not sure what the most appropriate way to fix it would be: + +QEMU executes the `DR` instruction by executing the `divs32` helper. + +When the helper sees that the final division result does not fit in 32 bits, it generates a program interrupt for fixed point divide by calling the `tcg_s390_program_interrupt` function, with the final parameter being the TCG host PC, which is found by calling `GETPC`. + +`tcg_s390_program_interrupt` then calls `cpu_restore_state`, and then as long as the host PC is valid, `cpu_restore_state` eventually calls `s390x_restore_state_to_opc` through a long chain of calls, which sets `CPUS390XState::int_pgm_ilen` to a valid value. + +Unfortunately when compiling with LTO, the host PC is not valid, which means we don't update `int_pgm_ilen`, resulting in the failed assertion. + +The reason the host PC is not valid when compiling with LTO, is that GCC decides to split `helper_divs32` into 2 parts, the actual div logic being the first part, and the call to `GETPC` & `tcg_s390_program_interrupt` being the second part. The way GCC implements it is by turning the second part into a separate function, which the first part calls - see disassembly below. (GCC then re-uses the second part in other similar TCG helpers) + +Because we now called the second part before calling `GETPC`, we have a new return address, and `GETPC` returns the address of the first part, instead of the TCG host PC. 
+ +``` +000000000022c870 <helper_divs32>: + 22c870: 48 83 ec 08 sub rsp,0x8 + 22c874: 85 d2 test edx,edx + 22c876: 74 22 je 22c89a <helper_divs32+0x2a> + 22c878: 48 89 f0 mov rax,rsi + 22c87b: 48 63 ca movsxd rcx,edx + 22c87e: 48 99 cqo + 22c880: 48 f7 f9 idiv rcx + 22c883: 4c 63 c0 movsxd r8,eax + 22c886: 48 89 97 10 03 00 00 mov QWORD PTR [rdi+0x310],rdx + 22c88d: 49 39 c0 cmp r8,rax + 22c890: 75 17 jne 22c8a9 <helper_divs32+0x39> + 22c892: 4c 89 c0 mov rax,r8 + 22c895: 48 83 c4 08 add rsp,0x8 + 22c899: c3 ret + 22c89a: 48 8b 54 24 08 mov rdx,QWORD PTR [rsp+0x8] + 22c89f: be 09 00 00 00 mov esi,0x9 + 22c8a4: e8 47 e5 ff ff call 22adf0 <tcg_s390_program_interrupt> + 22c8a9: e8 b2 fe ff ff call 22c760 <helper_divs32.part.0> + +000000000022c760 <helper_divs32.part.0>: + 22c760: 48 83 ec 08 sub rsp,0x8 + 22c764: be 09 00 00 00 mov esi,0x9 + 22c769: 48 8b 54 24 08 mov rdx,QWORD PTR [rsp+0x8] + 22c76e: e8 7d e6 ff ff call 22adf0 <tcg_s390_program_interrupt> +``` diff --git a/results/classifier/105/vnc/1455912 b/results/classifier/105/vnc/1455912 new file mode 100644 index 000000000..2f046c1c8 --- /dev/null +++ b/results/classifier/105/vnc/1455912 @@ -0,0 +1,53 @@ +vnc: 0.847 +device: 0.702 +other: 0.702 +network: 0.659 +semantic: 0.611 +mistranslation: 0.555 +graphic: 0.523 +instruction: 0.446 +boot: 0.334 +socket: 0.157 +assembly: 0.099 +KVM: 0.067 + +vnc websocket option not properly parsed when running on commandline + +All of my vms are started with a simple script on the command line. +Starting with Qemu 2.3.0, the option "-vnc host:port,websocket" is no longer working. + +Previously if I said listen on Tor:17,websocket it would function correctly. Now it's kicking an error: + + +qemu-system-x86_64: -vnc tor:17,websocket: Failed to start VNC server on `(null)': address resolution failed for tor:on: Servname not supported for ai_socktype + +The error leads me to believe it's not parsing the command line options for the "vnc" option correctly. 
If I leave off ",websocket" it works correctly. I've even tried replacing the hostname with an IP address, and using the alternate form " -display vnc=tor:17,websocket". It reports the same error. + +Someone has had a similar issue with the port portion of the display being treated as a string and not an integer (so it's looking in /etc/services etc): + +http://stackoverflow.com/questions/23079017/servname-not-supported-for-ai-socktype + + +I have more information about the bug. The host I'm running this on is called "tor" (no, it has nothing to do with an onion router, it's an old nickname and something I've been calling my main dev host for years). Its IP is 10.16.0.5. If I designate the command line option as "-vnc tor:11,websocket=5711" or "-vnc 10.16.0.5:11,websocket=5711" it appears to work fine. + +I have to include the specific IP I wish it to listen on because this host has a lot of different interfaces, and I don't want it listening on all interfaces. So there's still an issue with it resolving the "short" name in local DNS to the local IP, and listening only on that IP with the abbreviated option. It's still not parsed correctly. + +On another host, with far fewer interfaces and addresses, a simple "-vnc :80,websocket" works fine without modification. Same version of Qemu, the ArchLinux x86_64 package for 2.3.0-2. + + + +This is an accidental regression caused by + + commit 4db14629c38611061fc19ec6927405923de84f08 + Author: Gerd Hoffmann <email address hidden> + Date: Tue Sep 16 12:33:03 2014 +0200 + + vnc: switch to QemuOpts, allow multiple servers + + + +https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg00583.html + +Dan's patch linked in comment #4 went into git as commit 1b1aeb5828c978a, so this has been fixed (with the fix going into the 2.9.0 release). 
+ + diff --git a/results/classifier/105/vnc/1467 b/results/classifier/105/vnc/1467 new file mode 100644 index 000000000..d6fa83030 --- /dev/null +++ b/results/classifier/105/vnc/1467 @@ -0,0 +1,14 @@ +vnc: 0.492 +other: 0.476 +device: 0.427 +network: 0.416 +KVM: 0.305 +boot: 0.292 +semantic: 0.195 +graphic: 0.124 +socket: 0.114 +mistranslation: 0.112 +instruction: 0.046 +assembly: 0.017 + +guest agent file filtering diff --git a/results/classifier/105/vnc/1486278 b/results/classifier/105/vnc/1486278 new file mode 100644 index 000000000..a8ad357c6 --- /dev/null +++ b/results/classifier/105/vnc/1486278 @@ -0,0 +1,36 @@ +vnc: 0.982 +instruction: 0.947 +device: 0.896 +network: 0.879 +socket: 0.844 +graphic: 0.802 +mistranslation: 0.775 +semantic: 0.657 +other: 0.485 +boot: 0.455 +KVM: 0.308 +assembly: 0.207 + +'info vnc' monitor command does not show websocket information + +Steps to reproduce^ +1. run + qemu-system-x86_64 -vnc 0.0.0.0:1,websocket=5701 -nographic -monitor stdio + +2. then type + (qemu) info vnc +3. see + address: 0.0.0.0:5901 + auth: none +Client: none + +There is no information about websocket parameters, but 'netstat -nltp' shows me: + ... +tcp 0 0 0.0.0.0:5701 0.0.0.0:* LISTEN 27073/qemu-system-x +.... + +I think this has been fixed in QEMU v2.10.0 with this commit here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=0a9667ecdb6d7da90a2ef64 + +Thanks! This is presumed fixed in Ubuntu also then, since 18.04 onwards shipped a qemu version higher than 2.10.0. If this is wrong, please reopen. 
+ diff --git a/results/classifier/105/vnc/1490853 b/results/classifier/105/vnc/1490853 new file mode 100644 index 000000000..6418c2304 --- /dev/null +++ b/results/classifier/105/vnc/1490853 @@ -0,0 +1,234 @@ +vnc: 0.924 +other: 0.924 +boot: 0.921 +KVM: 0.916 +device: 0.903 +instruction: 0.893 +graphic: 0.892 +socket: 0.892 +assembly: 0.892 +network: 0.890 +mistranslation: 0.889 +semantic: 0.881 + +qemu windows guest hangs on 100% cpu usage + +Hi, +I have two VMs: one is WinXP Professional SP3 32-bit, the other is Windows Server 2008 Enterprise SP2 64-bit. +When I hot reboot WinXP from inside the guest OS, it hangs at the boot progress bar, and all the vCPU threads in qemu are at 100% usage. +I rebuilt kvm with some added debug info and found that the CPU exit reason is EXIT_REASON_PAUSE_INSTRUCTION. +It seems like all the vCPUs are stuck waiting on a spinlock. I'm not sure whether this is a qemu bug or a kvm bug. +Any help would be appreciated. + +How reproducible: +WinXP: seems always. +WinServer2008: rare. + +Steps to Reproduce: +WinXP: 1. hot reboot the XP guest OS; the hot reboot is necessary. +WinServer2008: not sure, I didn't do anything, it just happened. + +The differences between WinXP and WinServer2008: +1. When WinXP hangs, the boot progress bar keeps rolling, so I think VNC is working fine. +2. When WinServer2008 hangs, VNC shows the last screen and nothing on the screen changes, including the system time. +3. When the VM hangs, if I execute "virsh suspend vm-name" and "virsh resume vm-name", WinServer2008 returns to normal and works fine without hanging again. WinXP does not change and still hangs. + +qemu version: +QEMU emulator version 1.5.0, Copyright (c) 2003-2008 Fabrice Bellard +host info: +Ubuntu 12.04 LTS \n \l +Linux cvknode2026 3.13.6 #1 SMP Fri Dec 12 09:17:35 CST 2014 x86_64 x86_64 x86_64 GNU/Linux + + + qemu command line (guest OS XP): +root 7124 1178 7.6 7750360 3761644 ? 
Sl 14:02 435:23 /usr/bin/kvm -name x -S -machine pc-i440fx-1.5,accel=kvm,usb=off,system=windows -cpu qemu64,hv_relaxed,hv_spinlocks=0x2000 -m 6144 -smp 12,maxcpus=72,sockets=12,cores=6,threads=1 -uuid d3832129-f77d-4b21-bbf7-fd337f53e572 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/x.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,clock=vm,driftfix=slew -no-hpet -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-ehci,id=ehci,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/vms/images/sn1-of-ff.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=directsync -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive if=none,id=drive-ide0-1-1,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1,bootindex=2 -netdev tap,fd=24,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=0c:da:41:1d:f8:40,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/x.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0 -vnc 0.0.0.0:0 -device VGA,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 + + + all qemu thread (guest OS XP): +root@cvknode2026:/proc/7124/task# top -d 1 -H -p 7124 +top - 14:37:05 up 7 days, 4:07, 1 user, load average: 10.71, 10.90, 10.19 +Tasks: 14 total, 12 running, 2 sleeping, 0 stopped, 0 zombie +Cpu(s): 38.8%us, 11.2%sy, 0.0%ni, 50.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st +Mem: 49159888k total, 35665128k used, 13494760k free, 436312k buffers +Swap: 8803324k total, 0k used, 8803324k free, 28595100k cached + + PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P SWAP WCHAN COMMAND + 7130 root 20 0 7568m 3.6g 
6628 R 101 7.7 33:43.48 3 3.8g - kvm + 7132 root 20 0 7568m 3.6g 6628 R 101 7.7 33:43.13 1 3.8g - kvm + 7133 root 20 0 7568m 3.6g 6628 R 101 7.7 33:42.70 6 3.8g - kvm + 7135 root 20 0 7568m 3.6g 6628 R 101 7.7 33:42.33 11 3.8g - kvm + 7137 root 20 0 7568m 3.6g 6628 R 101 7.7 33:42.59 17 3.8g - kvm + 7126 root 20 0 7568m 3.6g 6628 R 100 7.7 34:06.76 4 3.8g - kvm + 7127 root 20 0 7568m 3.6g 6628 R 100 7.7 33:44.14 8 3.8g - kvm + 7128 root 20 0 7568m 3.6g 6628 R 100 7.7 33:43.64 13 3.8g - kvm + 7129 root 20 0 7568m 3.6g 6628 R 100 7.7 33:43.64 7 3.8g - kvm + 7131 root 20 0 7568m 3.6g 6628 R 100 7.7 33:44.24 10 3.8g - kvm + 7134 root 20 0 7568m 3.6g 6628 R 100 7.7 33:42.47 12 3.8g - kvm + 7136 root 20 0 7568m 3.6g 6628 R 100 7.7 33:42.16 2 3.8g - kvm + 7124 root 20 0 7568m 3.6g 6628 S 1 7.7 0:30.65 14 3.8g poll_sche kvm + 7139 root 20 0 7568m 3.6g 6628 S 0 7.7 0:01.71 14 3.8g futex_wai kvm + +all thread's kernel stack (guest OS XP): +root@cvknode2026:/proc/7124/task# cat 7130/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7132/stack +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7133/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7135/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffa02b6788>] vmx_vcpu_run+0x88/0x760 [kvm_intel] +[<ffffffffa0413aec>] __vcpu_run+0x63c/0xc30 [kvm] +[<ffffffffa0414188>] kvm_arch_vcpu_ioctl_run+0xa8/0x270 [kvm] +[<ffffffffa03fc042>] kvm_vcpu_ioctl+0x512/0x6d0 [kvm] +[<ffffffff811d4326>] do_vfs_ioctl+0x86/0x4f0 +[<ffffffff811d4821>] SyS_ioctl+0x91/0xb0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7137/stack +[<ffffffffffffffff>] 0xffffffffffffffff 
+root@cvknode2026:/proc/7124/task# cat 7126/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7127/stack +[<ffffffffa02b74f6>] handle_pause+0x16/0x30 [kvm_intel] +[<ffffffffa02ba0d4>] vmx_handle_exit+0x94/0x8b0 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7128/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7129/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7131/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7134/stack +[<ffffffffa02b74fe>] handle_pause+0x1e/0x30 [kvm_intel] +[<ffffffffa02ba0d4>] vmx_handle_exit+0x94/0x8b0 [kvm_intel] +[<ffffffffa0413aec>] __vcpu_run+0x63c/0xc30 [kvm] +[<ffffffffa0414188>] kvm_arch_vcpu_ioctl_run+0xa8/0x270 [kvm] +[<ffffffffa03fc042>] kvm_vcpu_ioctl+0x512/0x6d0 [kvm] +[<ffffffff811d4326>] do_vfs_ioctl+0x86/0x4f0 +[<ffffffff811d4821>] SyS_ioctl+0x91/0xb0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7136/stack +[<ffffffffa02b1fa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7124/stack +[<ffffffff811d50c9>] poll_schedule_timeout+0x49/0x70 +[<ffffffff811d678a>] do_sys_poll+0x50a/0x590 +[<ffffffff811d68eb>] SyS_poll+0x6b/0x100 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvknode2026:/proc/7124/task# cat 7139/stack +[<ffffffff810daf77>] futex_wait_queue_me+0xd7/0x150 +[<ffffffff810dc087>] futex_wait+0x1a7/0x2c0 +[<ffffffff810ddc14>] 
do_futex+0x334/0xb70 +[<ffffffff810de592>] SyS_futex+0x142/0x1a0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff + + qemu command line (guest OS WinServer2008): +root 25258 996 21.5 21174412 14181580 ? Sl Aug27 73740:11 /usr/bin/kvm -name zjx_1-clone -S -machine pc-i440fx-1.5,accel=kvm,usb=off,system=windows -cpu qemu64,hv_relaxed,hv_spinlocks=0x2000 -m 16384 -smp 12,maxcpus=72,sockets=12,cores=6,threads=1 -uuid 8c8b9abf-e9a6-4c3e-93cd-137a9550e593 -no-user-config -nodefaults -chardev so +cket,id=charmonitor,path=/var/lib/libvirt/qemu/zjx_1-clone.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,clock=vm,driftfix=slew -no-hpet -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-ehci,id=ehci,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus +=pci.0,addr=0x5 -drive file=/vms/aaa/zjx_1-clone.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=directsync -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/vms/isos/virtio-win2008R2.vfd,if=none,id=drive-fdc0-0-0,readonly=on,format=raw,cache=directsync -global isa-fdc.driveA=drive-fdc0-0-0 -drive if=none,id=drive-ide0-1-1,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1,bootindex=2 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=0c:da:41:1d:b6:47,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-ser +ial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/zjx_1-clone.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0 -vnc 0.0.0.0:3 -device VGA,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 + + all qemu thread (guest OS WinServer2008): + 
top -d 1 -H -p 25258 +top - 14:53:37 up 24 days, 21:27, 2 users, load average: 19.12, 20.56, 20.20 +Tasks: 14 total, 13 running, 1 sleeping, 0 stopped, 0 zombie +Cpu(s): 48.1%us, 18.2%sy, 0.0%ni, 33.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st +Mem: 65674944k total, 64651012k used, 1023932k free, 194608k buffers +Swap: 8803324k total, 4140324k used, 4663000k free, 363712k cached + + PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P WCHAN COMMAND +25281 root 20 0 20.2g 13g 4020 R 157 21.6 5864:12 14 - kvm +25284 root 20 0 20.2g 13g 4020 R 155 21.6 5863:02 4 - kvm +25294 root 20 0 20.2g 13g 4020 R 153 21.6 5851:59 3 - kvm +25287 root 20 0 20.2g 13g 4020 R 152 21.6 5861:20 15 - kvm +25299 root 20 0 20.2g 13g 4020 R 152 21.6 5847:14 1 - kvm +25258 root 20 0 20.2g 13g 4020 R 122 21.6 3372:41 13 - kvm +25269 root 20 0 20.2g 13g 4020 R 101 21.6 5929:42 5 - kvm +25301 root 20 0 20.2g 13g 4020 R 101 21.6 5847:26 10 - kvm +25292 root 20 0 20.2g 13g 4020 R 100 21.6 5853:18 7 - kvm +25297 root 20 0 20.2g 13g 4020 R 100 21.6 5843:37 16 - kvm +25272 root 20 0 20.2g 13g 4020 R 98 21.6 5872:52 2 - kvm +25277 root 20 0 20.2g 13g 4020 R 93 21.6 5878:21 0 - kvm +25290 root 20 0 20.2g 13g 4020 R 51 21.6 5863:15 8 - kvm +25314 root 20 0 20.2g 13g 4020 S 0 21.6 0:41.42 1 futex_wai kvm + +all thread's kernel stack (guest OS WinServer2008): +root@cvk11:/proc/25258/task# cat 25281/stack +[<ffffffffa03cdfa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffa03d60d4>] vmx_handle_exit+0x94/0x8b0 [kvm_intel] +[<ffffffffa062cbb4>] __vcpu_run+0x704/0xc30 [kvm] +[<ffffffffa062d188>] kvm_arch_vcpu_ioctl_run+0xa8/0x270 [kvm] +[<ffffffffa0615042>] kvm_vcpu_ioctl+0x512/0x6d0 [kvm] +[<ffffffff811d4326>] do_vfs_ioctl+0x86/0x4f0 +[<ffffffff811d4821>] SyS_ioctl+0x91/0xb0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25284/stack +[<ffffffffa0613537>] kvm_vcpu_yield_to+0x47/0xa0 [kvm] +[<ffffffffa06136ab>] 
kvm_vcpu_on_spin+0x11b/0x150 [kvm] +[<ffffffffa03cdfa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25294/stack +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25287/stack +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25299/stack +[<ffffffffa03d34f6>] handle_pause+0x16/0x30 [kvm_intel] +[<ffffffffa03d60d4>] vmx_handle_exit+0x94/0x8b0 [kvm_intel] +[<ffffffffa062caec>] __vcpu_run+0x63c/0xc30 [kvm] +[<ffffffffa062d188>] kvm_arch_vcpu_ioctl_run+0xa8/0x270 [kvm] +[<ffffffffa0615042>] kvm_vcpu_ioctl+0x512/0x6d0 [kvm] +[<ffffffff811d4326>] do_vfs_ioctl+0x86/0x4f0 +[<ffffffff811d4821>] SyS_ioctl+0x91/0xb0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25258/stack +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25269/stack +[<ffffffffa03d34fe>] handle_pause+0x1e/0x30 [kvm_intel] +[<ffffffffa03d60d4>] vmx_handle_exit+0x94/0x8b0 [kvm_intel] +[<ffffffffa062caec>] __vcpu_run+0x63c/0xc30 [kvm] +[<ffffffffa062d188>] kvm_arch_vcpu_ioctl_run+0xa8/0x270 [kvm] +[<ffffffffa0615042>] kvm_vcpu_ioctl+0x512/0x6d0 [kvm] +[<ffffffff811d4326>] do_vfs_ioctl+0x86/0x4f0 +[<ffffffff811d4821>] SyS_ioctl+0x91/0xb0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25301/stack +[<ffffffffa03d34fe>] handle_pause+0x1e/0x30 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25292/stack +[<ffffffffa03cdfa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25297/stack +[<ffffffffa03cdfa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25272/stack +[<ffffffffffffffff>] 0xffffffffffffffff 
+root@cvk11:/proc/25258/task# cat 25277/stack +[<ffffffffa03cdfa3>] clear_atomic_switch_msr+0x133/0x170 [kvm_intel] +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25290/stack +[<ffffffffffffffff>] 0xffffffffffffffff +root@cvk11:/proc/25258/task# cat 25314/stack +[<ffffffff810daf77>] futex_wait_queue_me+0xd7/0x150 +[<ffffffff810dc087>] futex_wait+0x1a7/0x2c0 +[<ffffffff810ddc14>] do_futex+0x334/0xb70 +[<ffffffff810de592>] SyS_futex+0x142/0x1a0 +[<ffffffff817610ad>] system_call_fastpath+0x1a/0x1f +[<ffffffffffffffff>] 0xffffffffffffffff + +Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1516446 b/results/classifier/105/vnc/1516446 new file mode 100644 index 000000000..bad0beafb --- /dev/null +++ b/results/classifier/105/vnc/1516446 @@ -0,0 +1,579 @@ +vnc: 0.917 +mistranslation: 0.916 +other: 0.910 +instruction: 0.908 +device: 0.903 +graphic: 0.902 +assembly: 0.897 +boot: 0.894 +socket: 0.894 +KVM: 0.893 +network: 0.888 +semantic: 0.887 + +Migration always causes guest freeze in one direction. + +Hello, + +I have three debian jessie machines standard installations except for homebuild qemu-2.4.0 package using the source package from testing. I had the same problem with the standard debian jessie qemu 2.1 too. + +I have host A, B and C. + +Migrations work between all combinations of these except A -> B. B -> A works. + +I use libvirt but as per your written request I have run qemu directly and verified the same problem. 
+ +Host A: +qemu-system-x86_64 --enable-kvm -name ashole -cpu kvm64 -m 1024 -drive file=/mnt/synctest/ashole.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -vnc 0.0.0.0:600 -k sv -vga std -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 + +Host B: +qemu-system-x86_64 --enable-kvm -name ashole -cpu kvm64 -m 1024 -drive file=/mnt/synctest/ashole.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -vnc 0.0.0.0:600 -k sv -vga std -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -incoming tcp:0:4444 + +Then in the qemu monitor I run migrate -d tcp:B:4444 and the guest freezes. + +I have tried these guest OSes: FreeBSD 9.1, Debian wheezy and Debian jessie (standard 3.16 kernel and backported 4.2 kernel), with the same problem on all of them. When running the migration through libvirt, virt-manager says the guest is using 100% CPU. + +I had a similar problem (https://bugzilla.kernel.org/show_bug.cgi?id=61971) 2 years ago, which was solved in kernel 3.13 if I remember correctly. 
+ +Best Regards +Magnus + +CPU info: +Host A +processor : 0 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 0 +cpu cores : 4 +apicid : 16 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 1 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 1 +cpu cores : 4 +apicid : 17 +initial apicid : 1 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm 
topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 2 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1700.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 2 +cpu cores : 4 +apicid : 18 +initial apicid : 2 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 3 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 3 +cpu cores : 4 +apicid : 19 +initial apicid : 3 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 4 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 4 +cpu cores : 4 +apicid : 20 +initial apicid : 4 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 5 +vendor_id : AuthenticAMD +cpu family 
: 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 5 +cpu cores : 4 +apicid : 21 +initial apicid : 5 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 6 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 6 +cpu cores : 4 +apicid : 22 +initial apicid : 6 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale 
vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +processor : 7 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-8320 Eight-Core Processor +stepping : 0 +microcode : 0x600081f +cpu MHz : 1400.000 +cache size : 2048 KB +physical id : 0 +siblings : 8 +core id : 7 +cpu cores : 4 +apicid : 23 +initial apicid : 7 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 +bugs : fxsave_leak sysret_ss_attrs +bogomips : 7023.54 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +Host B: +processor : 0 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 0 +cpu cores : 6 +apicid : 0 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +processor : 1 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 1 +cpu cores : 6 +apicid : 1 +initial apicid : 1 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +processor : 2 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 2 +cpu cores : 6 +apicid : 2 +initial apicid : 2 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 
ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +processor : 3 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 3 +cpu cores : 6 +apicid : 3 +initial apicid : 3 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +processor : 4 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 4 +cpu cores : 6 +apicid : 4 +initial apicid : 4 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +processor : 5 +vendor_id : AuthenticAMD +cpu family : 16 +model : 10 +model name : AMD Phenom(tm) II X6 1055T Processor +stepping : 0 +microcode : 0x10000bf +cpu MHz : 800.000 +cache size : 512 KB +physical id : 0 +siblings : 6 +core id : 5 +cpu cores : 6 +apicid : 5 +initial apicid : 5 +fpu : yes +fpu_exception : yes +cpuid level : 6 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall +bugs : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs +bogomips : 5624.68 +TLB size : 1024 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm stc 100mhzsteps hwpstate cpb + +Host C: +processor : 0 +vendor_id : GenuineIntel +cpu family : 6 +model : 58 +model name : Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz +stepping : 9 +microcode : 0x12 +cpu MHz : 1607.156 +cache size : 6144 KB +physical id : 0 +siblings : 4 +core id : 0 +cpu cores : 4 +apicid : 0 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse 
tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt +bugs : +bogomips : 6186.23 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +processor : 1 +vendor_id : GenuineIntel +cpu family : 6 +model : 58 +model name : Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz +stepping : 9 +microcode : 0x12 +cpu MHz : 1730.066 +cache size : 6144 KB +physical id : 0 +siblings : 4 +core id : 1 +cpu cores : 4 +apicid : 2 +initial apicid : 2 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt +bugs : +bogomips : 6186.23 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +processor : 2 +vendor_id : GenuineIntel +cpu family : 6 +model : 58 +model name : Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz +stepping : 9 +microcode : 0x12 +cpu MHz : 1654.382 +cache size : 6144 KB +physical id : 0 +siblings : 4 +core id : 2 +cpu cores : 4 +apicid : 4 +initial apicid : 4 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt +bugs : +bogomips : 6186.23 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +processor : 3 +vendor_id : GenuineIntel +cpu family : 6 +model : 58 +model name : Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz +stepping : 9 +microcode : 0x12 +cpu MHz : 1610.304 +cache size : 6144 KB +physical id : 0 +siblings : 4 +core id : 3 +cpu cores : 4 +apicid : 6 +initial apicid : 6 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt +bugs : +bogomips : 6186.23 +clflush size : 64 +cache_alignment : 64 +address sizes : 36 bits physical, 48 bits virtual +power management: + +Is this the correct place to file qemu bug reports? + +Hi Magnus, + I think there's a fair possibility you're hitting a kernel bug that we found in + https://bugzilla.redhat.com/show_bug.cgi?id=1401767 +which was also a failure migrating from a newer AMD to an older AMD processor. 
+
+That was fixed by upstream kernel fix:
+commit 00c87e9a70a17b355b81c36adedf05e84f54e10d
+Author: Radim Krčmář <email address hidden>
+Date:   Wed Feb 1 14:19:53 2017 +0100
+
+    KVM: x86: do not save guest-unsupported XSAVE state
+
+    Saving unsupported state prevents migration when the new host does not
+    support a XSAVE feature of the original host, even if the feature is not
+    exposed to the guest.
+
+    We've masked host features with guest-visible features before, with
+    4344ee981e21 ("KVM: x86: only copy XSAVE state for the supported
+    features") and dropped it when implementing XSAVES. Do it again.
+
+    Fixes: df1daba7d1cb ("KVM: x86: support XSAVES usage in the host")
+    Cc: <email address hidden>
+    Reviewed-by: Paolo Bonzini <email address hidden>
+    Signed-off-by: Radim Krčmář <email address hidden>
+
+it went into kernel 4.10 - so it's worth trying it on a modern kernel and seeing what happens.
+
+Dave
+
+Can you still reproduce the issue with a recent Linux kernel and the latest version of QEMU?
+
+[Expired for QEMU because there has been no activity for 60 days.]
+
diff --git a/results/classifier/105/vnc/1548 b/results/classifier/105/vnc/1548
new file mode 100644
index 000000000..30fe77210
--- /dev/null
+++ b/results/classifier/105/vnc/1548
@@ -0,0 +1,51 @@
+vnc: 0.948
+device: 0.723
+mistranslation: 0.642
+graphic: 0.609
+semantic: 0.603
+socket: 0.573
+other: 0.556
+network: 0.443
+instruction: 0.175
+boot: 0.171
+KVM: 0.101
+assembly: 0.082
+
+8.0.0rc0 Regression: vnc fails with Segmentation fault
+Description of problem:
+On connecting with `gvncviewer localhost:05` the qemu process fails with
+```
+Segmentation fault
+```
+`gvncviewer localhost:05` prints
+```
+Connected to server
+Error: Server closed the connection
+Disconnected from server
+```
+Steps to reproduce:
+1. Enter `qemu-system-x86_64 -m 1536 -display vnc=:05 -k de -cdrom openSUSE-Leap-15.3-GNOME-Live-x86_64-Media.iso` in first terminal
+2.
Enter `gvncviewer localhost:05` in second terminal
+Additional information:
+Final output of `git bisect`:
+```
+385ac97f8fad0e6980c5dfea71132d5ecfb16608 is the first bad commit
+commit 385ac97f8fad0e6980c5dfea71132d5ecfb16608
+Author: Marc-André Lureau <marcandre.lureau@redhat.com>
+Date:   Tue Jan 17 15:24:40 2023 +0400
+
+    ui: keep current cursor with QemuConsole
+
+    Keeping the current cursor around is useful, not only for VNC, but for
+    other displays. Let's move it down, see the following patches for other
+    usages.
+
+    Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
+    Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
+
+ include/ui/console.h | 1 +
+ ui/console.c         | 8 ++++++++
+ ui/vnc.c             | 7 ++-----
+ ui/vnc.h             | 1 -
+ 4 files changed, 11 insertions(+), 6 deletions(-)
+```
diff --git a/results/classifier/105/vnc/1580 b/results/classifier/105/vnc/1580
new file mode 100644
index 000000000..106832bb5
--- /dev/null
+++ b/results/classifier/105/vnc/1580
@@ -0,0 +1,57 @@
+vnc: 0.428
+mistranslation: 0.323
+other: 0.301
+device: 0.280
+KVM: 0.253
+network: 0.216
+instruction: 0.215
+semantic: 0.214
+boot: 0.210
+graphic: 0.208
+socket: 0.183
+assembly: 0.147
+
+QEMU crashes when running inside Hyper-V VM on AMD EPYC
+Description of problem:
+Starting the VM very rarely succeeds and often it crashes with:
+```
+# qemu-system-x86_64 -cpu EPYC -machine accel=kvm -smp 1 -m 512 -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd -drive if=pflash,format=raw,file=OVMF_VARS.fd -drive file=debian-11-nocloud-amd64-20230124-1270.qcow2,format=qcow2 -snapshot -monitor none
+qemu: module ui-ui-gtk not found, do you want to install qemu-system-gui package?
+qemu: module ui-ui-sdl not found, do you want to install qemu-system-gui package?
+VNC server running on ::1:5900
+KVM internal error.
Suberror: 1
+extra data[0]: 0x0000000000000001
+extra data[1]: 0x96d0cff2bed0cf0f
+extra data[2]: 0x0bfd29af72b35c7c
+extra data[3]: 0x0000000000000400
+extra data[4]: 0x0000000100000004
+extra data[5]: 0x00000000581c356c
+extra data[6]: 0x0000000000000000
+extra data[7]: 0x0000000000000000
+emulation failure
+EAX=fffd26a4 EBX=00000000 ECX=00000000 EDX=b731cdad
+ESI=00000101 EDI=00005042 EBP=fffcc000 ESP=581c3564
+EIP=fffff8a8 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
+ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
+CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
+SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
+DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
+FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
+GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
+LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
+TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
+GDT= fffffee0 00000027
+IDT= 00000000 00000000
+CR0=40000033 CR2=00000000 CR3=00800000 CR4=00000660
+DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
+DR6=00000000ffff0ff0 DR7=0000000000000400
+EFER=0000000000000100
+Code=00 0f 20 e0 0f ba e8 05 0f 22 e0 31 db e9 13 02 00 00 85 c0 <75> 38 b9 80 00 00 c0 0f 32 0f ba e8 08 0f 30 31 db b9 01 00 00 00 0f a3 0d 04 b0 80 00 74
+```
+Steps to reproduce:
+1. Create a [Standard_D8ads_v5 VM](https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series) (AMD EPYC 7763 64-Core Processor) in Azure with Debian 11
+2. Install `qemu-system-x86` (1:7.2+dfsg-5~bpo11+1) from `bullseye-backports`
+3. Install `ovmf` (2022.11-6) from `bookworm` (testing)
+4. Run the commands under "QEMU command line"
+Additional information:
+VNC displays "Guest has not initialized the display (yet)". The setup works perfectly on a [Standard_D8ds_v5 VM](https://learn.microsoft.com/en-us/azure/virtual-machines/ddv5-ddsv5-series) (Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz).
diff --git a/results/classifier/105/vnc/1586194 b/results/classifier/105/vnc/1586194
new file mode 100644
index 000000000..77c380abd
--- /dev/null
+++ b/results/classifier/105/vnc/1586194
@@ -0,0 +1,64 @@
+instruction: 0.942
+vnc: 0.923
+assembly: 0.921
+other: 0.918
+device: 0.917
+mistranslation: 0.899
+network: 0.899
+graphic: 0.891
+semantic: 0.890
+socket: 0.887
+boot: 0.881
+KVM: 0.848
+
+VNC reverse broken in qemu 2.6.0
+
+Hi all,
+
+I recently tried to upgrade from Qemu 2.4.1 to 2.6.0, but found some problems with VNC reverse connections.
+
+1) In "-vnc 172.16.1.3:5902,reverse" used to mean "connect to port 5902"
+   That seems to have changed changed since 2.4.1, the thing after the IP address is now interpreted
+   as a display number. If that change was intentional, the man-page needs to be fixed.
+
+2) After subtracting 5900 from that port number (-vnc 172.16.1.3:2,reverse), I ran into a segfault.
+
+---8<---
+Program received signal SIGSEGV, Segmentation fault.
+qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7fffffffe118) at io/channel-socket.c:33
+33      return socket_sockaddr_to_address(&ioc->localAddr,
+(gdb) bt
+#0  qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7fffffffe118) at io/channel-socket.c:33
+#1  0x000055555594c0f5 in vnc_init_basic_info_from_server_addr (errp=0x7fffffffe118, info=0x555558f35990,
+    ioc=<optimized out>) at ui/vnc.c:146
+#2  vnc_server_info_get (vd=0x7fffecc4b010) at ui/vnc.c:223
+#3  0x000055555595192a in vnc_qmp_event (vs=0x555558f41f30, vs=0x555558f41f30, event=QAPI_EVENT_VNC_CONNECTED)
+    at ui/vnc.c:279
+#4  vnc_connect (vd=vd@entry=0x7fffecc4b010, sioc=sioc@entry=0x555558f34c00, skipauth=skipauth@entry=false,
+    websocket=websocket@entry=false) at ui/vnc.c:2994
+#5  0x00005555559520d8 in vnc_display_open (id=id@entry=0x555556437650 "default", errp=errp@entry=0x7fffffffe228)
+    at ui/vnc.c:3773
+#6  0x0000555555952fd3 in vnc_init_func (opaque=<optimized out>, opts=<optimized out>, errp=<optimized
out>)
+    at ui/vnc.c:3868
+#7  0x0000555555a011da in qemu_opts_foreach (list=<optimized out>, func=0x555555952fa0 <vnc_init_func>, opaque=0x0,
+    errp=0x0) at util/qemu-option.c:1116
+#8  0x00005555556dcfbe in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4592
+--->8---
+
+A git bisect shows that this happens since
+
+---8<---
+commit 98481bfcd661daa3c160cc87a297b0e60a307788
+Author: Eric Blake <email address hidden>
+Date:   Mon Oct 26 16:34:45 2015 -0600
+
+    vnc: Hoist allocation of VncBasicInfo to callers
+--->8---
+
+TIA
+  Andi
+
+I think this has been fixed in QEMU 2.7, likely with the following commit:
+http://git.qemu.org/?p=qemu.git;a=commitdiff;h=3e7f136d8b4383d99f
+
+
diff --git a/results/classifier/105/vnc/1604 b/results/classifier/105/vnc/1604
new file mode 100644
index 000000000..37a9f7dfe
--- /dev/null
+++ b/results/classifier/105/vnc/1604
@@ -0,0 +1,76 @@
+vnc: 0.650
+boot: 0.633
+socket: 0.607
+other: 0.580
+instruction: 0.563
+graphic: 0.563
+semantic: 0.553
+KVM: 0.534
+assembly: 0.534
+device: 0.487
+mistranslation: 0.460
+network: 0.384
+
+Get wrong rom when loading 2 different firmware to 2 cpu.
+Description of problem:
+HI, I'm trying to model a machine with 2 cortex-m7 cpu. The 2 CPUs have their own address spaces.
+and when loading rom to init sp and pc, the CPU1 would load the rom of CPU0, because it seems not check
+address space here.
+```c
+void *rom_ptr_for_as(AddressSpace *as, hwaddr addr, size_t size)
+{
+    /*
+     * Find any ROM data for the given guest address range. If there
+     * is a ROM blob then return a pointer to the host memory
+     * corresponding to 'addr'; otherwise return NULL.
+     *
+     * We look not only for ROM blobs that were loaded directly to
+     * addr, but also for ROM blobs that were loaded to aliases of
+     * that memory at other addresses within the AddressSpace.
+     *
+     * Note that we do not check @as against the 'as' member in the
+     * 'struct Rom' returned by rom_ptr().
The Rom::as is the
+     * AddressSpace which the rom blob should be written to, whereas
+     * our @as argument is the AddressSpace which we are (effectively)
+     * reading from, and the same underlying RAM will often be visible
+     * in multiple AddressSpaces. (A common example is a ROM blob
+     * written to the 'system' address space but then read back via a
+     * CPU's cpu->as pointer.) This does mean we might potentially
+     * return a false-positive match if a ROM blob was loaded into an
+     * AS which is entirely separate and distinct from the one we're
+     * querying, but this issue exists also for rom_ptr() and hasn't
+     * caused any problems in practice.
+     */
+    FlatView *fv;
+    void *rom;
+    hwaddr len_unused;
+    FindRomCBData cbdata = {};
+
+    /* Easy case: there's data at the actual address */
+    rom = rom_ptr(addr, size);
+    if (rom) {
+        return rom;
+    }
+```
+Steps to reproduce:
+1. create a machine with 2 cortex-m7 cores and their own rom/ram.
+2. Set different ram size for them. for example, cpu0 ram size:0x40000, cpu1 ram size:0x20000
+3. build firmware of 2 cpu. make sure the init SP(local at 0x0) is set to the top the ram.
+4. use command:
+```
+./qemu-system-arm -M mymachine -smp 2 \
+-device loader,file=./cpu0.elf,addr=0x0,cpu-num=0 \
+-device loader,file=./cpu1.elf,addr=0x0,cpu-num=1 \
+-serial stdio -serial tcp::5678,server=on,wait=off
+```
+to start this machine.
+
+5. the cpu1 will panic when it try to use stack:
+`qemu-system-arm: ../target/arm/cpu.h:2396: arm_is_secure_below_el3: Assertion failed.`
+
+
+Sorry that I'm not sure whether this is an issue or I did something wrong. So post it here.
+For local fix this problem, I add a func `rom_ptr_wit_as(addr,size,as)` to find a rom with addresspace check.
+Is it proper?
+Additional information:
+
diff --git a/results/classifier/105/vnc/1618431 b/results/classifier/105/vnc/1618431
new file mode 100644
index 000000000..792850249
--- /dev/null
+++ b/results/classifier/105/vnc/1618431
@@ -0,0 +1,273 @@
+vnc: 0.748
+KVM: 0.742
+graphic: 0.681
+mistranslation: 0.637
+other: 0.622
+device: 0.603
+network: 0.575
+instruction: 0.563
+boot: 0.561
+socket: 0.545
+semantic: 0.535
+assembly: 0.532
+
+windows hangs after live migration with virtio
+
+Several of our users reported problems with windows machines hanging
+after live migrations. The common denominator _seems_ to be virtio
+devices.
+I've managed to reproduce this reliably on a windows 10 (+
+virtio-win-0.1.118) guest, always within 1 to 5 migrations, with a
+virtio-scsi hard drive and a virtio-net network device. (When I
+replace the virtio-net device with an e1000 it takes 10 or more
+migrations, and without virtio devices I have not (yet) been able to
+reproduce this problem. I also could not reproduce this with a linux
+guest. Also spice seems to improve the situation, but doesn't solve
+it completely).
+
+I've tested quite a few tags from qemu-git (v2.2.0 through v2.6.1,
+and 2.6.1 with the patches mentioned on qemu-stable by Peter Lieven)
+and the behavior is the same everywhere.
+
+The reproducibility seems to be somewhat dependent on the host
+hardware, which makes investigating this issue that much harder.
+
+Symptoms:
+After the migration the windows graphics stack just hangs.
+Background processes are still running (eg. after installing an ssh
+server I could still login and get a command prompt after the hang was
+triggered... not that I'd know what to do with that on a windows
+machine...) - commands which need no GUI access work, the rest just
+hangs there on the command line, too.
+It's also capable of responding to an NMI sent via the qemu monitor: +it then seems to "recover" and manages to show the blue sad-face +screen that something happened, reboots successfully and is usable +again without restarting the qemu process in between. +From there whole the process can be repeated. + +Here's what our command line usually looks like: + +/usr/bin/qemu -daemonize \ + -enable-kvm \ + -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control \ + -pidfile /var/run/qemu-server/101.pid \ + -smbios type=1,uuid=07fc916e-24c2-4eef-9827-4ab4026501d4 \ + -name win10 \ + -smp 6,sockets=1,cores=6,maxcpus=6 \ + -nodefaults \ + -boot menu=on,strict=on,reboot-timeout=1000 \ + -vga std \ + -vnc unix:/var/run/qemu-server/101.vnc \ + -no-hpet \ + -cpu kvm64,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce \ + -m 2048 \ + -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f \ + -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e \ + -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 \ + -device usb-tablet,id=tablet,bus=uhci.0,port=1 \ + -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \ + -iscsi initiator-name=iqn.1993-08.org.debian:01:1ba48d46fb8 \ + -drive if=none,id=drive-ide0,media=cdrom,aio=threads \ + -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200 \ + -device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 \ + -drive file=/mnt/pve/data1/images/101/vm-101-disk-1.qcow2,if=none,id=drive-scsi0,cache=writeback,discard=on,format=qcow2,aio=threads,detect-zeroes=unmap \ + -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100 \ + -netdev type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \ + -device 
virtio-net-pci,mac=F2:2B:20:37:E6:D7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 \ + -rtc driftfix=slew,base=localtime \ + -global kvm-pit.lost_tick_policy=discard + +I'm not sure it's virtio - I've got similar cases which happen even without any virtio; for me it goes away if you enable hpet or switch you kvm-put.lost_tick_policy=delay. + +Dave + +As the virtio related parts aren't the ones hanging (network and disks +still work...) it's unlikely, but it makes a night and day difference. + +Removing -no-hpet as suggested does seem to make a difference, too. +(Changing the tick policy doesn't, for me.) +However, I've found that there are various options which when changed +can massively influence the likelihood of hangs - but it's not always +the same options for all VMs. +With the difference being hangups after 1 to at most 2 migrations with +one setting, or the VMs still being alive and kicking after 20 and +more migrations with the other. +However the options I've tested appear to be unrelated. Eq. in my test +setups this happened with VNC settings, CPU types, toggling our +backend's ssh tunnel for encryption (which should cause nothing but +changes in timing from the perspective of qemu); and of course +replacing virtio devices always had this effect in my tests. +All this might point to some kind of race condition or time keeping +problem, but I can't seem to pinpoint it. + +Enabling hpet isn't a good option btw., since #599958 [Timedrift +problems with Win7: hpet missing time drift fixups] appears to +still be an open issue. => https://bugs.launchpad.net/qemu/+bug/599958 +(This entry is from 2010 :-( ) + +I can reproduce this bug also on Ubuntu 16.04 with libvirt. +The interesting thing is that this bug triggers faster, +if I use tunneled migration instead direct. +Using the virt-manager for migration. + +The test VM is a Win10 with virtio driver from fedora 0.1.118. + +<!-- +WARNING: THIS IS AN AUTO-GENERATED FILE. 
CHANGES TO IT ARE LIKELY TO BE +OVERWRITTEN AND LOST. Changes to this xml configuration should be made using: + virsh edit win10-1 +or other application using the libvirt API. +--> + +<domain type='kvm'> + <name>win10-1</name> + <uuid>4b3533c1-20d4-4556-9d99-4fb3d04b19dc</uuid> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static'>6</vcpu> + <os> + <type arch='x86_64' machine='pc-i440fx-wily'>hvm</type> + </os> + <features> + <acpi/> + <apic/> + <hyperv> + <relaxed state='on'/> + <vapic state='on'/> + <spinlocks state='on' retries='8191'/> + </hyperv> + </features> + <cpu mode='custom' match='exact'> + <model fallback='allow'>Haswell-noTSX</model> + <topology sockets='1' cores='6' threads='1'/> + </cpu> + <clock offset='localtime'> + <timer name='rtc' tickpolicy='catchup'/> + <timer name='pit' tickpolicy='delay'/> + <timer name='hpet' present='no'/> + <timer name='hypervclock' present='yes'/> + </clock> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <pm> + <suspend-to-mem enabled='no'/> + <suspend-to-disk enabled='no'/> + </pm> + <devices> + <emulator>/usr/bin/kvm-spice</emulator> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2' cache='none' io='threads'/> + <source file='/mnt/traini3/vm-win10-1.qcow2'/> + <target dev='sda' bus='scsi'/> + <boot order='1'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='qemu' type='raw'/> + <source file='/mnt/nasi/template/iso/Win10_EnglishInternational_x64.iso'/> + <target dev='hdb' bus='ide'/> + <readonly/> + <boot order='2'/> + <address type='drive' controller='0' bus='0' target='0' unit='1'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='qemu' type='raw' cache='none'/> + <source file='/mnt/nasi/template/iso/virtio-win-0.1.118.iso'/> + <target dev='hdc' bus='ide'/> + <readonly/> + <boot 
order='3'/> + <address type='drive' controller='0' bus='1' target='0' unit='0'/> + </disk> + <controller type='usb' index='0' model='ich9-ehci1'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci1'> + <master startport='0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci2'> + <master startport='2'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/> + </controller> + <controller type='usb' index='0' model='ich9-uhci3'> + <master startport='4'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </controller> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <interface type='bridge'> + <mac address='52:54:00:2e:4f:ea'/> + <source bridge='vmbr0'/> + <model type='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <serial type='pty'> + <target port='0'/> + </serial> + <console type='pty'> + <target type='serial' port='0'/> + </console> + <input type='tablet' bus='usb'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'> + <listen type='address' address='0.0.0.0'/> + </graphics> + <video> + <model type='cirrus' vram='16384' heads='1'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </video> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </memballoon> + </devices> +</domain> + + +When I set 
-rtc clock=vm the mig problem is solved, but what does this flag exactly?
+
+In the docu there is only
+
+" If you want to isolate the guest time from the host, you can set clock to "rt" instead.
+  To even prevent it from progressing during suspension, you can set it to "vm"."
+
+what does this means?
+
+Will the VM never be synced with the HW clock and so it will run slower on cpu load on normal running?
+
+Hmm, I'd not tried that one; I don't think that should change the behaviour during normal running, but the behaviour on pause and interactions with things like host ntp clock syncing is probably different - how different I'd have to dig in a bit more.
+
+However, we've done two patches this week that help windows migration - I'd be interested if either of them help your case;
+
+https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg02658.html is a qemu fix (now in current head qemu) that I wrote that helps one windows migration test case.
+
+https://lkml.org/lkml/2016/9/14/857 is a kernel fix that fixes some related problems.
+
+If one or both of these fixes together help I'd love to know either way!
+
+Dave
+
+Thank I test the 2 patches and they worked for me.
+It works also if you apply only the qemu patch,
+in combination the ubuntu kernel 4.4.0-38.57 and qemu 2.6.1.
+
+Excellent news; thanks for testing!
+
+Hi WOLI,
+  Note, if you pick up a new (4.8 ish) kernel you'll probably find you'll need to also pick up two patches that we've just posted to the qemu list:
+
+  target-i386: introduce kvm_put_one_msr
+  kvm: apic: set APIC base as part of kvm_apic_put
+
+otherwise you get weird reboot hangs with Linux guests.
+
+Dave
+
+The patches should be part of QEMU v2.8 ==> Fix released
+
diff --git a/results/classifier/105/vnc/1637447 b/results/classifier/105/vnc/1637447
new file mode 100644
index 000000000..fd319f731
--- /dev/null
+++ b/results/classifier/105/vnc/1637447
@@ -0,0 +1,30 @@
+vnc: 0.903
+network: 0.884
+device: 0.865
+socket: 0.784
+instruction: 0.769
+semantic: 0.673
+graphic: 0.616
+boot: 0.494
+mistranslation: 0.409
+other: 0.319
+KVM: 0.295
+assembly: 0.137
+
+VNC/RFB: QEMU reports incorrect name (length)
+
+If the name of a machine (as set with the -name argument) has a length longer than 1024, (RFB) VNC clients will not receive a correct RFB ServerInit message.
+
+I suspect this is the problem:
+
+https://github.com/qemu/qemu/blob/master/ui/vnc.c#L2463
+
+The return value of snprintf is used as the value for the name-length field in the ServerInit message.
+This is problematic for names that were truncated to 1024, as the length will now be bigger than the actual name.
+
+I think a quick fix would be to simply report min(size,1024) to the client...
+
+The right fix here is to switch to use g_strdup_printf and avoid a fixed length stack buffer entirely.
+
+Fix has been committed: http://git.qemu.org/?p=qemu.git;a=commitdiff;h=97efe4f961dcf5a0126
+
diff --git a/results/classifier/105/vnc/1649236 b/results/classifier/105/vnc/1649236
new file mode 100644
index 000000000..e037e4661
--- /dev/null
+++ b/results/classifier/105/vnc/1649236
@@ -0,0 +1,40 @@
+vnc: 0.746
+device: 0.728
+semantic: 0.653
+KVM: 0.633
+other: 0.632
+network: 0.616
+socket: 0.613
+mistranslation: 0.604
+graphic: 0.546
+instruction: 0.467
+boot: 0.389
+assembly: 0.310
+
+Commit snapshot fails with Permission denied when daemonized
+
+When running qemu with daemonize option it is impossible to run "commit all" in monitor.
+ +I run qemu 2.7.0 under gentoo 64 bit distribution with following command line: + +qemu-system-x86_64 -m 4096 -cpu host -smp 2 -enable-kvm -snapshot \ + -drive file=vm.img,if=virtio \ + -net nic,model=virtio,macaddr=11:11:11:11:11:11 \ + -net tap,ifname=tap$PORT,script=no,downscript=no \ + -vnc :1 -daemonize \ + -chardev vc,id=mon0 -mon chardev=mon0,mode=readline \ + -chardev socket,id=mon1,host=localhost,port=10001,server,nowait -mon chardev=mon1,mode=control + +I connect to vm through VNC viewer and press CTRL+ALT+2 and run "commit all" command. +Following error is shown: +`commit` error for `all`: Permission denied + +When I run my VM without `daemonize` option the command "commit all" works without errors. + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting all older bugs to +"Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1661176 b/results/classifier/105/vnc/1661176 new file mode 100644 index 000000000..ee28ea4dc --- /dev/null +++ b/results/classifier/105/vnc/1661176 @@ -0,0 +1,22 @@ +vnc: 0.961 +mistranslation: 0.900 +network: 0.861 +device: 0.814 +graphic: 0.761 +other: 0.700 +semantic: 0.633 +instruction: 0.575 +socket: 0.401 +assembly: 0.284 +boot: 0.201 +KVM: 0.091 + +[2.8.0] Under VNC CTRL-ALT-2 doesn't get to the monitor + +With version 2.6.2 I could access the monitor via VNC by pressing CTRL-ALT-2, while CTRL-ALT-3 brought me to the "serial0 console". CTRL-ALT-1 brought me back to the VGA console. +Since 2.8.0 CTRL-ALT-2 brings me to the "serial0 console" and CTRL-ALT-3 to the "parallel0 console". 
The monitor is not available any more, to any CTRL-ALT-1stROW. + +Can you still reproduce your issue with the latest version of QEMU (currently v4.2.0)? Please also always state how you launched QEMU (i.e. the command line parameters). + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1661815 b/results/classifier/105/vnc/1661815 new file mode 100644 index 000000000..90d3c71b3 --- /dev/null +++ b/results/classifier/105/vnc/1661815 @@ -0,0 +1,56 @@ +vnc: 0.749 +graphic: 0.734 +device: 0.725 +other: 0.708 +network: 0.677 +instruction: 0.595 +socket: 0.557 +semantic: 0.542 +KVM: 0.451 +assembly: 0.446 +boot: 0.368 +mistranslation: 0.251 + +Stack address is returned from function translate_one + +The vulnerable version is qemu-2.8.0, and the vulnerable function is in "target-s390x/translate.c". + +The code snippet is as follows. + +static ExitStatus translate_one(CPUS390XState *env, DisasContext *s) +{ + const DisasInsn *insn; + ExitStatus ret = NO_EXIT; + DisasFields f; + ... + s->fields = &f; + ... + s->pc = s->next_pc; + return ret; +} + +A stack address, i.e. the address of the local variable "f", is returned from the current function through the output parameter "s->fields" as a side effect. + +This is a kind of undefined behavior according to the C Standard, 6.2.4 [ISO/IEC 9899:2011] (https://www.securecoding.cert.org/confluence/display/c/DCL30-C.+Declare+objects+with+appropriate+storage+durations) + +This dangerous defect may lead to an exploitable vulnerability. +We suggest sanitizing "s->fields" as null before return. + +Note that this issue is reported by shqking and Zhenwei Zou together. + +The calling function never uses "->fields", so I do not see a real vulnerability here, is there? Did you use a code analyser for this, or how did you come across this issue? + +Thanks for your reply. 
+ +Inspired by this issue in apache httpd (https://bz.apache.org/bugzilla/show_bug.cgi?id=59844#c0), +we customized a checker based on the Clang Static Analyzer to detect such undefined behavior. + +Yes. +After examining the code carefully, we didn't find any place where the "->fields" is accessed, either. However, we think this kind of defect seems like a 'time bomb' and we'd better fix it just to be on the safe side. + +I've finally posted a patch for this: +https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg05204.html + +Fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=344a7f656e8d211cdd6e + diff --git a/results/classifier/105/vnc/1673 b/results/classifier/105/vnc/1673 new file mode 100644 index 000000000..ee26921cf --- /dev/null +++ b/results/classifier/105/vnc/1673 @@ -0,0 +1,62 @@ +vnc: 0.873 +KVM: 0.859 +device: 0.848 +socket: 0.848 +graphic: 0.836 +semantic: 0.826 +instruction: 0.821 +boot: 0.808 +network: 0.804 +assembly: 0.760 +other: 0.719 +mistranslation: 0.712 + +compilation of 8.0.0 FAILED: target/hexagon/idef-generated-emitter.indented.c on ubuntu 18.04 +Description of problem: +Cannot compile on ubuntu 18.04. +Steps to reproduce: +1. get 8.0.0 tarball or git clone/submodule... on a ubuntu 18.04 system (with a few more recent tools in ~/opt, such as python 3.9) +2. ./configure --prefix=$HOME/opt && make +3. It finishes with this strange error: FAILED: target/hexagon/idef-generated-emitter.indented.c +``` +... 
+[850/10154] Compiling C object target/hexagon/idef-parser.p/meson-generated_idef-parser.yy.c.o +[851/10154] Compiling C object target/hexagon/idef-parser.p/meson-generated_idef-parser.tab.c.o +[852/10154] Compiling C object target/hexagon/idef-parser.p/_home_pbourguignon_opt_src_qemu-8.0.0_target_hexagon_idef-parser_parser-helpers.c.o +[853/10154] Linking target target/hexagon/idef-parser +[854/10154] Generating target/hexagon/idef-generated-tcg with a custom command +[855/10154] Generating target/hexagon/indent with a custom command +FAILED: target/hexagon/idef-generated-emitter.indented.c +/home/pbourguignon/bin/indent -linux target/hexagon/idef-generated-emitter.c -o target/hexagon/idef-generated-emitter.indented.c +Indenting region... +Indenting region... done +Directory `/home/pbourguignon/opt/src/qemu-8.0.0/build/-linux target/hexagon/idef-generated-emitter.c -o target/hexagon/' does not exist; create? (y or n) Error reading from stdin +ninja: build stopped: subcommand failed. +Makefile:165: recipe for target 'run-ninja' failed +make[1]: *** [run-ninja] Error 1 +make[1]: Leaving directory '/home/pbourguignon/opt/src/qemu-8.0.0/build' +GNUmakefile:10: recipe for target 'all' failed +make: *** [all] Error 2 +``` +Additional information: +https://dpaste.org/Hr9Zq +``` +~/opt/src/qemu-git +16:15[pbourguignon@frprld7818008 :0.0 qemu-git ]$ ls ~/opt/bin +./ ecl-config* pydoc3@ run-avr* run-microblaze* +../ emacs@ pydoc3.9* run-bfin* run-mips* +2to3@ emacs-27.2* python@ run-bpf* run-mn10300* +2to3-3.9* emacsclient* python3@ run-cr16* run-moxie* +bundle* erb* python3-config@ run-cris* run-msp430* +bundler* etags* python3.9* run-d10v* run-or1k* +ccl* gcore* python3.9-config* run-erc32* run-ppc* +ccmake* gdb* racc* run-frv* run-pru* +cmake* gdb-add-index* rake* run-ft32* run-riscv* +cpack* gdbserver* rbs* run-h8300* run-rl78* +ctags* gem* rdbg* run-iq2000* run-rx* +ctest* idle3@ rdoc* run-lm32* run-sh* +curl* idle3.9* ri* run-m32c* run-v850* +curl-config* irb* ruby* 
run-m32r* sbcl* +ebrowse* pip3* run-aarch64* run-m68hc11* sis* +ecl* pip3.9* run-arm* run-mcore* typeprof* +``` diff --git a/results/classifier/105/vnc/1686 b/results/classifier/105/vnc/1686 new file mode 100644 index 000000000..b113d0805 --- /dev/null +++ b/results/classifier/105/vnc/1686 @@ -0,0 +1,56 @@ +vnc: 0.986 +boot: 0.961 +instruction: 0.933 +device: 0.930 +graphic: 0.904 +network: 0.837 +KVM: 0.818 +socket: 0.699 +semantic: 0.619 +mistranslation: 0.453 +assembly: 0.320 +other: 0.263 + +VPS does not boots with CPU Model QEMU64 or KVM64 +Description of problem: + +Steps to reproduce: +1. Boot the VPS using AlmaLinux 9 ISO / image and it boots to kernel panic +Additional information: +VNC shows this message : + +[ 1.749935] do_exit.cold+0x14/0x9f + +[1.7502581 do_group_exit+0x33/0xa0 + +1.7506001 _x64_sys_exit_group+0x14/0x20 + +1.7510081 do_syscall 64+0x5c/0x90 + +[1.751361] ? syscall_exit_to_user_mode+0x12/0x30 + +[1.7517911 ? do_syscall_64+0x69/0x90 + +[1.752131] ? do_user_addr_fault+0x1d8/0x698 + +[1.7525091 ? 
exc_page_fault+0x62/0x150 1.752896] entry_SYSCALL_64_after_hwframe+ +0x63/0xcd + +[1.753612] RIP: 0033:0x7fb0e95b62d1 + +[ 1.7539561 Code: c3 of 1f 84 00 00 00 00 00 f3 Of le fa be e7 00 00 00 ba 3c 00 00 00 eb Od 89 de Of 05 48 3d 00 fe ff ff 77 1c f4 89 fe of 05 <48> 3d 00 fe ff ff 76 e7 f7 d8 89 05 ff fe 00 00 eb dd of 1f 44 00 + +[ 1.755047] RSP: 002b:00007ffe484df 288 EFLAGS: 00000246 ORIG_RAX: 00000000000 + +000e7 + +[ 1.755590] RAX: fffff ffffda RBX: 00007fb0e95b0f30 RCX: 00007fb0e95b62d1 1.756100] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 000000000000007f + +[1.756565] RBP: 00007ffe484df410 R08: 00007ffe484dedf9 R09: 0000000000000000 + +[ 1.757034] R10: 00000000ffffffff R11: 0000000000000246 R12: 00007fb0e958f000 + +[ 1.7574981 R13: 0000002300000007 R14: 0000000000000007 R15: 00007ffe484df420 + +[ 1.7579921 Kernel Offset: 0x3aa00000 from Oxffffffff81000000 (relocation ran ge: 0xffffffff80000000-0xffffffffbfffffff) + +[ 1.7589051---[ end Kernel panic code=0x00007f00 --- not syncing: Attempted to kill init! exit diff --git a/results/classifier/105/vnc/1693649 b/results/classifier/105/vnc/1693649 new file mode 100644 index 000000000..6f14e6fa9 --- /dev/null +++ b/results/classifier/105/vnc/1693649 @@ -0,0 +1,103 @@ +vnc: 0.915 +other: 0.896 +KVM: 0.880 +assembly: 0.873 +device: 0.848 +semantic: 0.844 +mistranslation: 0.839 +graphic: 0.838 +boot: 0.821 +network: 0.808 +instruction: 0.803 +socket: 0.772 + +x86 pause misbehaves with -cpu haswell + +Using qemu-2.9.0 + +When booting NetBSD using '-cpu haswell -smp 4', the system fails to initialize the additional CPUs. It appears as though the "application processor" enters routine x86_pause() but never returns. + +x86_pause() is simply two assembler instructions: 'pause; ret;' + +Replacing the routine with 'nop; nop; ret;' allows the system to proceed, of course without the benefit of the pause instruction on spin-loops! 
+ +Additionally, booting with '-cpu phenom -smp 4' also works, although the system does seem confused about the type of CPU being used. + +Further investigation shows that pause may be working, but very very slowly. + +The "use-case" in NetBSD is for "hatching" application CPUs. The target CPU runs a loop that does + + while (flag_1 not set) + for (i = 0; i < 10000; i++) + x86_pause(); /* which is assembly code: "pause; ret;" */ + ... + set flag_2; + return; + +The boot CPU executes the following code for each application CPU: + + set flag_1; + for (i = 0; i < 100000 && flag_2 not set; i++) + i8254_delay(10); /* this should be 10usec per iteration, 1.0 sec total */ + if (flag_2 not set) + panic(cpu did not hatch); + .... + +So, the 10k executions of x86_pause are taking in excess of 1 sec to complete. Indeed, reducing that constant value from 10k to only 100 results in the same failure. So it would seem that the pause instruction is taking an extremely long time to complete. (Further reducing the interation count to only 50 results in success, although it "feels" very sluggish.) + + +Can you still reproduce this issue with the latest version of QEMU (currently 5.0)? + +Seems ok now. + +On Fri, 22 May 2020, Thomas Huth wrote: + +> Can you still reproduce this issue with the latest version of QEMU +> (currently 5.0)? +> +> ** Changed in: qemu +> Status: New => Incomplete +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. +> https://bugs.launchpad.net/bugs/1693649 +> +> Title: +> x86 pause misbehaves with -cpu haswell +> +> Status in QEMU: +> Incomplete +> +> Bug description: +> Using qemu-2.9.0 +> +> When booting NetBSD using '-cpu haswell -smp 4', the system fails to +> initialize the additional CPUs. It appears as though the "application +> processor" enters routine x86_pause() but never returns. 
+> +> x86_pause() is simply two assembler instructions: 'pause; ret;' +> +> Replacing the routine with 'nop; nop; ret;' allows the system to +> proceed, of course without the benefit of the pause instruction on +> spin-loops! +> +> Additionally, booting with '-cpu phenom -smp 4' also works, although +> the system does seem confused about the type of CPU being used. +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1693649/+subscriptions +> +> !DSPAM:5ec7625658281532840571! +> +> + ++--------------------+--------------------------+-----------------------+ +| Paul Goyette | PGP Key fingerprint: | E-mail addresses: | +| (Retired) | FA29 0E3B 35AF E8AE 6651 | <email address hidden> | +| Software Developer | 0786 F758 55DE 53BA 7731 | <email address hidden> | ++--------------------+--------------------------+-----------------------+ + + +Ok, thanks for checking again! So I'm closing this ticket now. + diff --git a/results/classifier/105/vnc/1696353 b/results/classifier/105/vnc/1696353 new file mode 100644 index 000000000..36915f13f --- /dev/null +++ b/results/classifier/105/vnc/1696353 @@ -0,0 +1,114 @@ +semantic: 0.961 +vnc: 0.955 +other: 0.944 +device: 0.942 +graphic: 0.934 +KVM: 0.899 +assembly: 0.892 +instruction: 0.888 +mistranslation: 0.869 +socket: 0.851 +network: 0.851 +boot: 0.835 + +golang binaries fail to start under linux-user + +With current master golang binaries fail when run under linux-user, for example: + +[will@localhost qemu]$ ./arm-linux-user/qemu-arm glide +runtime: failed to create new OS thread (have 2 already; errno=22) +fatal error: newosproc + +runtime stack: +runtime.throw(0x45f879, 0x9) + /usr/lib/golang/src/runtime/panic.go:566 +0x78 +runtime.newosproc(0x1092c000, 0x1093bfe0) + /usr/lib/golang/src/runtime/os_linux.go:160 +0x1b0 +runtime.newm(0x4ae1e8, 0x0) + /usr/lib/golang/src/runtime/proc.go:1572 +0x12c +runtime.main.func1() + /usr/lib/golang/src/runtime/proc.go:126 +0x24 
+runtime.systemstack(0x5ef900) + /usr/lib/golang/src/runtime/asm_arm.s:247 +0x80 +runtime.mstart() + /usr/lib/golang/src/runtime/proc.go:1079 + +goroutine 1 [running]: +runtime.systemstack_switch() + /usr/lib/golang/src/runtime/asm_arm.s:192 +0x4 fp=0x109287ac sp=0x109287a8 +runtime.main() + /usr/lib/golang/src/runtime/proc.go:127 +0x5c fp=0x109287d4 sp=0x109287ac +runtime.goexit() + /usr/lib/golang/src/runtime/asm_arm.s:998 +0x4 fp=0x109287d4 sp=0x109287d4 + +The reason for this is that the golang runtime does not pass the CLONE_SYSVSEM flag to clone, so the clone flags check fails: + +https://github.com/golang/go/blob/master/src/runtime/os_linux.go#L155 + +The attached patch allows golang binaries to start under linux-user. + + + +The problem with doing that is that it doesn't actually change the behaviour. We use pthread_create to create the new thread, which glibc does with a clone with CLONE_SYSVSEM set. We can't tell the difference between "guest program needs the new threads to not share SysV semaphore behaviour" and "guest program doesn't care but didn't provide the flag" so we err on the side of caution and refuse to create a thread that doesn't behave the way the guest asked us for it to behave. + + +True, but it used to work albeit with slightly wrong semantics. It now fails hard even though the golang runtime doesn't make any use of Sys V semaphores so the presence of the flag is not noticeable by any normal user. + +You can also apply this patch to go - I don't have an opinion on the correct course of action though! 
+ +diff --git a/src/runtime/os_linux.go b/src/runtime/os_linux.go +index a6efc0e3d1..64218e3f7e 100644 +--- a/src/runtime/os_linux.go ++++ b/src/runtime/os_linux.go +@@ -132,7 +132,8 @@ const ( + _CLONE_FS | /* share cwd, etc */ + _CLONE_FILES | /* share fd table */ + _CLONE_SIGHAND | /* share sig handler table */ +- _CLONE_THREAD /* revisit - okay for now */ ++ _CLONE_THREAD | /* revisit - okay for now */ ++ _CLONE_SYSVSEM + ) + + //go:noescape + + +Note that there is a go bug about this issue too: https://github.com/golang/go/issues/20763 + +The go team have applied the above patch and I can confirm that it is now working properly using go-tip. This means it will be fixed in go 1.10. + +So if you recompile your go binary with go-tip or go 1.10 when it is released, it will run fine under qemu-system-arm. + +Since this has been fixed/worked around on the go side (thanks for following up with that!) I'm going to close this as "wontfix" on QEMU's side. It would be nice to support clone() with non-standard flags but since we can't do that while we still link with libc there's no way we can do this without a massive (and massively painful!) redesign to remove our libc dependency so that all of QEMU's linux-user code runs bare on the kernel. + + +I just gave it a test with qemu-arm and Go binaries still crash for me, albeit differently: + +root@nofan:/# cat hello.go +package main + +import "fmt" + +func main() { + fmt.Println("hello world") +} +root@nofan:/# gccgo-7 hello.go -o hello +root@nofan:/# ./hello +mmap errno 9 +fatal error: mmap + +runtime stack: +mmap errno 9 +fatal error: mmap +panic during panic + +runtime stack: +mmap errno 9 +fatal error: mmap +stack trace unavailable +root@nofan:/# + +Should I file a new bug report? + +Yes, new bug please, that's definitely a different symptom and likely an unrelated issue. 
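The flag check discussed in this thread can be sketched roughly as follows. This is an illustration under assumptions, not QEMU's linux-user source: the `SK_`-prefixed macros, `THREAD_FLAGS_SKETCH` and `check_thread_clone_flags` are hypothetical names, and the flag values are copied from the Linux UAPI headers so the sketch builds on its own. The idea is that a thread-style clone() request is accepted only when it carries the whole set of sharing flags, including CLONE_SYSVSEM, which is consistent with the errno=22 (EINVAL) reported by the Go runtime before its flags were changed.

```c
#include <errno.h>

/* Flag values as defined by the Linux UAPI headers (linux/sched.h).
 * The SK_ prefix marks them as local to this sketch. */
#define SK_CLONE_VM       0x00000100UL
#define SK_CLONE_FS       0x00000200UL
#define SK_CLONE_FILES    0x00000400UL
#define SK_CLONE_SIGHAND  0x00000800UL
#define SK_CLONE_THREAD   0x00010000UL
#define SK_CLONE_SYSVSEM  0x00040000UL

/* Hypothetical name: the set of flags a thread-creating clone() must carry. */
#define THREAD_FLAGS_SKETCH (SK_CLONE_VM | SK_CLONE_FS | SK_CLONE_FILES | \
                             SK_CLONE_SIGHAND | SK_CLONE_THREAD | SK_CLONE_SYSVSEM)

/* Hypothetical helper: returns 0 when every required flag is present,
 * -EINVAL when at least one of them is missing. */
static int check_thread_clone_flags(unsigned long flags)
{
    if ((flags & THREAD_FLAGS_SKETCH) != THREAD_FLAGS_SKETCH) {
        return -EINVAL;
    }
    return 0;
}
```

With the flag set used by the Go runtime before the patch quoted above (no CLONE_SYSVSEM), such a check returns -EINVAL; adding CLONE_SYSVSEM makes it pass.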
+ + diff --git a/results/classifier/105/vnc/170 b/results/classifier/105/vnc/170 new file mode 100644 index 000000000..ae615907d --- /dev/null +++ b/results/classifier/105/vnc/170 @@ -0,0 +1,14 @@ +vnc: 0.867 +device: 0.813 +network: 0.810 +semantic: 0.390 +socket: 0.188 +graphic: 0.188 +boot: 0.176 +other: 0.135 +assembly: 0.114 +mistranslation: 0.104 +instruction: 0.073 +KVM: 0.002 + +Request to add something like "Auth failed from IP" log report for built-in VNC server diff --git a/results/classifier/105/vnc/1705717 b/results/classifier/105/vnc/1705717 new file mode 100644 index 000000000..bf87e4400 --- /dev/null +++ b/results/classifier/105/vnc/1705717 @@ -0,0 +1,76 @@ +vnc: 0.718 +mistranslation: 0.678 +graphic: 0.593 +assembly: 0.587 +KVM: 0.583 +network: 0.574 +semantic: 0.567 +device: 0.558 +instruction: 0.550 +other: 0.529 +socket: 0.508 +boot: 0.411 + +Live migration fails with 'host' cpu when KVM is inserted with nested=1 + +Qemu v2.9.0 +Linux kernel 4.9.34 + +Live migration(pre-copy) being done from one physical host to another: + +Source Qemu: +sudo qemu-system-x86_64 -drive file=${IMAGE_DIR}/${IMAGE_NAME},if=virtio -m 2048 -smp 1 -net nic,model=virtio,macaddr=${MAC} -net tap,ifname=qtap0,script=no,downscript=no -vnc :1 --enable-kvm -cpu kvm64 -qmp tcp:*:4242,server,nowait + +And KVM is inserted with nested=1 on both source and destination machine. 
+ +Migration fails with a nested specific assertion failure on destination at target/i386/kvm.c +1629 + +Migration is successful in the following cases- + +A) cpu model is 'host' and kvm is inserted without nested=1 parameter +B) If instead of 'host' cpu model, 'kvm64' is used (KVM nested=1) +C) If instead of 'host' cpu model, 'kvm64' is used (KVM nested=0) +D) Between an L0 and a guest Hypervisor L1, with 'kvm64' as CPU type (and nested=1 for L0 KVM) + +Physical host(s)- +$ lscpu +Architecture: x86_64 +CPU op-mode(s): 32-bit, 64-bit +Byte Order: Little Endian +CPU(s): 12 +On-line CPU(s) list: 0-11 +Thread(s) per core: 1 +Core(s) per socket: 6 +Socket(s): 2 +NUMA node(s): 2 +Vendor ID: GenuineIntel +CPU family: 6 +Model: 62 +Model name: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz +Stepping: 4 +CPU MHz: 1200.091 +CPU max MHz: 2600.0000 +CPU min MHz: 1200.0000 +BogoMIPS: 4203.28 +Virtualization: VT-x +L1d cache: 32K +L1i cache: 32K +L2 cache: 256K +L3 cache: 15360K +NUMA node0 CPU(s): 0-5 +NUMA node1 CPU(s): 6-11 +Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts + +Hi, + Can you please give the exact assertion failure. + +However, I'm confused - I think you're saying that your setup is that both hosts have nested enabled, but this is a migration of top level VM - correct? Does the top level VM have a guest inside it - migration with a nested guest is known not to work, however migration of a VM on a host with nested enabled should work if the guest doesn't use the nest. 
+ + +Hello, +I could not replicate this behavior on another system. +So, please close this bug. +Apologies for the inconvenience. + +Hmm OK; but if you do hit it again please just reopen this one and give the full assert and details + diff --git a/results/classifier/105/vnc/1715186 b/results/classifier/105/vnc/1715186 new file mode 100644 index 000000000..8dda880ad --- /dev/null +++ b/results/classifier/105/vnc/1715186 @@ -0,0 +1,65 @@ +vnc: 0.970 +socket: 0.959 +network: 0.826 +other: 0.822 +device: 0.760 +graphic: 0.752 +semantic: 0.732 +KVM: 0.673 +instruction: 0.555 +boot: 0.537 +assembly: 0.398 +mistranslation: 0.376 + +websockets: Improve error messages + +Since 2.9 / 07e95cd529af345fdeea230913f68eff5b925bb6 , whenever the VNC websocket server finds an error with the incoming connection request, it just closes the socket with no further information. + +This makes figuring out what's wrong with the request nearly impossible. + +I would be nice if: + +* HTTP 400 were returned to the client, with an appropriate error message +* Maybe something written to the log as well? + +Currently, I'm resorting to looking at my request and the websocket source and hoping I can figure out what's wrong. + +At very least we should also use 404 if given a invalid path + +Will be included for 2.11 in + +commit 3a3f8705962c8c8a47a9b981ffd5aab7274ad508 +Author: Daniel P. Berrange <email address hidden> +Date: Wed Sep 6 11:38:36 2017 +0100 + + io: include full error message in websocket handshake trace + + When the websocket handshake fails it is useful to log the real + error message via the trace points for debugging purposes. + + Fixes bug: #1715186 + + Reviewed-by: Philippe Mathieu-Daudé <email address hidden> + Signed-off-by: Daniel P. Berrange <email address hidden> + +commit f69a8bde29354493ff8aea64cc9cb3b531d16337 +Author: Daniel P. 
Berrange <email address hidden> +Date: Wed Sep 6 11:33:17 2017 +0100 + + io: send proper HTTP response for websocket errors + + When any error occurs while processing the websockets handshake, + QEMU just terminates the connection abruptly. This is in violation + of the HTTP specs and does not help the client understand what they + did wrong. This is particularly bad when the client gives the wrong + path, as a "404 Not Found" would be very helpful. + + Refactor the handshake code so that it always sends a response to + the client unless there was an I/O error. + + Fixes bug: #1715186 + + Reviewed-by: Philippe Mathieu-Daudé <email address hidden> + Signed-off-by: Daniel P. Berrange <email address hidden> + + diff --git a/results/classifier/105/vnc/1721221 b/results/classifier/105/vnc/1721221 new file mode 100644 index 000000000..487fa9d75 --- /dev/null +++ b/results/classifier/105/vnc/1721221 @@ -0,0 +1,104 @@ +vnc: 0.703 +other: 0.701 +KVM: 0.698 +boot: 0.665 +mistranslation: 0.659 +device: 0.656 +instruction: 0.651 +semantic: 0.637 +assembly: 0.637 +socket: 0.635 +graphic: 0.620 +network: 0.605 + +PCI-E passthrough of Nvidia GTX GFX card to Win 10 guest fails with "kvm_set_phys_mem: error registering slot: Invalid argument" + +Problem: +Passthrough of a PCI-E Nvidia GTX 970 GFX card to a Windows 10 guest from a Debian Stretch host fails after recent changes to kvm in QEMU master/trunk. Before this recent commit, everything worked as expected. 
+ +QEMU Version: +Master/trunk pulled from github 4/10/17 ( git reflog: d147f7e815 HEAD@{0} ) + +Host: +Debian Stretch kernel SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux + +Guest: +Windows 10 Professional + +Issue is with this commit: +https://github.com/qemu/qemu/commit/f357f564be0bd45245b3ccfbbe20ace08fe83ca8 + +Subsequent commit does not help: +https://github.com/qemu/qemu/commit/3110cdbd8a4845c5b5fb861b0a664c56d993dd3c#diff-7b7a17f6e8ba4195198dd685073f43cb + +Error output from qemu: +(qemu) kvm_set_phys_mem: error registering slot: Invalid argument + +QEMU commandline used: + +./sources/qemu/x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm -serial none -parallel none -name Windows \ +-enable-kvm -cpu host,kvm=off,hv_vendor_id=sugoidesu,-hypervisor -smp 6,sockets=1,cores=3,threads=2 \ +-m 8G -mem-path /dev/hugepages -mem-prealloc -balloon none \ +-drive if=pflash,format=raw,readonly,file=vms/ovmf-x64/ovmf-x64/OVMF_CODE-pure-efi.fd \ +-drive if=pflash,format=raw,file=vms/ovmf-x64/ovmf-x64/OVMF_VARS-pure-efi.fd \ +-rtc clock=host,base=localtime \ +-readconfig ./vms/q35-virtio-graphical.cfg \ +-object iothread,id=iothread0 -object iothread,id=iothread1 -object iothread,id=iothread2 -object iothread,id=iothread3 \ +-device virtio-scsi-pci,iothread=iothread0,id=scsi0 -device virtio-scsi-pci,iothread=iothread1,id=scsi1 -device virtio-scsi-pci,iothread=iothread2,id=scsi2 -device virtio-scsi-pci,iothread=iothread3,id=scsi3 \ +-device scsi-hd,bus=scsi0.0,drive=drive0,bootindex=1 -device scsi-hd,bus=scsi1.0,drive=drive1 -device scsi-hd,bus=scsi2.0,drive=drive2 -device scsi-hd,bus=scsi3.0,drive=drive3 -device scsi-hd,bus=scsi1.0,drive=drive4 -device scsi-hd,bus=scsi2.0,drive=drive5 -device scsi-hd,bus=scsi3.0,drive=drive6 -device scsi-hd,bus=scsi1.0,drive=drive7 -device scsi-hd,bus=scsi2.0,drive=drive8 -device scsi-hd,bus=scsi3.0,drive=drive9 \ +-drive if=none,id=drive0,file=vms/w10p64.qcow2,format=qcow2,cache=none,discard=unmap \ +-drive 
if=none,id=drive1,file=vms/w10p64-2.qcow2,format=qcow2,cache=none,discard=unmap \ +-drive if=none,id=drive2,file=/dev/mapper/w10p64-3,format=raw,cache=none \ +-drive if=none,id=drive3,file=vms/w10p64-4.qcow2,format=qcow2,cache=none \ +-drive if=none,id=drive4,file=vms/w10p64-5.qcow2,format=qcow2,cache=none \ +-drive if=none,id=drive5,file=vms/w10p64-6.qcow2,format=qcow2,cache=none,discard=unmap \ +-drive if=none,id=drive6,file=/dev/mapper/w10p64-7,format=raw,cache=none \ +-drive if=none,id=drive7,file=vms/w10p64-8.qcow2,format=qcow2,cache=none,discard=unmap \ +-device vfio-pci,host=01:00.0,multifunction=on,x-vga=on \ +-device vfio-pci,host=01:00.1,multifunction=on \ +-netdev type=tap,id=net1,ifname=tap1,script=no,downscript=no,vhost=on \ +-device virtio-net-pci,netdev=net1,mac=52:54:00:18:32:c9,bus=pcie.2,addr=00.0,ioeventfd=on \ +-device usb-host,bus=usb.0,hostbus=3,hostport=2.1 \ +-device usb-host,hostbus=3,hostport=2.2 \ +-device usb-host,bus=ich9-ehci-1.0,hostbus=3,hostport=2.4 \ +-object input-linux,id=kbd1,evdev=/dev/input/event0,grab_all=yes,repeat=on \ +-drive if=none,id=drive8,file=vms/w10p64.qcow2-9,format=qcow2,discard=unmap \ +-drive if=none,id=drive9,file=vms/w10p64-10.qcow2,format=qcow2,cache=none,discard=unmap \ +-device usb-host,bus=usb.0,hostbus=3,hostport=9 \ +-monitor stdio + +lspci -tv +-[0000:00]-+-00.0 Intel Corporation 4th Gen Core Processor DRAM Controller + +-01.0-[01]--+-00.0 NVIDIA Corporation GM204 [GeForce GTX 970] + | \-00.1 NVIDIA Corporation GM204 High Definition Audio Controller + +-02.0 Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller + +-03.0 Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller + +-14.0 Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI + +-16.0 Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 + +-19.0 Intel Corporation Ethernet Connection I217-V + +-1a.0 Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI 
#2 + +-1b.0 Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller + +-1c.0-[02]-- + +-1c.3-[03]----00.0 Broadcom Limited BCM4352 802.11ac Wireless Network Adapter + +-1d.0 Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 + +-1f.0 Intel Corporation Z87 Express LPC Controller + +-1f.2 Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] + \-1f.3 Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller + + +I should probably add that I am using a recent OVMF firmware from this repo: https://www.kraxel.org/repos/jenkins/edk2/ namely edk2.git-ovmf-x64-0-20170914.b2974.g7f2f96f1a8.noarch.rpm dated 18/9/17 + +There's another bug report / discussion thread on qemu-devel about the same commit: + +http://<email address hidden> + +I'll add a note to that thread about this LP report too. + +Apologies, the title of this bug might be very misleading. I've made a huge assumption that PCI-E GFX card pass-through was triggering the bug, purely because a Debian Stretch guest VM on the same host, using the same build of QEMU, which uses virtio-vga for graphics and the same version of OVMF firmware, boots up fine. + +More noise... OK, so I've tested the Windows 10 guest VM using the same criteria but eliminating PCI-E pass-through and using virtio-vga graphics instead, and the the guest boots up fine so perhaps my assumption is correct. + +Let me know if you need to see the memory I/O regions or dmesg of my host etc. + +Fix will end up in QEMU master soon. 
+ diff --git a/results/classifier/105/vnc/1732671 b/results/classifier/105/vnc/1732671 new file mode 100644 index 000000000..9b2f99a69 --- /dev/null +++ b/results/classifier/105/vnc/1732671 @@ -0,0 +1,31 @@ +vnc: 0.964 +network: 0.888 +socket: 0.871 +device: 0.785 +mistranslation: 0.768 +other: 0.703 +semantic: 0.682 +graphic: 0.678 +instruction: 0.597 +KVM: 0.378 +boot: 0.365 +assembly: 0.299 + +vnc websocket compatibility issue + +WebSocket support in VNC should allow access from a VNC client through an upgraded WebSocket connection. This feature does not work in IE 11/Edge with the noVNC HTML5 client, although it works in Firefox, Safari, etc. + +IE 11/Edge fails to complete the connection upgrade because QEMU compares the value of the `Upgrade` header field in a strictly case-sensitive manner, while IE/Edge sends the value with a capital letter `W` instead of exactly `websocket`. + +Section 4.2.1 of RFC 6455 requires the Upgrade header field value to be treated case-insensitively. + +A patch should be made in `io/channel-websock.c` that converts the value of the upgrade string to lowercase before it is compared with QIO_CHANNEL_WEBSOCK_UPGRADE_WEBSOCKET, so that the comparison becomes case-insensitive. + +Which version of QEMU did you test this against? It should be fixed in current GIT master AFAIK. + +I think it should have been fixed in 33badfd. + +Sorry for the noise. + +No problem, it is a valid bug report, since we've not actually released the fix yet, so changing status. 
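The case-insensitive comparison requested in this report can be sketched as below. This is a minimal illustration, not the code in `io/channel-websock.c` and not the committed fix: the function name `upgrade_is_websocket` is hypothetical, and POSIX `strcasecmp` is assumed here where QEMU itself may well use a different helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical helper: returns true when the value of the Upgrade header
 * names the websocket protocol, ignoring letter case as RFC 6455
 * section 4.2.1 requires. A missing header (NULL) is rejected. */
static bool upgrade_is_websocket(const char *value)
{
    if (value == NULL) {
        return false;
    }
    /* strcasecmp returns 0 when the strings match regardless of case. */
    return strcasecmp(value, "websocket") == 0;
}
```

Comparing without regard to case avoids allocating a lowercased copy of the header value, which is the alternative the reporter suggests; either approach accepts both `websocket` and a capitalised variant.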
+ diff --git a/results/classifier/105/vnc/1739371 b/results/classifier/105/vnc/1739371 new file mode 100644 index 000000000..dbc5df24d --- /dev/null +++ b/results/classifier/105/vnc/1739371 @@ -0,0 +1,208 @@ +vnc: 0.912 +mistranslation: 0.903 +KVM: 0.894 +other: 0.889 +graphic: 0.888 +device: 0.865 +instruction: 0.861 +boot: 0.856 +semantic: 0.854 +socket: 0.823 +assembly: 0.807 +network: 0.794 + +qemu-system-arm snapshot loadvm core dumped + +Ubuntu Qemu is crashing trying to restore saved snapshot in qemu-system-arm. +I've tried different guests kernels, but I wasn't lucky with any of them. + +The guest vm boots and I can use it normally. The issue is when I save the snapshot using "savevm Base0", "quit" and then I restore that snapshot using "-loadvm Base0" from the cmd line. + +The only difference I've noticed is tweaking the guest memory: +* With -m 512 or 1024 it crashes as you can see below. +* With -m 2048 it doesn't crash, it restores the vm and I can see the screen as it was, but the OS is halted. And it's not just the keyboard. I've tried saving the snapshot while it's booting with lot of lines being printed on screen and after restoring it, the OS is frozen. + +I also tried limiting the guest kernel memory using the mem parameter (mem=2048M) and disabling the kernel address space randomization (nokaslr) with the same results. 
+ +OS: Ubuntu 16.04.3 LTS (xenial) + +$ qemu-system-arm --version +QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.16), Copyright (c) 2003-2008 Fabrice Bellard + +$ qemu-system-arm -kernel kernel/vmlinuz-4.10.0-42-generic -initrd kernel/initrd.img-4.10.0-42-generic -M vexpress-a15 -m 512 -append 'root=/dev/mmcblk0 rootwait console=tty0' -sd vexpress-4G.qcow2 -dtb device-tree/vexpress-v2p-ca15-tc1.dtb -loadvm Base0 +pulseaudio: set_sink_input_volume() failed +pulseaudio: Reason: Invalid argument +pulseaudio: set_sink_input_mute() failed +pulseaudio: Reason: Invalid argument +qemu: fatal: Trying to execute code outside RAM or ROM at 0xc0321568 + +R00=00000001 R01=00000000 R02=00000000 R03=c0321560 +R04=c1500000 R05=c150529c R06=c1505234 R07=c14384d0 +R08=00000000 R09=00000000 R10=c1501f50 R11=c1501f3c +R12=c1501f40 R13=c1501f30 R14=c030a184 R15=c0321568 +PSR=60070093 -ZC- A S svc32 +s00=6374652f s01=636f6c2f d00=636f6c2f6374652f +s02=7273752f s03=6962732f d01=6962732f7273752f +s04=6e612f6e s05=6f726361 d02=6f7263616e612f6e +s06=7c7c206e s07=63202820 d03=632028207c7c206e +s08=202f2064 s09=72202626 d04=72202626202f2064 +s10=702d6e75 s11=73747261 d05=73747261702d6e75 +s12=722d2d20 s13=726f7065 d06=726f7065722d2d20 +s14=652f2074 s15=632f6374 d07=632f6374652f2074 +s16=00000000 s17=00000000 d08=0000000000000000 +s18=00000000 s19=00000000 d09=0000000000000000 +s20=00000000 s21=00000000 d10=0000000000000000 +s22=00000000 s23=00000000 d11=0000000000000000 +s24=00000000 s25=00000000 d12=0000000000000000 +s26=00000000 s27=00000000 d13=0000000000000000 +s28=00000000 s29=00000000 d14=0000000000000000 +s30=00000000 s31=00000000 d15=0000000000000000 +s32=00000000 s33=00000000 d16=0000000000000000 +s34=00000000 s35=00000000 d17=0000000000000000 +s36=00000000 s37=00000000 d18=0000000000000000 +s38=00000000 s39=00000000 d19=0000000000000000 +s40=00000000 s41=00000000 d20=0000000000000000 +s42=00000000 s43=00000000 d21=0000000000000000 +s44=00000000 s45=00000000 
d22=0000000000000000 +s46=00000000 s47=00000000 d23=0000000000000000 +s48=00000000 s49=00000000 d24=0000000000000000 +s50=00000000 s51=00000000 d25=0000000000000000 +s52=00000000 s53=00000000 d26=0000000000000000 +s54=00000000 s55=00000000 d27=0000000000000000 +s56=00000000 s57=00000000 d28=0000000000000000 +s58=00000000 s59=00000000 d29=0000000000000000 +s60=00000000 s61=00000000 d30=0000000000000000 +s62=00000000 s63=00000000 d31=0000000000000000 +FPSCR: 00000000 +Aborted (core dumped) + +As I said above, the same happens when -m 1024 is used. + +I have a different issue when I use the qemu git master version, but I'm submitting a different ticket for that. + +Cheers, +Gus + +I suspect this is the same underlying state save/restore bug as your other LP:1739378 -- it's just that since 2.5.0 we've improved the error reporting for cases where the incoming migration state doesn't match what we're expecting, so we report an error message rather than just mangling the simulation state and causing the guest to crash on restart. + + +Thanks Peter for your prompt response. + +I wonder if you have any plan to fix it, and whether there is any workaround. + +Is the bug limited to arm? Is there any main ticket where I can follow the +progress of this issue? + +On Thu., 21 Dec. 2017, 2:50 am Peter Maydell, <email address hidden> +wrote: + +> I suspect this is the same underlying state save/restore bug as your +> other LP:1739378 -- it's just that since 2.5.0 we've improved the error +> reporting for cases where the incoming migration state doesn't match +> what we're expecting, so we report an error message rather than just +> mangling the simulation state and causing the guest to crash on restart. +> +> -- +> You received this bug notification because you are subscribed to the bug +> report. 
+> https://bugs.launchpad.net/bugs/1739371 +> +> Title: +> qemu-system-arm snapshot loadvm core dumped +> +> Status in QEMU: +> New +> +> Bug description: +> Ubuntu Qemu is crashing trying to restore saved snapshot in +> qemu-system-arm. +> I've tried different guests kernels, but I wasn't lucky with any of them. +> +> The guest vm boots and I can use it normally. The issue is when I save +> the snapshot using "savevm Base0", "quit" and then I restore that +> snapshot using "-loadvm Base0" from the cmd line. +> +> The only difference I've noticed is tweaking the guest memory: +> * With -m 512 or 1024 it crashes as you can see below. +> * With -m 2048 it doesn't crash, it restores the vm and I can see the +> screen as it was, but the OS is halted. And it's not just the keyboard. +> I've tried saving the snapshot while it's booting with lot of lines being +> printed on screen and after restoring it, the OS is frozen. +> +> I also tried limiting the guest kernel memory using the mem parameter +> (mem=2048M) and disabling the kernel address space randomization +> (nokaslr) with the same results. 
+> +> OS: Ubuntu 16.04.3 LTS (xenial) +> +> $ qemu-system-arm --version +> QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.16), Copyright +> (c) 2003-2008 Fabrice Bellard +> +> $ qemu-system-arm -kernel kernel/vmlinuz-4.10.0-42-generic -initrd +> kernel/initrd.img-4.10.0-42-generic -M vexpress-a15 -m 512 -append +> 'root=/dev/mmcblk0 rootwait console=tty0' -sd vexpress-4G.qcow2 -dtb +> device-tree/vexpress-v2p-ca15-tc1.dtb -loadvm Base0 +> pulseaudio: set_sink_input_volume() failed +> pulseaudio: Reason: Invalid argument +> pulseaudio: set_sink_input_mute() failed +> pulseaudio: Reason: Invalid argument +> qemu: fatal: Trying to execute code outside RAM or ROM at 0xc0321568 +> +> R00=00000001 R01=00000000 R02=00000000 R03=c0321560 +> R04=c1500000 R05=c150529c R06=c1505234 R07=c14384d0 +> R08=00000000 R09=00000000 R10=c1501f50 R11=c1501f3c +> R12=c1501f40 R13=c1501f30 R14=c030a184 R15=c0321568 +> PSR=60070093 -ZC- A S svc32 +> s00=6374652f s01=636f6c2f d00=636f6c2f6374652f +> s02=7273752f s03=6962732f d01=6962732f7273752f +> s04=6e612f6e s05=6f726361 d02=6f7263616e612f6e +> s06=7c7c206e s07=63202820 d03=632028207c7c206e +> s08=202f2064 s09=72202626 d04=72202626202f2064 +> s10=702d6e75 s11=73747261 d05=73747261702d6e75 +> s12=722d2d20 s13=726f7065 d06=726f7065722d2d20 +> s14=652f2074 s15=632f6374 d07=632f6374652f2074 +> s16=00000000 s17=00000000 d08=0000000000000000 +> s18=00000000 s19=00000000 d09=0000000000000000 +> s20=00000000 s21=00000000 d10=0000000000000000 +> s22=00000000 s23=00000000 d11=0000000000000000 +> s24=00000000 s25=00000000 d12=0000000000000000 +> s26=00000000 s27=00000000 d13=0000000000000000 +> s28=00000000 s29=00000000 d14=0000000000000000 +> s30=00000000 s31=00000000 d15=0000000000000000 +> s32=00000000 s33=00000000 d16=0000000000000000 +> s34=00000000 s35=00000000 d17=0000000000000000 +> s36=00000000 s37=00000000 d18=0000000000000000 +> s38=00000000 s39=00000000 d19=0000000000000000 +> s40=00000000 s41=00000000 d20=0000000000000000 +> 
s42=00000000 s43=00000000 d21=0000000000000000 +> s44=00000000 s45=00000000 d22=0000000000000000 +> s46=00000000 s47=00000000 d23=0000000000000000 +> s48=00000000 s49=00000000 d24=0000000000000000 +> s50=00000000 s51=00000000 d25=0000000000000000 +> s52=00000000 s53=00000000 d26=0000000000000000 +> s54=00000000 s55=00000000 d27=0000000000000000 +> s56=00000000 s57=00000000 d28=0000000000000000 +> s58=00000000 s59=00000000 d29=0000000000000000 +> s60=00000000 s61=00000000 d30=0000000000000000 +> s62=00000000 s63=00000000 d31=0000000000000000 +> FPSCR: 00000000 +> Aborted (core dumped) +> +> As I said above, the same happens when -m 1024 is used. +> +> I have a different issue when I use the qemu git master version, but +> I'm submitting a different ticket for that. +> +> Cheers, +> Gus +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1739371/+subscriptions +> + + +I'm going to close this bug as a duplicate of #1739378 (as noted in my earlier comment). + + + diff --git a/results/classifier/105/vnc/1752646 b/results/classifier/105/vnc/1752646 new file mode 100644 index 000000000..2b826fe15 --- /dev/null +++ b/results/classifier/105/vnc/1752646 @@ -0,0 +1,41 @@ +vnc: 0.938 +graphic: 0.866 +instruction: 0.818 +device: 0.814 +network: 0.790 +other: 0.781 +semantic: 0.668 +mistranslation: 0.660 +boot: 0.413 +socket: 0.387 +assembly: 0.273 +KVM: 0.131 + +Freezing VNC screen on some UEFI framebuffer applications + +Hi folks! + +I use TianoCore (UEFI) firmware on qemu (master version, last commit a6e0344). +When the Linux kernel starts, it uses the UEFI framebuffer. When I run a UEFI application (which writes directly to the framebuffer), my VNC screen freezes. When I restart the VNC client, I see only one frame. 
+ +When I run the application, the function 'vga_ioport_write' in hw/display/vga.c receives some commands that change "s->ar_index" from 0x20 to 0x10. + +In the function vga_update_display: +1751 if (!(s->ar_index & 0x20)) { +1752 graphic_mode = GMODE_BLANK; +1753 } else { + +So I get GMODE_BLANK mode. If I patch it: +1751 if (0) { + +my VNC does not freeze. + +From the "Hardware Level VGA and SVGA Video Programming Information Page" I saw that ar_index corresponds to 0x3C0 (Attribute Controller Data Write Register), where 0x20 (bit 5) is PAS -- Palette Address Source. + +If the output goes via the UEFI framebuffer, does it make a difference whether PAS is set or not? Why do we need to pause the output if the PAS is exposed? Especially when the application outputs via the framebuffer. + +The QEMU project is currently considering moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] 
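The check quoted from vga_update_display in the report above depends only on bit 5 of the attribute controller index. A minimal sketch of that decision, assuming nothing beyond the quoted condition `!(s->ar_index & 0x20)`; the names `VGA_AR_PAS` and `vga_display_blanked` are hypothetical and do not come from hw/display/vga.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of the attribute controller index: Palette Address Source (PAS). */
#define VGA_AR_PAS 0x20u

/*
 * Mirrors the quoted condition: the display is treated as blank
 * whenever the PAS bit is clear in the attribute index.
 */
bool vga_display_blanked(uint8_t ar_index)
{
    return (ar_index & VGA_AR_PAS) == 0;
}
```

With the values from the report, an index of 0x20 keeps the display active, while the change to 0x10 clears PAS and selects the blank mode, which matches the frozen VNC screen described there.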
+ diff --git a/results/classifier/105/vnc/1762179 b/results/classifier/105/vnc/1762179 new file mode 100644 index 000000000..a0391b1f8 --- /dev/null +++ b/results/classifier/105/vnc/1762179 @@ -0,0 +1,103 @@ +vnc: 0.962 +assembly: 0.961 +socket: 0.961 +semantic: 0.958 +network: 0.958 +graphic: 0.957 +instruction: 0.955 +other: 0.954 +device: 0.952 +mistranslation: 0.950 +KVM: 0.950 +boot: 0.930 + +Record and replay replay fails with: "ERROR:replay/replay-time.c:49:replay_read_clock: assertion failed" + +QEMU master at 08e173f29461396575c85510eb41474b993cb1fb + +QEMU commands: + + +``` +#!/usr/bin/env bash +cmd="\ +time \ +./out/x86_64/buildroot/host/usr/bin/qemu-system-x86_64 \ +-M pc \ +-append 'root=/dev/sda console=ttyS0 nokaslr printk.time=y - lkmc_eval=\"/rand_check.out;/sbin/ifup -a;wget -S google.com;/poweroff.out;\"' \ +-kernel 'out/x86_64/buildroot/images/bzImage' \ +-nographic \ +\ +-drive file=out/x86_64/buildroot/images/rootfs.ext2.qcow2,if=none,id=img-direct,format=qcow2 \ +-drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay \ +-device ide-hd,drive=img-blkreplay \ +\ +-netdev user,id=net1 \ +-device rtl8139,netdev=net1 \ +-object filter-replay,id=replay,netdev=net1 \ +" +echo "$cmd" +eval "$cmd -icount 'shift=7,rr=record,rrfile=replay.bin'" +eval "$cmd -icount 'shift=7,rr=replay,rrfile=replay.bin'" +``` + +Images uploaded to: https://github.com/cirosantilli/linux-kernel-module-cheat/releases/download/test-replay-arm/images4.zip + +The replay failed straight out with: + +``` +ERROR:replay/replay-time.c:49:replay_read_clock: assertion failed: (replay_file && replay_mutex_locked()) +``` + +Images generated with: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/9513c162ef57e6cb70006dfe870856f94ee9a133 + +QEMU configure: + +``` +cd /home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/build/host-qemu-custom; 
PATH="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin:/home/ciro/bak/git/linux-kernel-modul +e-cheat/out/x86_64/buildroot/host/sbin:./node_modules/.bin:/usr/local/heroku/bin:/home/ciro/android-sdk/platform-tools:/home/ciro/android-sdk/tools:/home/ciro/android-studio//bin:/home/ciro/android-sdk/ndk-bundl +e:/home/ciro/android-sdk/ndk-bundle/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin:/home/ciro/bak/git/devbin:/home/ciro/bin:/usr/local/texlive/2013/bin/x86_64-linux:/home/ciro/.rvm/gems/ruby-2.4. +1/bin:/home/ciro/.rvm/gems/ruby-2.4.1@global/bin:/home/ciro/.rvm/rubies/ruby-2.4.1/bin:./node_modules/.bin:/usr/local/heroku/bin:/home/ciro/android-sdk/platform-tools:/home/ciro/android-sdk/tools:/home/ciro/andr +oid-studio//bin:/home/ciro/android-sdk/ndk-bundle:/home/ciro/android-sdk/ndk-bundle/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin:/home/ciro/bak/git/devbin:/home/ciro/bin:/usr/local/texlive/2013 +/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/bin:/snap/bin:/home/ciro/bak/git/latex:/home/ciro/.rvm/bin:/home/ciro/anaconda2/bin:/home/ciro/.cab +al/bin:/bin:/home/ciro/.go/bin:/home/ciro/.local/bin/:/home/ciro/bak/git/runlinux:/usr/bin:/home/ciro/bak/git/latex:/home/ciro/.rvm/bin:/home/ciro/anaconda2/bin:/home/ciro/.cabal/bin:/bin:/home/ciro/.go/bin:/home/ciro/.local/bin/:/home/ciro/bak/git/runlinux" PKG_CONFIG="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/pkg-config" PKG_CONFIG_SYSROOT_DIR="/" PKG_CONFIG_ALLOW_SYSTEM_CFLAGS=1 PKG_ +CONFIG_ALLOW_SYSTEM_LIBS=1 PKG_CONFIG_LIBDIR="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/lib/pkgconfig:/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/share/pkg +config" AR="/usr/bin/ar" AS="/usr/bin/as" LD="/usr/bin/ld" NM="/usr/bin/nm" CC="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/ccache /usr/bin/gcc" 
GCC="/home/ciro/bak/git/linux-kerne +l-module-cheat/out/x86_64/buildroot/host/bin/ccache /usr/bin/gcc" CXX="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/ccache /usr/bin/g++" CPP="/usr/bin/cpp" OBJCOPY="/usr/bin/objcopy +" RANLIB="/usr/bin/ranlib" CPPFLAGS="-I/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/include" CFLAGS="-O2 -I/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/include +" CXXFLAGS="-O2 -I/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/include" LDFLAGS="-L/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/lib -Wl,-rpath,/home/ciro/bak/g +it/linux-kernel-module-cheat/out/x86_64/buildroot/host/lib" INTLTOOL_PERL=/usr/bin/perl CPP="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/ccache /usr/bin/gcc -E" ./configure --targe +t-list="x86_64-softmmu" --prefix="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host" --interp-prefix=/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/x86_64-buildroot-l +inux-uclibc/sysroot --cc="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/ccache /usr/bin/gcc" --host-cc="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/cca +che /usr/bin/gcc" --python=/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin/python2 --extra-cflags="-O2 -I/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/include" + --extra-ldflags="-L/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/lib -Wl,-rpath,/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/lib" --enable-debug --extra-cflags +=-DDEBUG_PL061=1 --enable-trace-backends=simple --enable-sdl --with-sdlabi=2.0 +``` + +QEMU build: + +``` 
+PATH="/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/bin:/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/sbin:./node_modules/.bin:/usr/local/heroku/bin:/home/ciro/a +ndroid-sdk/platform-tools:/home/ciro/android-sdk/tools:/home/ciro/android-studio//bin:/home/ciro/android-sdk/ndk-bundle:/home/ciro/android-sdk/ndk-bundle/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_6 +4/bin:/home/ciro/bak/git/devbin:/home/ciro/bin:/usr/local/texlive/2013/bin/x86_64-linux:/home/ciro/.rvm/gems/ruby-2.4.1/bin:/home/ciro/.rvm/gems/ruby-2.4.1@global/bin:/home/ciro/.rvm/rubies/ruby-2.4.1/bin:./node +_modules/.bin:/usr/local/heroku/bin:/home/ciro/android-sdk/platform-tools:/home/ciro/android-sdk/tools:/home/ciro/android-studio//bin:/home/ciro/android-sdk/ndk-bundle:/home/ciro/android-sdk/ndk-bundle/toolchain +s/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin:/home/ciro/bak/git/devbin:/home/ciro/bin:/usr/local/texlive/2013/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/us +r/local/games:/usr/bin:/snap/bin:/home/ciro/bak/git/latex:/home/ciro/.rvm/bin:/home/ciro/anaconda2/bin:/home/ciro/.cabal/bin:/bin:/home/ciro/.go/bin:/home/ciro/.local/bin/:/home/ciro/bak/git/runlinux:/usr/bin:/h +ome/ciro/bak/git/latex:/home/ciro/.rvm/bin:/home/ciro/anaconda2/bin:/home/ciro/.cabal/bin:/bin:/home/ciro/.go/bin:/home/ciro/.local/bin/:/home/ciro/bak/git/runlinux" PKG_CONFIG="/home/ciro/bak/git/linux-kernel-m +odule-cheat/out/x86_64/buildroot/host/bin/pkg-config" PKG_CONFIG_SYSROOT_DIR="/" PKG_CONFIG_ALLOW_SYSTEM_CFLAGS=1 PKG_CONFIG_ALLOW_SYSTEM_LIBS=1 PKG_CONFIG_LIBDIR="/home/ciro/bak/git/linux-kernel-module-cheat/ou +t/x86_64/buildroot/host/lib/pkgconfig:/home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroot/host/share/pkgconfig" /usr/bin/make -j8 -C /home/ciro/bak/git/linux-kernel-module-cheat/out/x86_64/buildroo +t/build/host-qemu-custom +``` + +I am getting the same errors while doing a 
"replay". Are there any updates on the resolution/fix ? + +@arna35: I have tested this yet unmerged patch: https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg04286.html and it solves this problem, I will close this issue once it gets merged. + +@Ciro, + +I hope this solves the problem for me too. Thanks for highlighting the patch. + +Looks like the patches have been merged now (see commit cda382594b7ea50aff5f672f32767f9f9fef4c12 and earlier) + +Released with QEMU v5.2.0. + diff --git a/results/classifier/105/vnc/1766904 b/results/classifier/105/vnc/1766904 new file mode 100644 index 000000000..28ae09dd7 --- /dev/null +++ b/results/classifier/105/vnc/1766904 @@ -0,0 +1,60 @@ +vnc: 0.785 +KVM: 0.783 +other: 0.761 +instruction: 0.727 +network: 0.703 +graphic: 0.695 +semantic: 0.688 +boot: 0.683 +device: 0.683 +assembly: 0.682 +socket: 0.682 +mistranslation: 0.661 + +Creating high hdd load (with constant fsyncs) on a SATA disk leads to freezes and errors in guest dmesg + +After upgrading from qemu 2.10.0+dfsg-2 to 2.12~rc3+dfsg-2 (on debian sid host), centos 7 guest started to show freezes and ata errors in dmesg during hdd workloads with writing many small files and repeated fsyncs. + +Host kernel 4.15.0-3-amd64. +Guest kernel 3.10.0-693.21.1.el7.x86_64 (slightly older guest kernel was tested too with same result). + +Script that reproduces the bug (first run usualy goes smooth, second and later runs result in dmesg errors and freezes): + +http://paste.debian.net/hidden/472fb220/ + +Sample of error messages in guest dmesg: + +http://paste.debian.net/hidden/8219e234/ + +A workaround that I am using right now: I have detached this SATA storage and reattached the same .qcow2 file as SCSI - this has fixed the issue for me. 
+ +Copying command line into bug so we don't lose it: + +LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=spice /usr/bin/kvm -name guest=myvm.local,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-3-myvm.local/master-key.aes -machine pc-i440fx-2.8,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu IvyBridge -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid b10ea3d4-410c-4dc3-b9b0-818d43314323 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-3-myvm.local/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device ahci,id=sata0,bus=pci.0,addr=0x7 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/home/user/data/work/virt-images/myvm.local.qcow2,format=qcow2,if=none,id=drive-sata0-0-0 -device ide-hd,bus=sata0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:39:66:3c,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-3-myvm.local/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on + +and ccing in jsnow + +Relevant bits appear to be: + +-M pc-i440fx-2.8 +-cpu IvyBridge +-m 2048 +-realtime mlock=off +-smp 2,sockets=2,cores=1,threads=1 +-device ahci,id=sata0,bus=pci.0,addr=0x7 +-drive file=/home/user/data/work/virt-images/myvm.local.qcow2,format=qcow2,if=none,id=drive-sata0-0-0 +-device ide-hd,bus=sata0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1 + +So this is a 2.8 PC machine that we've configured to use AHCI instead. I see some blips about CHS being zero, but that's expected in response to a (successful) flush (0xE7) command, so it looks like it's stalling out. I'll have to try to reproduce and see if I can trigger the hang. + + +I am getting the exact same issue. The freeze occurred when I tried to install Ubuntu 18.04 with qemu-2.12. However, it seems to be working just fine with qemu-2.11.1. So it seems that something in between 2.11.1 and 2.12 is the culprit. + +It's still possible to reproduce this issue with qemu-master (a3ac12fba028df90f7b3dbec924995c126c41022). + +Jake, can you try the fix I posted in https://bugs.launchpad.net/qemu/+bug/1769189 ? I'm not actually confident it's the same bug, but it might be worth a shot. 
It fixes a bug that was made more prominent in between 2.11 and 2.12, so it fits the timeline presented here. + +@John Snow Thanks! After applying that patch, I couldn't reproduce this anymore. At least for me it seems that these two issues refer to the same bug. + +Great, thank you so much for helping! + diff --git a/results/classifier/105/vnc/1771042 b/results/classifier/105/vnc/1771042 new file mode 100644 index 000000000..050807a77 --- /dev/null +++ b/results/classifier/105/vnc/1771042 @@ -0,0 +1,81 @@ +vnc: 0.841 +other: 0.828 +semantic: 0.822 +assembly: 0.816 +mistranslation: 0.804 +device: 0.743 +instruction: 0.719 +graphic: 0.701 +KVM: 0.695 +network: 0.597 +boot: 0.593 +socket: 0.547 + +dataplane interrupt handler doesn't support msi + +hw/block/dataplane/virtio-blk.c commit 1010cadf62332017648abee0d7a3dc7f2eef9632 + +in the function notify_guest_bh, the function virtio_notify_irqfd is called +to deliver the interrupt corresponding to the vq + +however, without the dataplane, hw/block/virtio_blk_req_complete calls +virtio_notify to deliver the interrupt (immediately). this goes through +a slightly more involved path that calls virtio_pci_notify which includes +a case to handle msi interrupts. + +so, msi interrupts with block devices aren't serviced when using dataplane +batching. + +diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c +index 101f32c..31d9eb8 100644 +--- a/hw/block/dataplane/virtio-blk.c ++++ b/hw/block/dataplane/virtio-blk.c +@@ -73,7 +73,7 @@ static void notify_guest_bh(void *opaque) + unsigned i = j + ctzl(bits); + VirtQueue *vq = virtio_get_queue(s->vdev, i); + +- virtio_notify_irqfd(s->vdev, vq); ++ virtio_notify(s->vdev, vq); + + bits &= bits - 1; /* clear right-most bit */ + } + + +oh right, another note. this only manifests when using kvm. 
+ + +On Mon, May 14, 2018 at 03:00:44AM -0000, eric hoffman wrote: +> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c +> index 101f32c..31d9eb8 100644 +> --- a/hw/block/dataplane/virtio-blk.c +> +++ b/hw/block/dataplane/virtio-blk.c +> @@ -73,7 +73,7 @@ static void notify_guest_bh(void *opaque) +> unsigned i = j + ctzl(bits); +> VirtQueue *vq = virtio_get_queue(s->vdev, i); +> +> - virtio_notify_irqfd(s->vdev, vq); +> + virtio_notify(s->vdev, vq); +> +> bits &= bits - 1; /* clear right-most bit */ +> } + +Please send patches to <email address hidden>. Guidelines for submitting +patches are here: +https://wiki.qemu.org/Contribute/SubmitAPatch + +The issue with this approach is that hw/pci/msi.c:msi_send_message() +invokes device emulation outside the QEMU global mutex (it calls into +the APIC to send MSIs). I've CCed Paolo Bonzini to check whether doing +this is thread-safe. + +Stefan + + +thanks for looking at this Stefan - since I don't have any context of exactly the kind of environmental issues like threading, the patch posted here isn't really a suggested fix. + +it does in general seem helpful if batched interrupts have the same delivery semantics as non-deferred. + +This bug is invalid. MSI/MSI-X interrupts are properly serviced when dataplane batching is used. The original problem was in the incorrect virtio driver initialization sequence (virtqueues and MSI-X interrupts configured after DRIVER_OK bit is set). + +This bug can be closed as INVALID. 
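The loop in the notify_guest_bh hunk quoted in the report above walks a bitmap of virtqueues with batched notifications, using a count-trailing-zeros step followed by `bits &= bits - 1`. A minimal standalone sketch of that iteration pattern follows. The function `pending_queues` is hypothetical, and the GCC/Clang builtin `__builtin_ctzl` is assumed here as a stand-in for QEMU's `ctzl` helper; the sketch only records queue indices and does not model interrupt delivery.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Collects the indices of the set bits in `bits`, lowest first, offset
 * by `base`, into `out`. Returns how many indices were written.
 * `cap` is the capacity of `out` and must be large enough.
 */
size_t pending_queues(unsigned long bits, unsigned base,
                      unsigned *out, size_t cap)
{
    size_t n = 0;

    while (bits != 0) {
        assert(n < cap);
        out[n++] = base + (unsigned)__builtin_ctzl(bits);
        bits &= bits - 1; /* clear right-most set bit */
    }
    return n;
}
```

For a bitmap of 0x15 the function reports queues 0, 2 and 4, one per set bit, which is the order in which the quoted loop would notify them.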
+ diff --git a/results/classifier/105/vnc/1785203 b/results/classifier/105/vnc/1785203 new file mode 100644 index 000000000..7a7af5dab --- /dev/null +++ b/results/classifier/105/vnc/1785203 @@ -0,0 +1,59 @@ +vnc: 0.699 +other: 0.609 +KVM: 0.607 +instruction: 0.596 +semantic: 0.584 +mistranslation: 0.563 +assembly: 0.563 +graphic: 0.556 +device: 0.548 +socket: 0.542 +network: 0.536 +boot: 0.483 + +accel/tcg/translate-all.c:2511: page_check_range: Assertion `start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS)' failed. + +qemu-riscv64 version 2.12.93 crashes when mincore() is called with invalid pointer with the following message: + +qemu-riscv64: /opt/qemu/accel/tcg/translate-all.c:2511: page_check_range: Assertion `start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS)' failed. +qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x600014ef + +Testcase: + +#include <sys/mman.h> + +int main (void) +{ + unsigned char v; + return mincore ((void *) 0x00000010000000000, 1, &v); +} + +Backtrace: + +#0 raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 +#1 0x000000006000140a in abort () at abort.c:79 +#2 0x00000000600012ec in __assert_fail_base ( + fmt=0x6024eae8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", + assertion=0x601b9758 "start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS)", + file=0x601b9658 "/opt/qemu/accel/tcg/translate-all.c", line=2511, + function=0x601b9810 <__PRETTY_FUNCTION__.23867> "page_check_range") at assert.c:92 +#3 0x000000006010e10e in __assert_fail ( + assertion=assertion@entry=0x601b9758 "start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS)", file=file@entry=0x601b9658 "/opt/qemu/accel/tcg/translate-all.c", line=line@entry=2511, + function=function@entry=0x601b9810 <__PRETTY_FUNCTION__.23867> "page_check_range") + at assert.c:101 +#4 0x000000006003e916 in page_check_range (start=start@entry=1099511627776, len=len@entry=1, + flags=flags@entry=1) at /opt/qemu/accel/tcg/translate-all.c:2511 +#5 0x0000000060057717 in access_ok 
(size=1, addr=1099511627776, type=0) + at /opt/qemu/linux-user/qemu.h:567 +#6 lock_user (copy=0, len=1, guest_addr=1099511627776, type=0) + at /opt/qemu/linux-user/qemu.h:567 +#7 do_syscall (cpu_env=cpu_env@entry=0x622fca28, num=232, arg1=1099511627776, arg2=1, + arg3=274886298751, arg4=0, arg5=274886298808, arg6=66518, arg7=0, arg8=0) + at /opt/qemu/linux-user/syscall.c:11635 +#8 0x0000000060066c5c in cpu_loop (env=env@entry=0x622fca28) + at /opt/qemu/linux-user/riscv/cpu_loop.c:55 +#9 0x0000000060002156 in main (argc=<optimized out>, argv=0x7fffffffed68, + envp=<optimized out>) at /opt/qemu/linux-user/main.c:819 + +Fixed by 0acd4ab849827bbc20402e01c9da088207c0d236 ("linux-user: check valid address in access_ok()"), fix released in v5.0.0. + diff --git a/results/classifier/105/vnc/1785734 b/results/classifier/105/vnc/1785734 new file mode 100644 index 000000000..127f85c5e --- /dev/null +++ b/results/classifier/105/vnc/1785734 @@ -0,0 +1,129 @@ +vnc: 0.771 +KVM: 0.765 +mistranslation: 0.764 +other: 0.750 +device: 0.674 +graphic: 0.667 +instruction: 0.658 +assembly: 0.629 +socket: 0.621 +semantic: 0.612 +boot: 0.579 +network: 0.578 + +movdqu partial write at page boundary + +In TCG mode, when a 16-byte write instruction (such as movdqu) is executed at a page boundary and causes a page fault, a partial write is executed in the first page. See the attached code for an example. + +Tested on the qemu-3.0.0-rc1 release. 
+ + +% gcc -m32 qemu-bug2.c && ./a.out && echo && qemu-i386 ./a.out +*(0x70000ff8+ 0) = aa +*(0x70000ff8+ 1) = aa +*(0x70000ff8+ 2) = aa +*(0x70000ff8+ 3) = aa +*(0x70000ff8+ 4) = aa +*(0x70000ff8+ 5) = aa +*(0x70000ff8+ 6) = aa +*(0x70000ff8+ 7) = aa +*(0x70000ff8+ 8) = 55 +*(0x70000ff8+ 9) = 55 +*(0x70000ff8+10) = 55 +*(0x70000ff8+11) = 55 +*(0x70000ff8+12) = 55 +*(0x70000ff8+13) = 55 +*(0x70000ff8+14) = 55 +*(0x70000ff8+15) = 55 +page fault: addr=0x70001000 err=0x7 +*(0x70000ff8+ 0) = aa +*(0x70000ff8+ 1) = aa +*(0x70000ff8+ 2) = aa +*(0x70000ff8+ 3) = aa +*(0x70000ff8+ 4) = aa +*(0x70000ff8+ 5) = aa +*(0x70000ff8+ 6) = aa +*(0x70000ff8+ 7) = aa +*(0x70000ff8+ 8) = 55 +*(0x70000ff8+ 9) = 55 +*(0x70000ff8+10) = 55 +*(0x70000ff8+11) = 55 +*(0x70000ff8+12) = 55 +*(0x70000ff8+13) = 55 +*(0x70000ff8+14) = 55 +*(0x70000ff8+15) = 55 + +*(0x70000ff8+ 0) = aa +*(0x70000ff8+ 1) = aa +*(0x70000ff8+ 2) = aa +*(0x70000ff8+ 3) = aa +*(0x70000ff8+ 4) = aa +*(0x70000ff8+ 5) = aa +*(0x70000ff8+ 6) = aa +*(0x70000ff8+ 7) = aa +*(0x70000ff8+ 8) = 55 +*(0x70000ff8+ 9) = 55 +*(0x70000ff8+10) = 55 +*(0x70000ff8+11) = 55 +*(0x70000ff8+12) = 55 +*(0x70000ff8+13) = 55 +*(0x70000ff8+14) = 55 +*(0x70000ff8+15) = 55 +page fault: addr=0x70001000 err=0x6 +*(0x70000ff8+ 0) = 77 +*(0x70000ff8+ 1) = 66 +*(0x70000ff8+ 2) = 55 +*(0x70000ff8+ 3) = 44 +*(0x70000ff8+ 4) = 33 +*(0x70000ff8+ 5) = 22 +*(0x70000ff8+ 6) = 11 +*(0x70000ff8+ 7) = 0 +*(0x70000ff8+ 8) = 55 +*(0x70000ff8+ 9) = 55 +*(0x70000ff8+10) = 55 +*(0x70000ff8+11) = 55 +*(0x70000ff8+12) = 55 +*(0x70000ff8+13) = 55 +*(0x70000ff8+14) = 55 +*(0x70000ff8+15) = 55 + + + +This is a part of a class of related problems for qemu linux-user, in that any non-atomic store is not validated before initiating a partial write. + +For instance, qemu-x86_64, built for arm32, would show this same partial store problem for any 64-bit write crossing a page boundary because we are forced by the limits of the host to split the store into two 32-bit pieces. 
+ +While we could probably fix this specific case fairly easily, because it is implemented with an external helper in the first place, we would need some new infrastructure to handle the more general problem. Exactly what form that should take would need some thought and discussion. + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/105/vnc/1786343 b/results/classifier/105/vnc/1786343 new file mode 100644 index 000000000..0a444f00d --- /dev/null +++ b/results/classifier/105/vnc/1786343 @@ -0,0 +1,55 @@ +vnc: 0.687 +instruction: 0.640 +graphic: 0.630 +network: 0.622 +device: 0.588 +boot: 0.585 +socket: 0.583 +other: 0.578 +KVM: 0.548 +mistranslation: 0.484 +assembly: 0.470 +semantic: 0.460 + +QEMU v3.0.0-rc4 configure fails with --enable-mpath on CentOS 7.5 + +QEMU v3.0.0-rc4 configure fails with --enable-mpath on CentOS 7.5. + +After commit b3f1c8c413bc83e4a2cc7a63e4eddf9fe6449052 "qemu-pr-helper: use new +libmultipath API", QEMU started using new libmultipath API, which is not +available on CentOS 7.5. Reverting this commit, configure passes. + +Steps to reproduce (fails on x86_64 and ppc64le architectures): + + $ git clone git://git.qemu.org/qemu.git + $ mkdir -p qemu/build && cd qemu/build + $ ../configure --enable-mpath + ERROR: Multipath requires libmpathpersist devel + + $ rpm -qa | grep device-mapper | sort + device-mapper-1.02.146-4.el7.ppc64le + device-mapper-devel-1.02.146-4.el7.ppc64le + device-mapper-libs-1.02.146-4.el7.ppc64le + device-mapper-multipath-0.4.9-119.el7.ppc64le + device-mapper-multipath-devel-0.4.9-119.el7.ppc64le + device-mapper-multipath-libs-0.4.9-119.el7.ppc64le + +Snippet from config.log: + + funcs: do_compiler do_cc compile_prog main + lines: 92 125 3580 0 + cc -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -m64 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wno-missing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-strong -Wno-missing-braces -I/usr/include/p11-kit-1 -I/usr/include/libpng15 -o config-temp/qemu-conf.exe 
config-temp/qemu-conf.c -m64 -g -ludev -lmultipath -lmpathpersist + config-temp/qemu-conf.c: In function ‘main’: + config-temp/qemu-conf.c:15:5: error: too few arguments to function ‘mpath_lib_init’ + multipath_conf = mpath_lib_init(); + ^ + In file included from config-temp/qemu-conf.c:2:0: + /usr/include/mpath_persist.h:179:12: note: declared here + extern int mpath_lib_init (struct udev *udev); + ^ + +I'll work on a fix for configure. + +Fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=1b0578f5c455d5a95384 + diff --git a/results/classifier/105/vnc/1795100 b/results/classifier/105/vnc/1795100 new file mode 100644 index 000000000..bfe475405 --- /dev/null +++ b/results/classifier/105/vnc/1795100 @@ -0,0 +1,62 @@ +vnc: 0.920 +socket: 0.895 +graphic: 0.768 +network: 0.731 +device: 0.633 +semantic: 0.609 +instruction: 0.597 +mistranslation: 0.586 +other: 0.544 +boot: 0.396 +KVM: 0.191 +assembly: 0.133 + +VNC unix-domain socket unlink()ed prematurely + +With qemu 3.0.0 (I don't believe this happened with previous +versions), if I tell it `-vnc unix:/path/to/vnc.sock`, qemu will +unlink() that file when the first client disconnects, meaning that +once I disconnect, I can't ever reconnect without restarting the VM. + +A stupid testcase demonstrating the issue: + +In terminal A: + + $ qemu-system-x86_64 -vnc unix:$PWD/vnc.sock + +In terminal B: + + $ ls vnc.sock + vnc.sock + $ socat STDIO UNIX-CONNECT:vnc.sock <<<'' + RFB 003.008 + $ ls vnc.sock + ls: cannot access 'vnc.sock': No such file or directory + +I have determined that the offending unlink() call is the one in +io/channel-socket.c:qio_channel_socket_close(). That call was first +introduced in commit d66f78e1eaa832f73c771d9df1b606fe75d52a50, which +first appeared in version 3.0.0. + +This type of premature unlink() does not happen on monitor.sock with +`-monitor unix:/path/to/monitor.sock,server,nowait`. 
+ +I am not familiar enough with the QIO subsystem to suggest a fix that +fixes VNC, but preserves the QMP fix targeted in the offending commit. + +This is still a problem with 3.1.0. + +Added Daniel to the bug. + +It only affects VNC, not chardevs because the chardevs fail to call qio_channel_close() and just rely on finalize() cleaning up the open socket. To fix this we just need to made the code conditional on it being a listener socket + + if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_LISTEN)) { + ... + } + +Patch proposed at + +https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg02774.html + +Fix merged to git master https://git.qemu.org/?p=qemu.git;a=commit;h=feff02089113839d233e40a9bd7134241de12d1d + diff --git a/results/classifier/105/vnc/1802465 b/results/classifier/105/vnc/1802465 new file mode 100644 index 000000000..f609f4e7b --- /dev/null +++ b/results/classifier/105/vnc/1802465 @@ -0,0 +1,80 @@ +vnc: 0.903 +KVM: 0.902 +mistranslation: 0.887 +device: 0.805 +boot: 0.712 +semantic: 0.694 +instruction: 0.691 +other: 0.654 +graphic: 0.553 +network: 0.496 +socket: 0.453 +assembly: 0.222 + +typing string via VNC is unreliable + +QEMU version is 3.0.0 + +# Description + +The problem is that, when typing string through VNC, it can be unreliable -- sometimes some key strokes get skipped, sometimes get swapped, sometimes get repeated. There's no problem when typing through VNC on physical hardware. + +# Steps to reproduce + +1. Launch virtual machine by: + + qemu-kvm -display vnc=:1 -m 2048 opensuse-leap-15.qcow2 + +2. Connect to VNC by: + + vncviewer -Shared :5901 + +3. Simulate a series of key strokes by "vncdotool" [1]: + + vncdotool -s 127.0.0.1::5901 typefile strings_to_be_typed.txt + +4. Usually after a few hundred keys are typed, something goes wrong. + +I attached a screenshot that it mistypes " hello" to "h ello". + +[1] https://github.com/sibson/vncdotool + + + +In my case the problem is quite subtle. 
+Nearly every time we send the key strokes, the guest os keeps receiving space or tab or new line character. And ending part of the text is truncated, where the truncated part is fixed depending on the keystrokes we are sending. +Additionally, the keystrokes are mis-ordered at a higher frequency than 1 out of a few hundreds as you said. + +In brief, +- Repeatedly receiving space, tab or new line character as ending +- Truncation regarding ending part of key strokes +- characters are mis-ordered, lost, repeated + +A question is, how can you make character transaction faster with vncdotool? In my case, vncdotool is typing fairly slow. + +I found this debian bug report quite useful, what do you think? https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=758881 + +Re-producing procedures +--- + +I re-produced the problem in a slightly different way since your method seems not working for me. + +> qemu-img create tumbleweed.img 40G +> qemu-system-x86_64 -drive file=tumbleweed.img,if=virtio -boot d -cdrom openSUSE-Tumbleweed-DVD-x86_64-Snapshot20181119-Media.iso -m 2048 --enable-kvm -display vnc=:1 +> qemu-system-x86_64 -drive file=tumbleweed.img,if=virtio -m 4G --enable-kvm -display vnc=:1 + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/105/vnc/1806040 b/results/classifier/105/vnc/1806040 new file mode 100644 index 000000000..411db400c --- /dev/null +++ b/results/classifier/105/vnc/1806040 @@ -0,0 +1,42 @@ +vnc: 0.889 +device: 0.867 +instruction: 0.848 +graphic: 0.841 +network: 0.782 +other: 0.733 +mistranslation: 0.683 +assembly: 0.637 +semantic: 0.634 +socket: 0.586 +KVM: 0.576 +boot: 0.536 + +Nested VMX virtualization error on last Qemu versions + +Recently updated Qemu on a Sony VAIO sve14ag18m with Ubuntu Bionic 4.15.0-38 from Git + +After launching a few VMs, noticed that i could not create Snapshot due to this error: +"Nested VMX virtualization does not support live migration yet" + +I've created a new Windows 7 X64 machine with this compilation of Qemu and the problem persisted, so it's not because of the old machines. + +I launch Qemu with this params (I use them for malware analisys adn re...): +qemu-system-x86_64 -monitor stdio -display none -m 4096 -smp cpus=4 -usbdevice tablet -drive file=VM.img,index=0,media=disk,format=qcow2,cache=unsafe -net nic,macaddr="...." -net bridge,br=br0 -cpu host,-hypervisor,kvm=off -vnc 127.0.0.1:0 -enable-kvm + + +Deleting the changes made on this commit solved the problem, but I dont have idea what is this for, so... xDD +https://github.com/qemu/qemu/commit/d98f26073bebddcd3da0ba1b86c3a34e840c0fb8 + +Hi, + +the kernel you are using should not have nested virtualization enabled by default. Are you by chance using nested virtualization of some other virtual machines? If so, it's enough to add "-vmx" at the end of "-cpu host,-hypervisor,kvm=off". + +If you are not sure of the answer, please check if your /etc/modprobe.conf file, or any file in your /etc/modprobe.d directory, contains the line "options kvm_intel nested=1". + +Thanks! + +The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. 
Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1816819 b/results/classifier/105/vnc/1816819 new file mode 100644 index 000000000..0a6fca87a --- /dev/null +++ b/results/classifier/105/vnc/1816819 @@ -0,0 +1,73 @@ +vnc: 0.985 +socket: 0.984 +network: 0.980 +device: 0.963 +instruction: 0.953 +other: 0.913 +graphic: 0.912 +KVM: 0.878 +boot: 0.872 +assembly: 0.859 +mistranslation: 0.822 +semantic: 0.814 + +Chardev websocket stops listening after first connection disconnects + +Using qemu option: + -chardev socket,id=websock0,websocket,port=13042,host=127.0.0.1,server,nowait -serial chardev:websock0 + +To have a websocket listening chardev. After the first connection disconnects (that does a full websocket handshake), subsequent connections aren't accepted. See below for a reproducing session kindly provided by Daniel: + +$ telnet localhost 13042 +Trying ::1... +telnet: connect to address ::1: Connection refused +Trying 127.0.0.1... +Connected to localhost. +Escape character is '^]'. +GET / HTTP/1.1 +Upgrade: websocket +Connection: Upgrade +Host: localhost:%s +Origin: http://localhost +Sec-WebSocket-Key: o9JHNiS3/0/0zYE1wa3yIw== +Sec-WebSocket-Version: 13 +Sec-WebSocket-Protocol: binary + +HTTP/1.1 101 Switching Protocols +Server: QEMU VNC +Date: Wed, 20 Feb 2019 16:52:04 GMT +Upgrade: websocket +Connection: Upgrade +Sec-WebSocket-Accept: b3DnPh7O8hyYE5sIjQxl/c1J+S8= +Sec-WebSocket-Protocol: binary + +sfsd +�&�only binary frames may be fragmentedConnection closed by foreign host. + +$ telnet localhost 13042 +Trying ::1... 
+telnet: connect to address ::1: Connection refused +Trying 127.0.0.1... +Connected to localhost. +Escape character is '^]'. +GET / HTTP/1.1 +Upgrade: websocket +Connection: Upgrade +Host: localhost:%s +Origin: http://localhost +Sec-WebSocket-Key: o9JHNiS3/0/0zYE1wa3yIw== +Sec-WebSocket-Version: 13 +Sec-WebSocket-Protocol: binary + + + +...no response..... + +Patch proposed + +https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg05556.html + +I can confirm that this patch fixes the issue. I can now reconnect after a client has disconnected. + +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=dd154c4d9f48a44ad24e1 + diff --git a/results/classifier/105/vnc/1819108 b/results/classifier/105/vnc/1819108 new file mode 100644 index 000000000..22c8d873e --- /dev/null +++ b/results/classifier/105/vnc/1819108 @@ -0,0 +1,52 @@ +vnc: 0.761 +device: 0.684 +instruction: 0.629 +KVM: 0.601 +mistranslation: 0.484 +semantic: 0.415 +network: 0.413 +socket: 0.391 +other: 0.336 +boot: 0.293 +graphic: 0.284 +assembly: 0.073 + +qemu-bridge-helper failure but qemu not exit + +When qemu-bridge-helper run failed, its parent process qemu is still alive. +This is my command line: + +qemu-system-x86_64 -curses -enable-kvm -cpu host -smp 4 -m 4096 \ + -vnc :1 \ + -kernel /data/xugang_vms/boot/vmlinuz \ + -initrd /data/xugang_vms/boot/initram \ + -append 'module_blacklist=drm,evbug net.ifnames=0 biosdevname=0 ROOTDEV=rootfs' \ + -drive file=/data/xugang_vms/instances/vn7/rootfs.img,format=qcow2,if=virtio \ + -monitor unix:/data/xugang_vms/var/monitor/vn7.sock,server,nowait \ + -netdev bridge,br=vmbr99,helper="/root/bridgehelper --ns=kvm_1 ",id=n1 -device virtio-net,netdev=n1,mac=92:99:98:76:01:07 + +"/root/bridgehelper" is self defined helper binary by me. But after bridge-helper exited with failure(not send fd to qemu process yet), the linux vm's console will be messed up. 
I checked the qemu source code(at net/tap.c) and found following snip: + +===> +do { + fd = recv_fd(sv[0]); + } while (fd == -1 && errno == EINTR); + saved_errno = errno; + + close(sv[0]); + + while (waitpid(pid, &status, 0) != pid) { + /* loop */ + } +<========= + +why recv_fd will infinitely wait for recv? Maybe it shall waitpid and then recv_fd ? + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/166 + + diff --git a/results/classifier/105/vnc/1829576 b/results/classifier/105/vnc/1829576 new file mode 100644 index 000000000..5547fcc54 --- /dev/null +++ b/results/classifier/105/vnc/1829576 @@ -0,0 +1,89 @@ +vnc: 0.864 +other: 0.847 +semantic: 0.836 +device: 0.828 +mistranslation: 0.816 +KVM: 0.801 +boot: 0.796 +graphic: 0.791 +instruction: 0.786 +assembly: 0.776 +socket: 0.755 +network: 0.725 + +QEMU-SYSTEM-PPC64 Regression QEMU-4.0.0 + +I have been using QEMU-SYSTEM-PPC64 v3.1.0 to run CentOS7 PPC emulated system. It stopped working when I upgraded to QEMU-4.0.0 . I downgraded back to QEMU-3.1.0 and it started working again. The problem is that my CentOS7 image will not boot up udner QEMU-4.0.0, but works fine under QEMU-3.1.0. + +I have an QCOW2 image available at https://www.mediafire.com/file/d8dda05ro85whn1/linux-centos7-ppc64.qcow2/file . NOTE: It is 15GB. Kind of large. + +I run it as follows: + + qemu-system-ppc64 \ + -name "CENTOS7-PPC64" \ + -cpu POWER7 -machine pseries \ + -m 4096 \ + -netdev bridge,id=netbr0,br=br0 \ + -device e1000,netdev=netbr0,mac=52:54:3c:13:21:33 \ + -hda "./linux-centos7-ppc64.qcow2" \ + -monitor stdio + +HOST: I am using Manjaro Linux on an Intel i7 machine with the QEMU packages installed via the package manager of the distribution. 
+ +[jsantiago@jlsws0 ~]$ uname -a +Linux jlsws0.haivision.com 4.19.42-1-MANJARO #1 SMP PREEMPT Fri May 10 20:52:43 UTC 2019 x86_64 GNU/Linux + +jsantiago@jlsws0 ~]$ cpuinfo +Intel(R) processor family information utility, Version 2019 Update 3 Build 20190214 (id: b645a4a54) +Copyright (C) 2005-2019 Intel Corporation. All rights reserved. + +===== Processor composition ===== +Processor name : Intel(R) Core(TM) i7-6700K +Packages(sockets) : 1 +Cores : 4 +Processors(CPUs) : 8 +Cores per package : 4 +Threads per core : 2 + +===== Processor identification ===== +Processor Thread Id. Core Id. Package Id. +0 0 0 0 +1 0 1 0 +2 0 2 0 +3 0 3 0 +4 1 0 0 +5 1 1 0 +6 1 2 0 +7 1 3 0 +===== Placement on packages ===== +Package Id. Core Id. Processors +0 0,1,2,3 (0,4)(1,5)(2,6)(3,7) + +===== Cache sharing ===== +Cache Size Processors +L1 32 KB (0,4)(1,5)(2,6)(3,7) +L2 256 KB (0,4)(1,5)(2,6)(3,7) +L3 8 MB (0,1,2,3,4,5,6,7) + +I suspect that this may be related to the VSR register conversion. Can you try applying all of the patches below on top of 4.0 to see if they resolve the issue? + +https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg01254.html +https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg01256.html +https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg01257.html +https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg01260.html + + +I applied the four patches you indicated and the image boots up and runs. Everything seems to be working now. Thank You. + +I also have a regression issue between 3.1.0 and 4.0.0 (actually latest git) on qemu-system-ppc64 but it involves an AIX guest instead (fail to boot). Should I open a new ticket or hop on this one ? + +David has already queued the patches in his ppc-for-4.1 branch at https://github.com/dgibson/qemu/commits/ppc-for-4.1 so they will get merged soon. If you're working with git then I'd try testing the queued branch first and see if that resolves the issue. 
+ +Once the patches have been applied to master we'll add a CC to the stable list so the fixes will be included in the next 4.0 update. + +Same thing here using https://github.com/dgibson/qemu/commits/ppc-for-4.1 ... It might be a completely different problem (athough it looks like a MMU problem). + +Is this fixed now? Can we mark as fix committed? + +It is fixed with the 4.1.0 release. Thank you. + diff --git a/results/classifier/105/vnc/1856549 b/results/classifier/105/vnc/1856549 new file mode 100644 index 000000000..b81213732 --- /dev/null +++ b/results/classifier/105/vnc/1856549 @@ -0,0 +1,128 @@ +vnc: 0.751 +graphic: 0.748 +KVM: 0.723 +other: 0.717 +semantic: 0.626 +assembly: 0.583 +instruction: 0.574 +device: 0.555 +network: 0.542 +socket: 0.540 +boot: 0.513 +mistranslation: 0.361 + +qemu-4.2.0/hw/misc/mac_via.c: 2 * bad test ? + +1. + +qemu-4.2.0/hw/misc/mac_via.c:417:27: style: Expression is always false because 'else if' condition matches previous condition at line 412. [multiCondition] + + } else if ((m->data_out & 0xf3) == 0xa1) { +... + } else if ((m->data_out & 0xf3) == 0xa1) { + +2. + +qemu-4.2.0/hw/misc/mac_via.c:467:27: style: Expression is always false because 'else if' condition matches previous condition at line 463. [multiCondition] + +Duplicate. + +gcc compiler flag -Wduplicated-cond will catch this kind of problem. + +You might want to switch it on in your builds. It has been available for over a year. + + +On 12/16/19 12:58 PM, dcb wrote: +> gcc compiler flag -Wduplicated-cond will catch this kind of problem. + +Interesting, thanks for sharing! + +> +> You might want to switch it on in your builds. It has been available for +> over a year. +> + + + +This code seems to emulate a RTC device connected via I2C to the VIA chipset. 
+ +This might be the expected code (simply looking at this file, without checking the datasheet): +-- >8 -- +--- a/hw/misc/mac_via.c ++++ b/hw/misc/mac_via.c +@@ -409,7 +409,7 @@ static void via1_rtc_update(MacVIAState *m) + } else if (m->data_out == 0x8d) { /* seconds register 3 */ + m->data_in = (time >> 24) & 0xff; + m->data_in_cnt = 8; +- } else if ((m->data_out & 0xf3) == 0xa1) { ++ } else if ((m->data_out & 0xf3) == 0xa3) { + /* PRAM address 0x10 -> 0x13 */ + int addr = (m->data_out >> 2) & 0x03; + m->data_in = v1s->PRAM[addr]; +@@ -460,7 +460,7 @@ static void via1_rtc_update(MacVIAState *m) + } else if (m->cmd == 0x35) { + /* Write Protect register */ + m->wprotect = m->data_out & 1; +- } else if ((m->cmd & 0xf3) == 0xa1) { ++ } else if ((m->cmd & 0xf3) == 0xa3) { + /* PRAM address 0x10 -> 0x13 */ + int addr = (m->cmd >> 2) & 0x03; + v1s->PRAM[addr] = m->data_out; +--- + +This file won a "/* TODO port to I2CBus */" comment :) + +I think VIA RTC access method has been implemented earlier (Classic Mac, 1984-1989) than the I2C specification, so I'm not sure we can/should port this to an I2C bus. 
+ +Specs are (from Apple Macintosh Family Hardware Reference Chapter 2, Classic Macintosh Processor and Control) + +z0000001 Seconds register 0 (lowest-order byte) +z0000101 Seconds register 1 +z0001001 Seconds register 2 +z0001101 Seconds register 3 (highest-order byte) +00110001 Test register (write-only) +00110101 Write-Protect Register (write-only) +z010aa01 RAM address 100aa ($10-$13) (first 20 bytes only) +z1aaaa01 RAM address 0aaaa ($00-$0F) (first 20 bytes only) +z0111aaa Extended memory designator and sector number + (Macintosh 512K enhanced and Macintosh plus only) + +For a read request, z=1, for a write z=0 +The letter a indicates bits whose value depends on what parameter RAM byte you want to address + +So I think the mask/values should be: + +diff --git a/hw/misc/mac_via.c b/hw/misc/mac_via.c +index f3f130ad96cc..7402cf3f1ee8 100644 +--- a/hw/misc/mac_via.c ++++ b/hw/misc/mac_via.c +@@ -414,7 +414,7 @@ static void via1_rtc_update(MacVIAState *m) + int addr = (m->data_out >> 2) & 0x03; + m->data_in = v1s->PRAM[addr]; + m->data_in_cnt = 8; +- } else if ((m->data_out & 0xf3) == 0xa1) { ++ } else if ((m->data_out & 0xc3) == 0xc1) { + /* PRAM address 0x00 -> 0x0f */ + int addr = (m->data_out >> 2) & 0x0f; + m->data_in = v1s->PRAM[addr]; +@@ -460,11 +460,11 @@ static void via1_rtc_update(MacVIAState *m) + } else if (m->cmd == 0x35) { + /* Write Protect register */ + m->wprotect = m->data_out & 1; +- } else if ((m->cmd & 0xf3) == 0xa1) { ++ } else if ((m->cmd & 0xf3) == 0x21) { + /* PRAM address 0x10 -> 0x13 */ + int addr = (m->cmd >> 2) & 0x03; + v1s->PRAM[addr] = m->data_out; +- } else if ((m->cmd & 0xf3) == 0xa1) { ++ } else if ((m->cmd & 0xc3) == 0x41) { + /* PRAM address 0x00 -> 0x0f */ + int addr = (m->cmd >> 2) & 0x0f; + v1s->PRAM[addr] = m->data_out; + + +Patch posted: https://<email address hidden>/msg666836.html + +Fixed here: +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=b2619c158ab5 + diff --git a/results/classifier/105/vnc/1867786 
b/results/classifier/105/vnc/1867786 new file mode 100644 index 000000000..033568631 --- /dev/null +++ b/results/classifier/105/vnc/1867786 @@ -0,0 +1,148 @@ +assembly: 0.970 +vnc: 0.967 +semantic: 0.967 +mistranslation: 0.964 +device: 0.960 +other: 0.957 +socket: 0.954 +network: 0.952 +instruction: 0.947 +graphic: 0.943 +boot: 0.920 +KVM: 0.882 + +Qemu PPC64 freezes with multi-core CPU + +I installed Debian 10 on a Qemu PPC64 VM running with the following flags: + +qemu-system-ppc64 \ + -nographic -nodefaults -monitor pty -serial stdio \ + -M pseries -cpu POWER9 -smp cores=4,threads=1 -m 4G \ + -drive file=debian-ppc64el-qemu.qcow2,format=qcow2,if=virtio \ + -netdev user,id=network01,$ports -device rtl8139,netdev=network01 \ + + +Within a couple minutes on any operation (could be a Go application or simply changing the hostname with hostnamectl, the VM freezes and prints this on the console: + +``` +root@debian:~# [ 950.428255] rcu: INFO: rcu_sched self-detected stall on CPU +[ 950.428453] rcu: 3-....: (5318 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=2544 +[ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] + +Message from syslogd@debian at Mar 17 11:35:24 ... + kernel:[ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] +[ 980.110018] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-... } 5276 jiffies s: 93 root: 0x8/. +[ 980.111177] rcu: blocking rcu_node structures: +[ 1013.442268] rcu: INFO: rcu_sched self-detected stall on CPU +[ 1013.442365] rcu: 3-....: (21071 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=9342 +``` + +If I change to 1 core on the command line, I haven't seen these freezes. + +Is this with KVM or with TCG? +What is your hardware configuration? + +It's soft emulation, running Qemu 4.2.50 (from master branch) on MacOS Mojave. + +Do you have the problem with 4.2.0? +Can you identify the commit introducing the problem? 
+ +I just reverted to 4.2.0 and it works fine. No freezes for the past hour. + +❯ qemu-system-ppc64 --version +QEMU emulator version 4.2.0 +Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers + +Couldn't bisect to find the bad commit. + +Carlos + +Thank you for the test. I'm going to try to reproduce the problem and bisect. + +I'm not able to reproduce (kernel 4.19.0-8-powerpc64le, qemu id d649689a8ecb) + +What is the kernel version in the guest? +What is the QEMU commit id you used to test with 4.2.50? + +Hi Laurent, I'm on a MacOS Mojave running Qemu installed by homebrew from master branch on the day I've opened the bug. + +The option to install was: `brew install --HEAD qemu -s --verbose`. + +Maybe it's a Mac related problem? + +Hi, any news about this? Can I provide any additional info since it might be a Mac issue. +Thanks + +I just built from latest master and got the kernel trace below. + +❯ qemu-system-ppc64 --version +QEMU emulator version 4.2.90 (v4.2.0-2811-g83019e81d1-dirty) +Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers + + +qemu-system-ppc64 \ + -nographic -nodefaults -monitor pty -serial stdio \ + -M pseries -cpu POWER9 -smp cores=4,threads=1 -m 4G \ + -drive file=debian-ppc64el-qemu.qcow2,format=qcow2,if=virtio \ + -netdev user,id=network01,hostfwd=tcp::$LocalSSHPort-:22 -device rtl8139,netdev=network01 \ + + +[ 376.219450] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0] +[ 376.226712] Modules linked in: ctr(E) vmx_crypto(E) gf128mul(E) sunrpc(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) virtio_blk(E) 8139too(E) virtio_pci(E) virtio_ring(E) 8139cp(E) virtio(E) mii(E) +[ 376.235692] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E 5.5.0-rc5-powerpc64le #1 Debian 5.5~rc5-1~exp1 +[ 376.236245] NIP: c00000000000af8c LR: c000000000019664 CTR: c000000000af2c80 +[ 376.236365] REGS: c0000000fffcf920 TRAP: 0901 Tainted: G E (5.5.0-rc5-powerpc64le Debian 5.5~rc5-1~exp1) +[ 376.236376] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44002248 XER: 00000000 +[ 376.236479] CFAR: c000000000af2ce0 IRQMASK: 0 + GPR00: c000000000af2e38 c0000000fffcfbb0 c000000001365700 0000000000000500 + GPR04: 00000000fef90000 0000002be1f69c00 0000002beaa729fa c0000000fffec880 + GPR08: 0000000400000000 00000000000080ff 0000000000000001 c0080000004c6ff0 + GPR12: 0000000000002000 c0000000fffec880 +[ 376.238452] NIP [c00000000000af8c] replay_interrupt_return+0x0/0x4 +[ 376.238488] LR [c000000000019664] arch_local_irq_restore.part.0+0x54/0x70 +[ 376.238984] Call Trace: +[ 376.240707] [c0000000fffcfbb0] [c0000000008ce910] napi_gro_receive+0x1e0/0x210 (unreliable) +[ 376.240824] [c0000000fffcfbd0] [c000000000af2e38] _raw_spin_unlock_irqrestore+0x98/0xb0 +[ 376.242114] [c0000000fffcfbf0] [c0080000004c5588] cp_rx_poll+0x580/0x610 [8139cp] +[ 376.242131] [c0000000fffcfcf0] [c0000000008cf6c8] net_rx_action+0x1f8/0x550 +[ 376.242139] [c0000000fffcfe10] [c000000000af3a8c] __do_softirq+0x16c/0x3d8 +[ 376.242172] [c0000000fffcff30] [c0000000001329e8] irq_exit+0xd8/0x120 +[ 376.242181] [c0000000fffcff60] [c000000000019fb4] __do_irq+0x84/0x1c0 +[ 376.242193] [c0000000fffcff90] [c00000000002cbec] call_do_irq+0x14/0x24 +[ 376.242201] [c0000000fd4b7980] [c00000000001a178] do_IRQ+0x88/0xf0 +[ 376.242209] [c0000000fd4b79c0] [c000000000008d98] hardware_interrupt_common+0x158/0x160 +[ 376.242243] --- interrupt: 501 at 
plpar_hcall_norets+0x1c/0x28 + LR = check_and_cede_processor+0x48/0x60 +[ 376.243892] [c0000000fd4b7cc0] [c0000000fd4b7cf0] 0xc0000000fd4b7cf0 (unreliable) +[ 376.243922] [c0000000fd4b7d20] [c00000000086c710] shared_cede_loop+0x50/0x160 +[ 376.243942] [c0000000fd4b7d50] [c000000000868844] cpuidle_enter_state+0xa4/0x590 +[ 376.243953] [c0000000fd4b7dd0] [c000000000868dcc] cpuidle_enter+0x4c/0x70 +[ 376.243983] [c0000000fd4b7e10] [c000000000177d4c] call_cpuidle+0x4c/0x90 +[ 376.243991] [c0000000fd4b7e30] [c000000000178358] do_idle+0x2f8/0x400 +[ 376.243998] [c0000000fd4b7ed0] [c0000000001786a8] cpu_startup_entry+0x38/0x40 +[ 376.244011] [c0000000fd4b7f00] [c00000000004e910] start_secondary+0x640/0x670 +[ 376.244020] [c0000000fd4b7f90] [c00000000000b354] start_secondary_prolog+0x10/0x14 +[ 376.244093] Instruction dump: +[ 376.244751] 7d200026 618c8000 2c030900 4182e348 2c030500 4182dcd0 2c030f00 4182f318 +[ 376.244797] 2c030a00 4182ffc8 60000000 60000000 <4e800020> 7c781b78 480003d9 480003f1 + +Could you try to change the network card, with something like "-device e1000e,netdev=network01" or "-device virtio-net-pci,netdev=network01" or "-device spapr-vlan,netdev=network01"? + +Hi Laurent, confirm that after changing the network adapter to the e1000e it worked flawlessly for hours with 4 cores on Macbook Pro. + +Thanks! + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting older bugs to "Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/105/vnc/1870911 b/results/classifier/105/vnc/1870911 new file mode 100644 index 000000000..3b69dee01 --- /dev/null +++ b/results/classifier/105/vnc/1870911 @@ -0,0 +1,104 @@ +vnc: 0.877 +other: 0.876 +instruction: 0.876 +mistranslation: 0.874 +semantic: 0.853 +socket: 0.840 +graphic: 0.794 +device: 0.783 +network: 0.713 +assembly: 0.701 +KVM: 0.622 +boot: 0.608 + +QEMU Crashes on Launch, Windows + +Hi, + +I an having no issues up to (and including) v5.0.0-rc0, but when I move to rc1 ... it won't even execute in Windows. If I just try to, for example, run + +qemu-system-x86_64.exe --version + +No output, it just exits. This seems to be new with this version. + +Thanks! + +On Sun, Apr 5, 2020 at 3:38 PM Russell Morris <email address hidden> wrote: + +> Public bug reported: +> +> Hi, +> +> I an having no issues up to (and including) v5.0.0-rc0, but when I move +> to rc1 ... it won't even execute in Windows. If I just try to, for +> example, run +> +> qemu-system-x86_64.exe --version +> +> No output, it just exits. This seems to be new with this version. +> +> Thanks! +> +> ** Affects: qemu +> Importance: Undecided +> Status: New +> +> -- +> You received this bug notification because you are a member of qemu- +> devel-ml, which is subscribed to QEMU. +> https://bugs.launchpad.net/bugs/1870911 +> +> Title: +> QEMU Crashes on Launch, Windows +> +> Status in QEMU: +> New +> +> Bug description: +> Hi, +> +> I an having no issues up to (and including) v5.0.0-rc0, but when I +> move to rc1 ... it won't even execute in Windows. If I just try to, +> for example, run +> +> qemu-system-x86_64.exe --version +> +> No output, it just exits. This seems to be new with this version. +> +> Thanks! +> +> To manage notifications about this bug go to: +> https://bugs.launchpad.net/qemu/+bug/1870911/+subscriptions +> +> + +Happens to me too with qemu-system-ppc. 
Earlier thread is here: +https://lists.nongnu.org/archive/html/qemu-ppc/2020-04/msg00027.html + +For now compiling with --disable-pie will produce a running executable. + +Best, +Howard + + +Thanks for the pointer! Yep, same here - if I --disable-pie, rebuild and try again => now no crash, at least checking --version ;-). + +Will continue testing here, report back if I see any other oddities. + +Thanks again. + +The QEMU project is currently considering to move its bug tracking to +another system. For this we need to know which bugs are still valid +and which could be closed already. Thus we are setting older bugs to +"Incomplete" now. + +If you still think this bug report here is valid, then please switch +the state back to "New" within the next 60 days, otherwise this report +will be marked as "Expired". Or please mark it as "Fix Released" if +the problem has been solved with a newer version of QEMU already. + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1872790 b/results/classifier/105/vnc/1872790 new file mode 100644 index 000000000..089faf0d5 --- /dev/null +++ b/results/classifier/105/vnc/1872790 @@ -0,0 +1,84 @@ +vnc: 0.701 +mistranslation: 0.691 +other: 0.645 +KVM: 0.619 +device: 0.615 +instruction: 0.595 +semantic: 0.530 +assembly: 0.504 +graphic: 0.503 +boot: 0.492 +socket: 0.450 +network: 0.440 + +empty qcow2 + +I plugged multiple qcow2 to a Windows guest. 
In the Windows disk manager all disks are listed correctly, with their data and their real size, and I can even browse all the files in Explorer. + +In third-party disk managers (all of them), only the C:\ disk behaves normally; all the other attached qcow2 disks are shown as fully unallocated, so I can't manipulate them. + +I want to move some partitions and create others, but in the Windows disk manager I can't extend or create partitions, and in the third-party tools I don't see the partitions at all. + +Even guestfs doesn't recognize any partition table: `libguestfs: error: inspect_os: /dev/sda: not a partitioned device` + +It sounds like maybe these disks have been partitioned in a format that only Windows understands. Can you tell me what the Windows disk manager claims the partition table format to be? + +If you still think that maybe there's a QEMU bug, please give more details: + +- host kernel version + +- qemu version + +- qemu command line + +- how were these qcow2 files created? + +- What version of qcow2 file does `qemu-img info` say they are? + +- What version of Windows? (10?) + +- Can you name one of the third-party disk managers so we can try to reproduce it? + + +WDM claims it to be MBR + +Linux 5.6.14 + +QEMU 5.0.0-6 + +`nobody 19023 109 21.1 7151512 3462300 ?
Sl 13:18 0:32 /usr/bin/qemu-system-x86_64 -name guest=win10machine,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-win10machine/master-key.aes -machine pc-q35-4.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Haswell-noTSX,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff -m 4096 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid db88f5fc-47f0-439c-9192-a5991df2d8f8 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=34,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/home/user/nvme0n1/p1/win10.qcow2","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-3-format","read-only":false,"driver":"qcow2","file":"libvirt-3-storage","backing":null} -device ide-hd,bus=ide.0,drive=libvirt-3-format,id=sata0-0-0,bootindex=1 -blockdev {"driver":"file","filename":"/home/user/nvme0n1/p1/dump1.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device ide-hd,bus=ide.1,drive=libvirt-2-format,id=sata0-0-1 -blockdev {"driver":"file","filename":"/home/user/nvme0n1/p1/dump2.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null} -device ide-hd,bus=ide.2,drive=libvirt-1-format,id=sata0-0-2 -netdev tap,fd=36,id=hostnet0 -device e1000e,netdev=hostnet0,id=net0,mac=52:54:00:b5:3a:ca,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on` + +The qcow2 of the guest was created in VMM and the qcow2 that I can't manipulate was created with (if I remember well) something like `qemu-img convert /dev/sda2 -O image.qcow2` from a Windows physical machine + +Format specific information: + compat: 1.1 + lazy refcounts: false + refcount bits: 16 + corrupt: false + +W10 + +All the managers that i've tried were the same, but you can try for example 
MiniTool or EaseUS + +WDM claims it to be a MBR + +Linux 5.6.14 + +QEMU 5.0.0-6 + +`nobody 19023 109 21.1 7151512 3462300 ? Sl 13:18 0:32 /usr/bin/qemu-system-x86_64 -name guest=win10machine,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-win10machine/master-key.aes -machine pc-q35-4.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Haswell-noTSX,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff -m 4096 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid db88f5fc-47f0-439c-9192-a5991df2d8f8 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=34,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/home/user/nvme0n1/p1/win10.qcow2","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-3-format","read-only":false,"driver":"qcow2","file":"libvirt-3-storage","backing":null} -device ide-hd,bus=ide.0,drive=libvirt-3-format,id=sata0-0-0,bootindex=1 -blockdev 
{"driver":"file","filename":"/home/user/nvme0n1/p1/dump1.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device ide-hd,bus=ide.1,drive=libvirt-2-format,id=sata0-0-1 -blockdev {"driver":"file","filename":"/home/user/nvme0n1/p1/dump2.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null} -device ide-hd,bus=ide.2,drive=libvirt-1-format,id=sata0-0-2 -netdev tap,fd=36,id=hostnet0 -device e1000e,netdev=hostnet0,id=net0,mac=52:54:00:b5:3a:ca,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on` + +The qcow2 of the guest was created in VMM and the qcow2 that I can't manipulate was created with (if I remember well) something like `qemu-img convert -f raw /dev/sda2 -O image.qcow2` from a Windows physical machine + +Format specific information: + 
compat: 1.1 + lazy refcounts: false + refcount bits: 16 + corrupt: false + +W10 + +All the managers that i've tried were the same, but you can try for example MiniTool or EaseUS + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1888431 b/results/classifier/105/vnc/1888431 new file mode 100644 index 000000000..21581216a --- /dev/null +++ b/results/classifier/105/vnc/1888431 @@ -0,0 +1,98 @@ +vnc: 0.672 +instruction: 0.634 +graphic: 0.613 +other: 0.537 +assembly: 0.527 +KVM: 0.501 +semantic: 0.490 +device: 0.485 +socket: 0.481 +mistranslation: 0.453 +boot: 0.409 +network: 0.374 + +v5.1.0-rc1 build fails on Mac OS X 10.11.6 + +Hi all, + +build of tag v5.1.0-rc1 fails on Mac OS X 10.11.6 (El Capitan) with the following error: + +git clone https://git.qemu.org/git/qemu.git + <output elided, but all OK> +cd qemu +git submodule init + <output elided, but all OK> +git submodule update --recursive + <output elided, but all OK> +./configure + <output elided, but all OK> +make + <output elided, but all OK up until fail> + + CC trace/control.o +In file included from trace/control.c:29: +In file included from /Users/rtb/src/qemu/include/monitor/monitor.h:4: +In file included from /Users/rtb/src/qemu/include/block/block.h:4: +In file included from /Users/rtb/src/qemu/include/block/aio.h:23: +/Users/rtb/src/qemu/include/qemu/timer.h:843:9: warning: implicit declaration of function 'clock_gettime' is invalid in C99 + [-Wimplicit-function-declaration] + clock_gettime(CLOCK_MONOTONIC, &ts); + ^ +/Users/rtb/src/qemu/include/qemu/timer.h:843:23: error: use of undeclared identifier 'CLOCK_MONOTONIC' + clock_gettime(CLOCK_MONOTONIC, &ts); + ^ +1 warning and 1 error generated. 
+make: *** [trace/control.o] Error 1 + + +rtb:qemu rtb$ git log -n1 +commit c8004fe6bbfc0d9c2e7b942c418a85efb3ac4b00 (HEAD -> master, tag: v5.1.0-rc1, origin/master, origin/HEAD) +Author: Peter Maydell <email address hidden> +Date: Tue Jul 21 20:28:59 2020 +0100 + + Update version for v5.1.0-rc1 release + + Signed-off-by: Peter Maydell <email address hidden> +rtb:qemu rtb$ + + +Please find the full output of all the commands (from git clone of the repo, to the make) in the attached file "buildfail.txt". + +Thank you! + +Best regards, + +Robert Ball + + + +I'm sorry, but the QEMU project only supports the two most recent versions of macOS (see https://www.qemu.org/docs/master/system/build-platforms.html#macos ), i.e. everything that is older than macOS 10.14 is not supported anymore. + +OK, thank you for pointing that out. + +Question, can you help me identify the most recent release/tag/commit that I could back up to which would support Mac OS X 10.11.6? + +Thank you! + +Best regards, + +Robert Ball + + +Hmm, let's see ... the work-arounds for old Mac OS X versions have been removed here: + +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=483644c25b932360018 + +It mentions that this commit broke compilation earlier: + +https://git.qemu.org/?p=qemu.git;a=commitdiff;h=50290c002c045280f8d + +... so the newest version that still might be compilable is v4.0. + +Fantastic. Thank you Thomas, greatly appreciated! + +Best regards, + +Robert Ball + + diff --git a/results/classifier/105/vnc/1893744 b/results/classifier/105/vnc/1893744 new file mode 100644 index 000000000..344ca42c6 --- /dev/null +++ b/results/classifier/105/vnc/1893744 @@ -0,0 +1,140 @@ +vnc: 0.232 +KVM: 0.179 +other: 0.158 +semantic: 0.143 +instruction: 0.133 +assembly: 0.123 +device: 0.112 +graphic: 0.111 +socket: 0.111 +mistranslation: 0.109 +boot: 0.094 +network: 0.089 + +meson: incomplete 'make help' + +Since the meson switch, 'make help' doesn't list various targets.
+ +Diff before/after: + +--- + Generic targets: + all - Build all + dir/file.o - Build specified target only + install - Install QEMU + ctags/TAGS - Generate tags file for editors + cscope - Generate cscope index +- +-Architecture specific targets: +- aarch64-softmmu/all - Build for aarch64-softmmu +- alpha-softmmu/all - Build for alpha-softmmu +- arm-softmmu/all - Build for arm-softmmu +- avr-softmmu/all - Build for avr-softmmu +- cris-softmmu/all - Build for cris-softmmu +- hppa-softmmu/all - Build for hppa-softmmu +- i386-softmmu/all - Build for i386-softmmu +- lm32-softmmu/all - Build for lm32-softmmu +- m68k-softmmu/all - Build for m68k-softmmu +- microblazeel-softmmu/all - Build for microblazeel-softmmu +- microblaze-softmmu/all - Build for microblaze-softmmu +- mips64el-softmmu/all - Build for mips64el-softmmu +- mips64-softmmu/all - Build for mips64-softmmu +- mipsel-softmmu/all - Build for mipsel-softmmu +- mips-softmmu/all - Build for mips-softmmu +- moxie-softmmu/all - Build for moxie-softmmu +- nios2-softmmu/all - Build for nios2-softmmu +- or1k-softmmu/all - Build for or1k-softmmu +- ppc64-softmmu/all - Build for ppc64-softmmu +- ppc-softmmu/all - Build for ppc-softmmu +- riscv32-softmmu/all - Build for riscv32-softmmu +- riscv64-softmmu/all - Build for riscv64-softmmu +- rx-softmmu/all - Build for rx-softmmu +- s390x-softmmu/all - Build for s390x-softmmu +- sh4eb-softmmu/all - Build for sh4eb-softmmu +- sh4-softmmu/all - Build for sh4-softmmu +- sparc64-softmmu/all - Build for sparc64-softmmu +- sparc-softmmu/all - Build for sparc-softmmu +- tricore-softmmu/all - Build for tricore-softmmu +- unicore32-softmmu/all - Build for unicore32-softmmu +- x86_64-softmmu/all - Build for x86_64-softmmu +- xtensaeb-softmmu/all - Build for xtensaeb-softmmu +- xtensa-softmmu/all - Build for xtensa-softmmu +- aarch64_be-linux-user/all - Build for aarch64_be-linux-user +- aarch64-linux-user/all - Build for aarch64-linux-user +- alpha-linux-user/all - Build for 
alpha-linux-user +- armeb-linux-user/all - Build for armeb-linux-user +- arm-linux-user/all - Build for arm-linux-user +- cris-linux-user/all - Build for cris-linux-user +- hppa-linux-user/all - Build for hppa-linux-user +- i386-linux-user/all - Build for i386-linux-user +- m68k-linux-user/all - Build for m68k-linux-user +- microblazeel-linux-user/all - Build for microblazeel-linux-user +- microblaze-linux-user/all - Build for microblaze-linux-user +- mips64el-linux-user/all - Build for mips64el-linux-user +- mips64-linux-user/all - Build for mips64-linux-user +- mipsel-linux-user/all - Build for mipsel-linux-user +- mips-linux-user/all - Build for mips-linux-user +- mipsn32el-linux-user/all - Build for mipsn32el-linux-user +- mipsn32-linux-user/all - Build for mipsn32-linux-user +- nios2-linux-user/all - Build for nios2-linux-user +- or1k-linux-user/all - Build for or1k-linux-user +- ppc64abi32-linux-user/all - Build for ppc64abi32-linux-user +- ppc64le-linux-user/all - Build for ppc64le-linux-user +- ppc64-linux-user/all - Build for ppc64-linux-user +- ppc-linux-user/all - Build for ppc-linux-user +- riscv32-linux-user/all - Build for riscv32-linux-user +- riscv64-linux-user/all - Build for riscv64-linux-user +- s390x-linux-user/all - Build for s390x-linux-user +- sh4eb-linux-user/all - Build for sh4eb-linux-user +- sh4-linux-user/all - Build for sh4-linux-user +- sparc32plus-linux-user/all - Build for sparc32plus-linux-user +- sparc64-linux-user/all - Build for sparc64-linux-user +- sparc-linux-user/all - Build for sparc-linux-user +- tilegx-linux-user/all - Build for tilegx-linux-user +- x86_64-linux-user/all - Build for x86_64-linux-user +- xtensaeb-linux-user/all - Build for xtensaeb-linux-user +- xtensa-linux-user/all - Build for xtensa-linux-user +- +-Helper targets: +- fsdev/virtfs-proxy-helper - Build virtfs-proxy-helper +- scsi/qemu-pr-helper - Build qemu-pr-helper +- qemu-bridge-helper - Build qemu-bridge-helper +- vhost-user-gpu - Build vhost-user-gpu 
+- virtiofsd - Build virtiofsd +- +-Tools targets: +- qemu-ga - Build qemu-ga tool +- qemu-keymap - Build qemu-keymap tool +- elf2dmp - Build elf2dmp tool +- ivshmem-client - Build ivshmem-client tool +- ivshmem-server - Build ivshmem-server tool +- qemu-nbd - Build qemu-nbd tool +- qemu-storage-daemon - Build qemu-storage-daemon tool +- qemu-img - Build qemu-img tool +- qemu-io - Build qemu-io tool +- qemu-edid - Build qemu-edid tool ++ sparse - Run sparse on the QEMU source + + Cleaning targets: + clean - Remove most generated files but keep +the config +@@ -105,7 +20,7 @@ + vm-help - Help about targets running tests +inside VM + + Documentation targets: +- html info pdf txt - Build documentation in specified format ++ html info pdf txt man - Build documentation in specified format + + make [targets] - (quiet build, default) + make V=1 [targets] - (verbose build) +--- + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'invalid' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/227 + + diff --git a/results/classifier/105/vnc/1903752 b/results/classifier/105/vnc/1903752 new file mode 100644 index 000000000..5ad764789 --- /dev/null +++ b/results/classifier/105/vnc/1903752 @@ -0,0 +1,35 @@ +vnc: 0.764 +other: 0.714 +graphic: 0.705 +device: 0.631 +mistranslation: 0.596 +semantic: 0.458 +instruction: 0.452 +network: 0.426 +socket: 0.256 +boot: 0.122 +assembly: 0.042 +KVM: 0.023 + +qemu-system-avr error: qemu-system-avr: execution left flash memory + +I compiled QEMU 5.1 from source with target avr-softmmu. 
Running demo.elf from https://github.com/seharris/qemu-avr-tests/blob/master/free-rtos/Demo/AVR_ATMega2560_GCC/demo.elf (linked from https://www.qemu.org/docs/master/system/target-avr.html) yields the following error: + +$ ./qemu-5.1.0/avr-softmmu/qemu-system-avr -machine mega2560 -bios demo.elf +VNC server running on 127.0.0.1:5900 +qemu-system-avr: execution left flash memory +Aborted (core dumped) + +I compiled QEMU on Ubuntu Server 20.10 with gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 + +The error is not related to demo.elf. Running qemu-system-avr without a program (no "-bios program.elf" option) also yields the same error: + +$ ./qemu-5.1.0/avr-softmmu/qemu-system-avr -machine mega2560 +VNC server running on 127.0.0.1:5900 +qemu-system-avr: execution left flash memory +Aborted (core dumped) + +I cannot reproduce this. + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1906516 b/results/classifier/105/vnc/1906516 new file mode 100644 index 000000000..6318e0892 --- /dev/null +++ b/results/classifier/105/vnc/1906516 @@ -0,0 +1,137 @@ +vnc: 0.799 +assembly: 0.675 +instruction: 0.668 +KVM: 0.667 +graphic: 0.629 +socket: 0.577 +other: 0.567 +mistranslation: 0.563 +network: 0.530 +device: 0.529 +semantic: 0.492 +boot: 0.455 + +[RISCV] sfence.vma need to end the translation block + +QEMU emulator version 5.0.0 + +sfence.vma flushes the TLB, so the translation block should end after this instruction.
The following code only works in single step mode: +``` +relocate: + li a0, OFFSET + + la t0, 1f + add t0, t0, a0 + csrw stvec, t0 + + la t0, early_pgtbl + srl t0, t0, PAGE_SHIFT + li t1, SATP_SV39 + or t0, t1, t0 + + csrw satp, t0 +1: + sfence.vma + la t0, trap_s + csrw stvec, t0 + ret +``` + +In this code, I want to relocate pc to a virtual address with the OFFSET prefix. Before the write to satp, pc runs at a physical address; stvec has been set to label 1 with the virtual prefix, and the virtual address is mapped in early_pgtbl. After the write to satp, a page fault is raised and pc is set to the virtual address of label 1. + +The problem is that, in this situation, the translation block does not end after sfence.vma, so stvec is set to trap_s: + +``` +---------------- +IN: +Priv: 1; Virt: 0 +0x00000000800000dc: 00a080b3 add ra,ra,a0 +0x00000000800000e0: 00007297 auipc t0,28672 # 0x800070e0 +0x00000000800000e4: f2028293 addi t0,t0,-224 +0x00000000800000e8: 00c2d293 srli t0,t0,12 +0x00000000800000ec: fff0031b addiw t1,zero,-1 +0x00000000800000f0: 03f31313 slli t1,t1,63 +0x00000000800000f4: 005362b3 or t0,t1,t0 +0x00000000800000f8: 18029073 csrrw zero,satp,t0 + +---------------- +IN: +Priv: 1; Virt: 0 +0x00000000800000fc: 12000073 sfence.vma zero,zero +0x0000000080000100: 00000297 auipc t0,0 # 0x80000100 +0x0000000080000104: 1c828293 addi t0,t0,456 +0x0000000080000108: 10529073 csrrw zero,stvec,t0 + +riscv_raise_exception: 12 +riscv_raise_exception: 12 +riscv_raise_exception: 12 +riscv_raise_exception: 12 +...
+``` + +So the program crashes, but it works in single step mode: +``` +---------------- +IN: +Priv: 1; Virt: 0 +0x00000000800000f8: 18029073 csrrw zero,satp,t0 + +---------------- +IN: +Priv: 1; Virt: 0 +0x00000000800000fc: 12000073 sfence.vma zero,zero + +riscv_raise_exception: 12 +---------------- +IN: +Priv: 1; Virt: 0 +0xffffffff800000fc: 12000073 sfence.vma zero,zero + +---------------- +IN: +Priv: 1; Virt: 0 +0xffffffff80000100: 00000297 auipc t0,0 # 0xffffffff80000100 + +``` +The pc is set to label 1 instead of trap_s. + +I tried applying the code from fence.i in trans_rvi.inc.c to sfence.vma: +``` + tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn); + exit_tb(ctx); + ctx->base.is_jmp = DISAS_NORETURN; +``` +This code ends the translation block. Since I'm not a QEMU developer, I'm not sure whether this is the correct method. + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired").
We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. + + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/1912780 b/results/classifier/105/vnc/1912780 new file mode 100644 index 000000000..9ae95bc69 --- /dev/null +++ b/results/classifier/105/vnc/1912780 @@ -0,0 +1,197 @@ +vnc: 0.764 +KVM: 0.752 +other: 0.728 +mistranslation: 0.707 +graphic: 0.697 +instruction: 0.669 +boot: 0.661 +device: 0.658 +semantic: 0.656 +assembly: 0.631 +network: 0.610 +socket: 0.584 + +QEMU: Null Pointer Failure in fdctrl_read() in hw/block/fdc.c + +[via qemu-security list] + +This is Gaoning Pan from Zhejiang University & Ant Security Light-Year Lab. +I found a null pointer issue located in fdctrl_read() in hw/block/fdc.c. +This flaw allows a malicious guest user or process to cause a denial of service condition. + +This issue was discovered in the latest QEMU 5.2.0. When using a floppy device, get_drv() selects one of +several drives depending on fdctrl->cur_drv. But not all drives are +initialized properly, leaving fdctrl->drives[0]->blk as NULL.
So when the drive is used in +blk_pread(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo, BDRV_SECTOR_SIZE) at line 1918, +a null pointer access is triggered, resulting in denial of service. My reproduction environment is as follows: + + Host: ubuntu 18.04 + Guest: ubuntu 18.04 + +My boot command is as follows: + + qemu-system-x86_64 -enable-kvm -boot c -m 2G -drive format=qcow2,file=./ubuntu.img \ + -nic user,hostfwd=tcp:0.0.0.0:5555-:22 -device floppy,unit=1,drive=mydrive \ + -drive id=mydrive,file=null-co://,size=2M,format=raw,if=none -display none + +ASAN output is as follows: +================================================================= +==14688==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000034c (pc 0x5636eee9bbaf bp 0x7ff2a53fdea0 sp 0x7ff2a53fde90 T3) +==14688==The signal is caused by a WRITE memory access. +==14688==Hint: address points to the zero page. + #0 0x5636eee9bbae in blk_inc_in_flight ../block/block-backend.c:1356 + #1 0x5636eee9b766 in blk_prw ../block/block-backend.c:1328 + #2 0x5636eee9cd76 in blk_pread ../block/block-backend.c:1491 + #3 0x5636ee1adf24 in fdctrl_read_data ../hw/block/fdc.c:1918 + #4 0x5636ee1a6654 in fdctrl_read ../hw/block/fdc.c:935 + #5 0x5636eebb84c8 in portio_read ../softmmu/ioport.c:179 + #6 0x5636ee9848c5 in memory_region_read_accessor ../softmmu/memory.c:442 + #7 0x5636ee9855c2 in access_with_adjusted_size ../softmmu/memory.c:552 + #8 0x5636ee98f0b7 in memory_region_dispatch_read1 ../softmmu/memory.c:1420 + #9 0x5636ee98f311 in memory_region_dispatch_read ../softmmu/memory.c:1449 + #10 0x5636ee8ff64a in flatview_read_continue ../softmmu/physmem.c:2822 + #11 0x5636ee8ff9e5 in flatview_read ../softmmu/physmem.c:2862 + #12 0x5636ee8ffb83 in address_space_read_full ../softmmu/physmem.c:2875 + #13 0x5636ee8ffdeb in address_space_rw ../softmmu/physmem.c:2903 + #14 0x5636eea6a924 in kvm_handle_io ../accel/kvm/kvm-all.c:2285 + #15 0x5636eea6c5e3 in kvm_cpu_exec ../accel/kvm/kvm-all.c:2531 + #16 0x5636eeca492b in
kvm_vcpu_thread_fn ../accel/kvm/kvm-cpus.c:49 + #17 0x5636ef1bc296 in qemu_thread_start ../util/qemu-thread-posix.c:521 + #18 0x7ff337c736da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) + #19 0x7ff33799ca3e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x121a3e) + +AddressSanitizer can not provide additional info. +SUMMARY: AddressSanitizer: SEGV ../block/block-backend.c:1356 in blk_inc_in_flight +Thread T3 created by T0 here: + #0 0x7ff33c580d2f in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x37d2f) + #1 0x5636ef1bc673 in qemu_thread_create ../util/qemu-thread-posix.c:558 + #2 0x5636eeca4ce7 in kvm_start_vcpu_thread ../accel/kvm/kvm-cpus.c:73 + #3 0x5636ee9aa965 in qemu_init_vcpu ../softmmu/cpus.c:622 + #4 0x5636ee82a9b4 in x86_cpu_realizefn ../target/i386/cpu.c:6731 + #5 0x5636eed002f4 in device_set_realized ../hw/core/qdev.c:886 + #6 0x5636eecc59bc in property_set_bool ../qom/object.c:2251 + #7 0x5636eecc0c28 in object_property_set ../qom/object.c:1398 + #8 0x5636eecb6fb9 in object_property_set_qobject ../qom/qom-qobject.c:28 + #9 0x5636eecc1175 in object_property_set_bool ../qom/object.c:1465 + #10 0x5636eecfc286 in qdev_realize ../hw/core/qdev.c:399 + #11 0x5636ee739b34 in x86_cpu_new ../hw/i386/x86.c:111 + #12 0x5636ee739d6d in x86_cpus_init ../hw/i386/x86.c:138 + #13 0x5636ee6f843e in pc_init1 ../hw/i386/pc_piix.c:159 + #14 0x5636ee6fab1e in pc_init_v5_2 ../hw/i386/pc_piix.c:438 + #15 0x5636ee1cb4a7 in machine_run_board_init ../hw/core/machine.c:1134 + #16 0x5636ee9c323d in qemu_init ../softmmu/vl.c:4369 + #17 0x5636edd92c71 in main ../softmmu/main.c:49 + #18 0x7ff33789cb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96) + +==14688==ABORTING + +Reproducer is attached. + +Best regards. +Gaoning Pan of Zhejiang University & Ant Security Light-Year Lab + + + +The given reproducer does not seem to work as expected to trigger this issue. 
+IIUC, issue occurs because a privileged guest user may change the selected +floppy drive via FD_REG_DOR:fdctrl_write_dor() ioport write command + + static void fdctrl_write_dor(FDCtrl *fdctrl, uint32_t value) + { + ... + /* Selected drive */ + fdctrl->cur_drv = value & FD_DOR_SELMASK; <= selected drive changes based on 'value' + ... + } + +Little tweaking of parameters under gdb reproduces the crash + +$ gdb --args ./bin/qemu-system-x86_64 -runas test -nographic -enable-kvm -m 2048 + -drive file=fdc.img,format=qcow2,if=floppy,id=myfdc /var/lib/libvirt/images/f27vm.qcow2 +... +==541702==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000034c (pc 0x55555938377f bp 0x7fff6f3fdeb0 sp 0x7fff6f3fdea0 T3) +==541702==The signal is caused by a WRITE memory access. +==541702==Hint: address points to the zero page. + #0 0x55555938377f in blk_inc_in_flight ../block/block-backend.c:1356 + #1 0x55555938325b in blk_prw ../block/block-backend.c:1328 + #2 0x555559384ec5 in blk_pread ../block/block-backend.c:1491 + #3 0x555557d7c798 in fdctrl_read_data ../hw/block/fdc.c:1919 + #4 0x555557d7207c in fdctrl_read ../hw/block/fdc.c:936 + #5 0x555558ee7c40 in portio_read ../softmmu/ioport.c:179 + #6 0x555558c9a0c1 in memory_region_read_accessor ../softmmu/memory.c:442 + #7 0x555558c9af04 in access_with_adjusted_size ../softmmu/memory.c:552 + #8 0x555558ca7159 in memory_region_dispatch_read1 ../softmmu/memory.c:1420 + #9 0x555558ca7433 in memory_region_dispatch_read ../softmmu/memory.c:1449 + #10 0x555558f6214e in flatview_read_continue ../softmmu/physmem.c:2822 + #11 0x555558f62560 in flatview_read ../softmmu/physmem.c:2862 + #12 0x555558f62700 in address_space_read_full ../softmmu/physmem.c:2875 + #13 0x555558f62977 in address_space_rw ../softmmu/physmem.c:2903 + #14 0x555558d037b9 in kvm_handle_io ../accel/kvm/kvm-all.c:2285 + #15 0x555558d05a4b in kvm_cpu_exec ../accel/kvm/kvm-all.c:2531 + #16 0x555558ee0efa in kvm_vcpu_thread_fn ../accel/kvm/kvm-cpus.c:49 + #17 0x55555977ec18 
in qemu_thread_start ../util/qemu-thread-posix.c:521 + #18 0x7ffff63323f8 in start_thread (/lib64/libpthread.so.0+0x93f8) + #19 0x7ffff625f902 in __GI___clone (/lib64/libc.so.6+0x101902) + +Proposed patch: + +$ git diff hw/block/ +diff --git a/hw/block/fdc.c b/hw/block/fdc.c +index 3636874432..13a9470d19 100644 +--- a/hw/block/fdc.c ++++ b/hw/block/fdc.c +@@ -1429,7 +1429,9 @@ static void fdctrl_write_dor(FDCtrl *fdctrl, uint32_t value) + } + } + /* Selected drive */ +- fdctrl->cur_drv = value & FD_DOR_SELMASK; ++ if (fdctrl->drives[value & FD_DOR_SELMASK].blk) { ++ fdctrl->cur_drv = value & FD_DOR_SELMASK; ++ } + + fdctrl->dor = value; + } +@@ -1894,6 +1896,10 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl) + uint32_t pos; + + cur_drv = get_cur_drv(fdctrl); ++ if (!cur_drv->blk) { ++ FLOPPY_DPRINTF("No drive connected\n"); ++ return 0; ++ } + fdctrl->dsr &= ~FD_DSR_PWRDOWN; + if (!(fdctrl->msr & FD_MSR_RQM) || !(fdctrl->msr & FD_MSR_DIO)) { + FLOPPY_DPRINTF("error: controller not ready for reading\n"); +@@ -2420,7 +2426,8 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value) + if (pos == FD_SECTOR_LEN - 1 || + fdctrl->data_pos == fdctrl->data_len) { + cur_drv = get_cur_drv(fdctrl); +- if (blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo, ++ if (cur_drv->blk == NULL ++ || blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo, + BDRV_SECTOR_SIZE, 0) < 0) { + FLOPPY_DPRINTF("error writing sector %d\n", + fd_sector(cur_drv)); + +On Friday, 22 January, 2021, 05:42:55 pm IST, 潘高宁 <email address hidden> wrote: +> This patch seems to work now. I've re-compiled and tested the QEMU, which showed the functional operation was working well. + + +CVE-2021-20196 assigned by Red Hat Inc. + + +Upstream patch: + -> https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05986.html + +Took a look at the patch today, I think it might need a change or two but it should be quick to do. 
I've asked Thomas to move this issue to gitlab so I can keep a closer eye on it. + +--js + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/338 + + diff --git a/results/classifier/105/vnc/1923648 b/results/classifier/105/vnc/1923648 new file mode 100644 index 000000000..745ec8ad0 --- /dev/null +++ b/results/classifier/105/vnc/1923648 @@ -0,0 +1,72 @@ +vnc: 0.884 +device: 0.870 +network: 0.849 +other: 0.846 +graphic: 0.808 +socket: 0.794 +instruction: 0.758 +assembly: 0.744 +mistranslation: 0.712 +KVM: 0.679 +semantic: 0.650 +boot: 0.628 + +macOS App Nap feature gradually freezes QEMU process + +macOS version: 10.15.2 +QEMU versions: 5.2.0 (from MacPorts) + 5.2.92 (v6.0.0-rc2-23-g9692c7b037) + +If the QEMU window is not visible (hidden, minimized or another application is in full screen mode), the QEMU process gradually freezes: it still runs, but the VM does not respond to external requests such as Telnet or SSH until the QEMU window is visible on the desktop. + +This behavior is due to the work of the macOS App Nap function: +https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/AppNap.html#//apple_ref/doc/uid/TP40013929-CH2-SW1 + +It doesn't matter how the process is started -- as a background job or as a foreground shell process in case QEMU has a desktop window. + +My VM does not have a display output, only a serial line, most likely if the VM was using OpenGL, or playing sound (or any other App Nap triggers), then the problem would never have been detected. 
+ +In my case only one starting way without this problem: +sudo qemu-system-x86_64 -nodefaults \ +-cpu host -accel hvf -smp 1 -m 384 \ +-device virtio-blk-pci,drive=flash0 \ +-drive file=/vios-adventerprisek9-m.vmdk.SPA.156-1.T.vmdk,if=none,format=vmdk,id=flash0 \ +-device e1000,netdev=local -netdev tap,id=local,ifname=tap0,script=no,downscript=no \ +-serial stdio -display none + +The typical way from the internet to disable App Nap doesn't work: +defaults write NSGlobalDomain NSAppSleepDisabled -bool YES + +The QEMU project is currently moving its bug tracking to another system. +For this we need to know which bugs are still valid and which could be +closed already. Thus we are setting the bug state to "Incomplete" now. + +If the bug has already been fixed in the latest upstream version of QEMU, +then please close this ticket as "Fix released". + +If it is not fixed yet and you think that this bug report here is still +valid, then you have two options: + +1) If you already have an account on gitlab.com, please open a new ticket +for this problem in our new tracker here: + + https://gitlab.com/qemu-project/qemu/-/issues + +and then close this ticket here on Launchpad (or let it expire auto- +matically after 60 days). Please mention the URL of this bug ticket on +Launchpad in the new ticket on GitLab. + +2) If you don't have an account on gitlab.com and don't intend to get +one, but still would like to keep this ticket opened, then please switch +the state back to "New" or "Confirmed" within the next 60 days (other- +wise it will get closed as "Expired"). We will then eventually migrate +the ticket automatically to the new system (but you won't be the reporter +of the bug in the new system and thus you won't get notified on changes +anymore). + +Thank you and sorry for the inconvenience. 
+ + +Moved here: +https://gitlab.com/qemu-project/qemu/-/issues/334 + diff --git a/results/classifier/105/vnc/1923861 b/results/classifier/105/vnc/1923861 new file mode 100644 index 000000000..b8717d05c --- /dev/null +++ b/results/classifier/105/vnc/1923861 @@ -0,0 +1,165 @@ +vnc: 0.945 +instruction: 0.925 +device: 0.911 +socket: 0.907 +graphic: 0.903 +assembly: 0.900 +other: 0.899 +mistranslation: 0.896 +KVM: 0.881 +semantic: 0.876 +boot: 0.852 +network: 0.844 + +Hardfault when accessing FPSCR register + +QEMU release version: v6.0.0-rc2 + +command line: +qemu-system-arm -machine mps3-an547 -nographic -kernel <my_project>.elf -semihosting -semihosting-config enable=on,target=native + +host operating system: Linux ISCNR90TMR1S 5.4.72-microsoft-standard-WSL2 #1 SMP Wed Oct 28 23:40:43 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux + +guest operating system: none (bare metal) + +Observation: +I am simulating embedded firmware for a Cortex-M55 device, using MPS3-AN547 machine. In the startup code I am accessing the FPSCR core register: + + unsigned int fpscr =__get_FPSCR(); + fpscr = fpscr & (~FPU_FPDSCR_AHP_Msk); + __set_FPSCR(fpscr); + +where the register access functions __get_FPSCR() and __set_FPSCR(fpscr) are taken from CMSIS_5 at ./CMSIS/Core/include/cmsis_gcc.h + +I observe hardfaults upon __get_FPSCR() and __set_FPSCR(fpscr). The same startup code works fine on the Arm Corstone-300 FVP. + +Does your code enable the FPU (via the CPACR and, if running in NonSecure) the NSACR? If not then a fault is exactly what you should expect. (I believe the FVP has a non-standard behaviour where it will enable the FPU by default even though real hardware does not behave that way.) 
+ + +Yes, I think I did: + + SCB->NSACR |= (3U << 10U); /* enable Non-secure access to CP10 and CP11 coprocessors */ + __DSB(); + __ISB(); + + SCB->CPACR |= ((3U << 10U*2U) | /* enable CP10 Full Access */ + (3U << 11U*2U) ); /* enable CP11 Full Access */ + __DSB(); + __ISB(); + +But I get a NOCP (no coprocessor) hard fault. + +Does the qemu mps3-an547 model contain the FPU by default or do I have to select it via the command line? +Is there an example code / test case included in the qemu database where I can lookup the usage of mps3-an547 + FPU? + +Do you have a guest binary and QEMU commandline I can use to reproduce the issue ? + + +Command line is +qemu-system-arm -machine mps3-an547 -nographic -kernel test.elf -semihosting -semihosting-config enable=on,target=native + +Binary is attached. It does + +int main(int argc, char* argv[]) +{ + SCB->NSACR |= (3U << 10U); /* enable Non-secure access to CP10 and CP11 coprocessors */ + __DSB(); + __ISB(); + + SCB->CPACR |= ((3U << 10U*2U) | /* enable CP10 Full Access */ + (3U << 11U*2U) ); /* enable CP11 Full Access */ + __DSB(); + __ISB(); + +// enable DL branch cache + #define CCR (*((volatile unsigned int *)0xE000ED14)) + #define CCR_DL (1 << 19) + CCR |= CCR_DL; + __ISB(); + + uint32_t result; + __asm volatile ("VMRS %0, fpscr" : "=r" (result) ); // <-- NOCP hardfault + printf("fpscr = 0x%08lx\r\n", result); + __asm volatile ("VMRS %0, mvfr0" : "=r" (result) ); + printf("mvfr0 = 0x%08lx\r\n", result); + __asm volatile ("VMRS %0, mvfr1" : "=r" (result) ); + printf("mvfr1 = 0x%08lx\r\n", result); + __asm volatile ("VMRS %0, mvfr2" : "=r" (result) ); + printf("mvfr2 = 0x%08lx\r\n", result); + + exit(0); +} + +Thank you for your help! + + +Thanks. This is a bug in the AN547 model -- we were accidentally turning off the FPU. I'll write a patch. 
+ +NB that with that bug fixed your code then hits an UNDEF trying to do: + 0x00000996: eef7 1a10 vmrs r1, mvfr0 + +Only A-profile CPUs have MVFR0 accessible via the vmrs instruction. For M-profile this register is memory-mapped, at 0xE000EF40. + + +The bug fix for the QEMU part of this is +https://<email address hidden>/ + + +Thanks for the fix. I applied it and +1. yes, the hard fault when reading FPSCR is gone. +2. yes, I also see the UNDEF. Note that on the Corstone-300 MPS3-AN547 FVP I can access mvfr0 via vmrs. + +I changed the vmrs to ldr. Now I can read the registers. The values differ from what the FVP tells me: +fpscr = 0x00000000 (qemu-system-arm) - 0x00040000 (Corstone FVP) +mvfr0 = 0x10110021 - 0x10110221 +mvfr1 = 0x11000011 - 0x12100211 +mvfr2 = 0x00000040 - 0x00000040 + +Using the FPU for some simple calculations + + volatile int nom_i, den_i; + nom_i = 7; + den_i = 3; + volatile float nom_f, den_f, div_f; + nom_f = (float)nom_i; + den_f = (float)den_i; + div_f = nom_f / den_f; + printf("%e / %f = %f\r\n", nom_f, den_f, div_f); + +I run into another UNDEF when executing + vcvt.f64.f32 d6, s12 + +Again, the FVP can execute the same elf. I attached it. Maybe you can have another look. + +Some of those ID register differences are expected; some I'm surprised by. The differences are: + * no MVE (expected, we don't implement it yet) + * no double-precision + * no FP16 + +So the missing double-precision is why your vcvt UNDEFs. Those last two ought to be present, but something is squashing them; I will investigate. + +The FPSCR difference is that we aren't reporting FPSCR.LTPSIZE for some reason -- that's a bug too. + + +I changed the compile options to single precision, only. Then, my small FP example works. Ok for my purposes, I don't need double. + +But I would need MVE. Are there any plans to implement MVE? 
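The ID register differences listed above can be read straight off the reported MVFR0/MVFR1 values. The following is a minimal sketch, not code from QEMU or CMSIS: the helper names (`id_field`, `has_double_precision`, `has_mve`, `has_fp16`) are made up for illustration, and the field positions are assumed to follow the Armv8.1-M MVFR0/MVFR1 layout (FPDP in MVFR0[11:8], MVE in MVFR1[11:8], FP16 in MVFR1[23:20]).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed field positions (Armv8.1-M MVFR0/MVFR1 layout). */
enum {
    MVFR0_FPSP_LSB = 4,   /* single-precision floating point */
    MVFR0_FPDP_LSB = 8,   /* double-precision floating point */
    MVFR1_MVE_LSB  = 8,   /* M-profile Vector Extension */
    MVFR1_FP16_LSB = 20   /* half-precision arithmetic */
};

/* Extract the 4-bit ID field whose lowest bit is `lsb`. */
uint32_t id_field(uint32_t reg, unsigned lsb)
{
    assert(lsb <= 28);    /* a 4-bit field must fit in 32 bits */
    return (reg >> lsb) & 0xFu;
}

/* Nonzero when the register value advertises the feature. */
int has_double_precision(uint32_t mvfr0)
{
    return id_field(mvfr0, MVFR0_FPDP_LSB) != 0;
}

int has_mve(uint32_t mvfr1)
{
    return id_field(mvfr1, MVFR1_MVE_LSB) != 0;
}

int has_fp16(uint32_t mvfr1)
{
    return id_field(mvfr1, MVFR1_FP16_LSB) != 0;
}

/* Print a one-line feature summary for a pair of register values. */
void print_fp_features(const char *label, uint32_t mvfr0, uint32_t mvfr1)
{
    printf("%s: double=%d mve=%d fp16=%d\n", label,
           has_double_precision(mvfr0), has_mve(mvfr1), has_fp16(mvfr1));
}
```

Called with the values from this report, `print_fp_features("qemu", 0x10110021, 0x11000011)` and `print_fp_features("fvp", 0x10110221, 0x12100211)` would show the three missing features on the QEMU side under these assumptions. On the target itself the inputs would come from the memory-mapped registers (MVFR0 at 0xE000EF40, as noted above) rather than from constants.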
+ +Oops -- we were giving the AN547 a Cortex-M33 rather than the -M55 it ought to have :-( + +Yes, MVE is next on my todo list; it will probably be in 6.2, or maybe 7.0 depending how long it takes to implement it all. + + +https://<email address hidden>/ should fix the "not actually an M55" bug which will then give you the double-precision and FP16 (and the right FPSCR value). + + +I tried your fix. Yes, the fpscr and mvfr0/1/2 values do match the FVP, now (except for the MVE bit which is explained above). + +Thx for the updates. + +These fixes are now in master and will be in rc4 and the eventual 6.0 release. + + +https://gitlab.com/qemu-project/qemu/-/commit/330ef14e6e749919c5c +https://gitlab.com/qemu-project/qemu/-/commit/1df0878cff267128393 + diff --git a/results/classifier/105/vnc/1988 b/results/classifier/105/vnc/1988 new file mode 100644 index 000000000..2db4d0ec0 --- /dev/null +++ b/results/classifier/105/vnc/1988 @@ -0,0 +1,39 @@ +vnc: 0.920 +graphic: 0.901 +instruction: 0.842 +network: 0.837 +device: 0.829 +mistranslation: 0.757 +semantic: 0.714 +socket: 0.585 +other: 0.541 +boot: 0.435 +assembly: 0.200 +KVM: 0.165 + +8.2.0rc0 Regression: '-display vnc' opens gtk display as well +Description of problem: +A VNC display is requested, but a GTK frontend is opened as well. A VNC client is able to connect. +Steps to reproduce: +1. /configure --enable-fdt=internal --target-list=x86_64-softmmu +2. make +3. build/qemu-system-x86_64 -display vnc=:05 -k de +Additional information: +git bisect finally shows +``` +484629fc8141eaa257f961b5e5e310a1bbd0f1a2 is the first bad commit +commit 484629fc8141eaa257f961b5e5e310a1bbd0f1a2 +Author: Marc-André Lureau <marcandre.lureau@redhat.com> +Date: Wed Oct 25 17:21:17 2023 +0400 + + vl: simplify display_remote logic + + Bump the display_remote variable when the -vnc option is parsed, just + like -spice. 
+ + Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> + Reviewed-by: Thomas Huth <thuth@redhat.com> + + system/vl.c | 6 +----- + 1 file changed, 1 insertion(+), 5 deletions(-) +``` diff --git a/results/classifier/105/vnc/2001 b/results/classifier/105/vnc/2001 new file mode 100644 index 000000000..ad3f14299 --- /dev/null +++ b/results/classifier/105/vnc/2001 @@ -0,0 +1,56 @@ +vnc: 0.929 +device: 0.924 +instruction: 0.896 +network: 0.880 +boot: 0.877 +graphic: 0.824 +assembly: 0.764 +KVM: 0.732 +socket: 0.724 +mistranslation: 0.695 +semantic: 0.496 +other: 0.483 + +qemu_img convert and drive mirror migrate the same raw disk to rbd volume, will get the different USED size in ceph cluster. +Description of problem: +qemu_img convert and drive mirror migrate the same raw disk to rbd volume, will get the different USED size in ceph cluster. +Steps to reproduce: +create raw and qcow2 disk + +1. qemu-img create -f raw lvm_volume_1.raw 12G +2. qemu-img create -f qcow2 lvm_volume_1.qcow2 12G + + install a centos OS + +3. qemu-system-x86_64 -m 4096 -drive file=lvm_volume_1.qcow2,format=qcow2,index=0 -nographic -cdrom CentOS-8.3.2011-x86_64-dvd1.iso -vnc :25 + + convert the qcow2 OS disk to q raw OS disk + +4. qemu-img convert -f qcow2 -O raw ./lvm_volume_1.qcow2 ./lvm_volume_1.raw + + create a qemu-rbd process + +5. qemu-nbd --fork -x node1 -p 1238 rbd:cephpool- test/volume_1:id=xxx:key=xxx:mon_host=xxx:auth_supported=cephx + + boot the raw OS disk + +6. qemu-system-x86_64 -hda ./lvm_volume_1.raw -m 4096 -smp 4 -vnc :25 -monitor stdio + + migrate the raw OS disk to a ceph volume + +7. drive_mirror -n -f #block125 nbd:localhost:1238:exportname=node1 raw + + check the rbd volume USED size in ceph cluster + "rbd du cephpool-test/volume_1" + the ceph rbd volume PROVISION and USED are the same size + + convert the raw OS disk to a ceph volume + +8. 
qemu-img convert -n -f raw -O raw ./lvm_volume_1.raw rbd:cephpool- +test/volume_2:id=xxx:key=xxx:mon_host=xxx:auth_supported=cephx + + check the rbd volume USED size in ceph cluster + "rbd du cephpool-test/volume_2" + the ceph rbd volume PROVISION and USED are different PROVISION > USED +Additional information: + diff --git a/results/classifier/105/vnc/2171 b/results/classifier/105/vnc/2171 new file mode 100644 index 000000000..3c50d8b96 --- /dev/null +++ b/results/classifier/105/vnc/2171 @@ -0,0 +1,38 @@ +vnc: 0.961 +device: 0.954 +graphic: 0.949 +socket: 0.854 +mistranslation: 0.843 +instruction: 0.789 +semantic: 0.698 +boot: 0.644 +other: 0.502 +network: 0.457 +KVM: 0.434 +assembly: 0.108 + +VPS Disk space over use +Description of problem: +\# qemu-img info -U v1001-dluw9EHRDbmMd8fQ-CACjC7FWnMhISeDM.qcow2 + +file format: qcow2 + +virtual size: 800G (858993459200 bytes) + +disk size: **812G** + +cluster_size: 65536 + +Format specific information: + +compat: 1.1 + +lazy refcounts: false + +refcount bits: 16 + +corrupt: false + +Disk size is using beyond the Virtual size. + +How is that even possible ? 
diff --git a/results/classifier/105/vnc/2311 b/results/classifier/105/vnc/2311 new file mode 100644 index 000000000..d14d8bda4 --- /dev/null +++ b/results/classifier/105/vnc/2311 @@ -0,0 +1,28 @@ +vnc: 0.874 +graphic: 0.874 +instruction: 0.774 +socket: 0.685 +semantic: 0.658 +network: 0.653 +device: 0.642 +other: 0.509 +boot: 0.374 +KVM: 0.248 +mistranslation: 0.231 +assembly: 0.206 + +Possible dereference of NULL +Description of problem: +There is possible dereference of NULL using macro QEMU_LOCK_GUARD(&q->lock) in: +1) /block/nvme.c line [326](https://github.com/qemu/qemu/blob/5da72194df36535d773c8bdc951529ecd5e31707/block/nvme.c#L326) +2) /include/qemu/ratelimit.h line [45](https://github.com/qemu/qemu/blob/5da72194df36535d773c8bdc951529ecd5e31707/include/qemu/ratelimit.h#L45) +3) /include/qemu/ratelimit.h line [88](https://github.com/qemu/qemu/blob/5da72194df36535d773c8bdc951529ecd5e31707/include/qemu/ratelimit.h#L88) + + +The QEMU_MAKE_LOCKABLE(x) macro provides a special case (line [71](https://github.com/qemu/qemu/blob/5da72194df36535d773c8bdc951529ecd5e31707/include/qemu/lockable.h#L71) of the lockable.h) if NULL gets into it. Then the macro will return NULL, which will get to the input of the qemu_lockable_auto_lock() function, then to the qemu_lockable_lock() function, where NULL dereference will occur (line [95](https://github.com/qemu/qemu/blob/5da72194df36535d773c8bdc951529ecd5e31707/include/qemu/lockable.h#L95)). + +It turns out that the NULL case is provided, but not handled properly. I think a NULL check should be added. + +Found by Linux Verification Center (portal.linuxtesting.ru) with SVACE. + +Author A. Burke. 
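The NULL check suggested in the report could look roughly like the sketch below. This is only an illustration under stated assumptions: `DemoLockable`, `demo_lockable_auto_lock` and `demo_lockable_auto_unlock` are hypothetical stand-ins and are not QEMU's `QemuLockable` API, and treating NULL as "nothing to lock" is one possible policy; the report does not say whether QEMU should skip NULL silently or reject it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lockable object; NOT QEMU's QemuLockable. */
typedef struct DemoLockable {
    int held;   /* 1 while locked, 0 otherwise */
} DemoLockable;

/*
 * Lock `x` and return it so a guard can remember what to unlock.
 * A NULL argument is passed through untouched instead of being
 * dereferenced.
 */
DemoLockable *demo_lockable_auto_lock(DemoLockable *x)
{
    if (x != NULL) {
        assert(!x->held);   /* this toy lock is not recursive */
        x->held = 1;
    }
    return x;
}

/* Unlock `x`; a NULL guard means nothing was locked, so do nothing. */
void demo_lockable_auto_unlock(DemoLockable *x)
{
    if (x != NULL) {
        assert(x->held);
        x->held = 0;
    }
}
```

With this shape, a guard built from a NULL lockable becomes a no-op on both the lock and unlock paths. The alternative design, asserting that the argument is non-NULL at the point where the guard is created, would turn the silent case into an immediate failure instead.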
diff --git a/results/classifier/105/vnc/2490 b/results/classifier/105/vnc/2490 new file mode 100644 index 000000000..2c9117f33 --- /dev/null +++ b/results/classifier/105/vnc/2490 @@ -0,0 +1,64 @@ +vnc: 0.824 +graphic: 0.813 +boot: 0.794 +device: 0.783 +KVM: 0.750 +socket: 0.749 +instruction: 0.745 +semantic: 0.726 +other: 0.709 +network: 0.701 +assembly: 0.684 +mistranslation: 0.670 + +Windows: virtio-vga-gl no longer works with current virglrenderer version +Description of problem: +Error occurs, when executing QEMU with virtio-vga-gl device using current virglrenderer: +First the boot screen is shown as expected. +After a short while the screen shows and keeps showing "virtio-vga-gl: Display output is not active." +Console logs: +``` +qemu: GtkGLArea console lacks DMABUF support. +qemu: GtkGLArea console lacks DMABUF support. +qemu: GtkGLArea console lacks DMABUF support. +qemu: GtkGLArea console lacks DMABUF support. +Realize gdk gl context failed: GL-Kontext kann nicht erstellt werden +Realize gdk gl context failed: GL-Kontext kann nicht erstellt werden +virtio_gpu_virgl_process_cmd: ctrl 0x103, error 0x1203 +virtio_gpu_virgl_process_cmd: ctrl 0x103, error 0x1203 +virtio_gpu_virgl_process_cmd: ctrl 0x103, error 0x1203 +``` +Steps to reproduce: +1. Prepare current Msys2/Ucrt64 environment including virglrenderer 1.0.1 by installing QEMU as described in https://www.qemu.org/download/#windows +2. `wget https://download.opensuse.org/distribution/leap/15.3/live/openSUSE-Leap-15.3-GNOME-Live-x86_64-Media.iso` +3. 
`qemu-system-x86_64.exe -m 1024 -display gtk,gl=on -device virtio-vga-gl -cdrom openSUSE-Leap-15.3-GNOME-Live-x86_64-Media.iso` +Additional information: +virglrenderer may use certain D3D features starting with virglrenderer 1.0.0, see https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1103 for details + +Given virglrenderer >= 1.0.0, QEMU activates these D3D features since https://gitlab.com/qemu-project/qemu/-/commit/c1600f84ce011a056c9c432c8ad8d77f7f8b9e6f. + +But the current QEMU implementation is broken when using these D3D features. + +git bisect finishes with: +``` +574b64aa6754ba491f51024c5a823a674d48a658 is the first bad commit +commit 574b64aa6754ba491f51024c5a823a674d48a658 +Author: Dmitry Osipenko <dmitry.osipenko@collabora.com> +Date: Mon Jan 29 10:39:21 2024 +0300 + + virtio-gpu: Correct virgl_renderer_resource_get_info() error check + + virgl_renderer_resource_get_info() returns errno and not -1 on error. + Correct the return-value check. + + Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> + Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> + Message-Id: <20240129073921.446869-1-dmitry.osipenko@collabora.com> + Cc: qemu-stable@nongnu.org + Reviewed-by: Michael S. Tsirkin <mst@redhat.com> + Signed-off-by: Michael S. 
Tsirkin <mst@redhat.com> + + contrib/vhost-user-gpu/virgl.c | 6 +++--- + hw/display/virtio-gpu-virgl.c | 2 +- + 2 files changed, 4 insertions(+), 4 deletions(-) +``` diff --git a/results/classifier/105/vnc/2492 b/results/classifier/105/vnc/2492 new file mode 100644 index 000000000..906f2d836 --- /dev/null +++ b/results/classifier/105/vnc/2492 @@ -0,0 +1,33 @@ +vnc: 0.973 +device: 0.852 +network: 0.803 +instruction: 0.797 +graphic: 0.730 +socket: 0.711 +semantic: 0.657 +mistranslation: 0.488 +other: 0.365 +boot: 0.334 +assembly: 0.167 +KVM: 0.026 + +Unable to disable gvnc dependency during build +Description of problem: +The qtest tests will pick up a copy of gvnc if it happens to be installed and there does not appear +to be any way of disabling the dependency to ensure a reproducible build. We tripped over this in +bulk builds on OpenBSD. +Steps to reproduce: +1. Install gvnc +2. Build QEMU +Additional information: +From tests/qtest/meson.build + +``` +if vnc.found() + gvnc = dependency('gvnc-1.0', method: 'pkg-config', required: false) + if gvnc.found() + qtests += {'vnc-display-test': [gvnc]} + qtests_generic += [ 'vnc-display-test' ] + endif +endif +``` diff --git a/results/classifier/105/vnc/2608 b/results/classifier/105/vnc/2608 new file mode 100644 index 000000000..7d6ecc66a --- /dev/null +++ b/results/classifier/105/vnc/2608 @@ -0,0 +1,22 @@ +vnc: 0.981 +network: 0.972 +device: 0.925 +graphic: 0.895 +instruction: 0.778 +boot: 0.707 +semantic: 0.641 +socket: 0.579 +other: 0.354 +mistranslation: 0.312 +assembly: 0.276 +KVM: 0.007 + +Black screen in Windows XP +Description of problem: +When starting the installation of Windows XP (or Windows 2003) the screen in VNC stays black while the installer is in text-mode. During the second half of the installation, where it switches to graphical GUI, the display becomes visible again. + +This problem never happened on 9.0.1 and below, so is a regression in 9.1.0 +Steps to reproduce: +1. 
Start the Windows XP installer +2. Connect to VNC +3. Screen stays black diff --git a/results/classifier/105/vnc/2646 b/results/classifier/105/vnc/2646 new file mode 100644 index 000000000..ba5acfe85 --- /dev/null +++ b/results/classifier/105/vnc/2646 @@ -0,0 +1,38 @@ +vnc: 0.923 +device: 0.919 +instruction: 0.880 +boot: 0.840 +mistranslation: 0.683 +socket: 0.680 +graphic: 0.668 +semantic: 0.594 +network: 0.536 +other: 0.494 +KVM: 0.382 +assembly: 0.321 + +osx 10.6.8 guest on x86-64 macos 10.12 host can't boot on HVF, boots on tcg +Description of problem: +for some reason HVF acceleration does not work with mac-on-mac. Haiku beta5 (x64), win10 x64, Debian netinstall 12.7.0 - all works. +Steps to reproduce: +``` +1. get 10.6.8 image from archive.org +2. bin/qemu-system-x86_64 -device isa-applesmc,osk="well_known_string" -usb -M pc-q35-2.11 -device usb-kbd -device usb-tablet -m 1536 -smp 1 -cpu Penryn,vendor=GenuineIntel,+ssse3,+sse4.1,+sse4.2 -L /opt/local/share/qemu -device ac97 -vnc :3 --no-reboot -accel hvf -boot c -bios usr/share/edk2-ovmf-x64/OVMF_CODE.fd -hda osx-10.6-xcode-compressed-efi.qcow2 -d unimp +audio: Could not create a backend for voice `ac97.pi' +audio: Could not create a backend for voice `ac97.mc' +audio: Could not create a backend for voice `ac97.pi' +audio: Could not create a backend for voice `ac97.mc' +ahci: IRQ#0 level:1 +ahci: IRQ#0 level:1 + +{many more of those} +``` +and at this point qemu quits. + +without --no-reboot it reboots + +tried both UEFI boot (using https://github.com/khronokernel/khronokernel.github.io/blob/master/Binaries/OpenCore/EFI-LEGACY.img.zip?raw=true , currently integrated into hdd image) and Clover-5160-X64.iso + +if I remove -accel hvf and replace it with accel tcg guest boots. 
+ +i tried to capture moment when it reboots on video but I can't catch anything :( diff --git a/results/classifier/105/vnc/2772 b/results/classifier/105/vnc/2772 new file mode 100644 index 000000000..c13c12ecc --- /dev/null +++ b/results/classifier/105/vnc/2772 @@ -0,0 +1,87 @@ +vnc: 0.906 +KVM: 0.897 +socket: 0.889 +graphic: 0.889 +device: 0.875 +assembly: 0.874 +other: 0.873 +semantic: 0.863 +instruction: 0.813 +boot: 0.771 +mistranslation: 0.760 +network: 0.705 + +qemu-img map command omits `offset` key in output for encrypted qcow2 files +Description of problem: +We use the `qemu-img map` command to retrieve metadata information from a qcow2 image. It functions as expected for non-encrypted qcow2 images. However, when the same command is executed on an encrypted qcow2 image, the output omits the `offset` key, which is critical for subsequent processing in our workflow. +Steps to reproduce: +1. Run qemu-img map on the encrypted incremental qcow2: +Command: + +``` + qemu-img map --object secret,id=sec0,data=trilio --output json -U --image-opts driver=qcow2,file.filename=incremental.qcow2,encrypt.format=luks,encrypt.key-secret=sec0 +``` +**Observed Output:** The command executes but does not include the offset key in the JSON output. 
+For example: +``` +[{ "start": 32191217664, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32191283200, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32193314816, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32193380352, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32195411968, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32195477504, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32197509120, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32197574656, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32199606272, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32199671808, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32201703424, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32201768960, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32203800576, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32203866112, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32205897728, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32205963264, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32207994880, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32208060416, "length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32210092032, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}, +{ "start": 32210157568, 
"length": 2031616, "depth": 1, "present": false, "zero": true, "data": false}, +{ "start": 32212189184, "length": 65536, "depth": 1, "present": true, "zero": false, "data": true}] +``` + +2. Decrypt the same encrypted incremental qcow2 image and re-run the qemu-img map command: +**Decryption command:** +``` +qemu-img convert -t writeback --object secret,id=sec0,data=trilio -O qcow2 --image-opts driver=qcow2,encrypt.key-secret=sec0,file.filename=incremental.qcow2 decrypt.qcow2 +``` +3. Run qemu-img map on the decrypted image: +**Command:** +``` +qemu-img map --output json -U decrypt.qcow2 +``` +Here, we don't need to pass the encryption key as we have already decrypted the qcow2. + +**Observed Output:** The JSON output includes the offset key as expected. Example: +``` +[{ "start": 0, "length": 106954752, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 106954752, "length": 2097152, "depth": 0, "present": true, "zero": false, "data": true, "offset": 327680}, +{ "start": 109051904, "length": 786432000, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 895483904, "length": 2097152, "depth": 0, "present": true, "zero": false, "data": true, "offset": 2490368}, +{ "start": 897581056, "length": 1866924032, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 2764505088, "length": 1638400, "depth": 0, "present": true, "zero": false, "data": true, "offset": 4653056}, +{ "start": 2766143488, "length": 402587648, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 3168731136, "length": 2162688, "depth": 0, "present": true, "zero": false, "data": true, "offset": 6291456}, +{ "start": 3170893824, "length": 140443648, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 3311337472, "length": 54394880, "depth": 0, "present": true, "zero": false, "data": true, "offset": 8519680}, +{ "start": 3365732352, "length": 2056388608, "depth": 0, "present": false, "zero": true, "data": 
false}, +{ "start": 5422120960, "length": 1114112, "depth": 0, "present": true, "zero": false, "data": true, "offset": 62980096}, +{ "start": 5423235072, "length": 4128768, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 5427363840, "length": 2162688, "depth": 0, "present": true, "zero": false, "data": true, "offset": 64094208}, +{ "start": 5429526528, "length": 469696512, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 5899223040, "length": 2162688, "depth": 0, "present": true, "zero": false, "data": true, "offset": 66256896}, +{ "start": 5901385728, "length": 90112000, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 5991497728, "length": 1638400, "depth": 0, "present": true, "zero": false, "data": true, "offset": 68485120}, +{ "start": 5993136128, "length": 2086600704, "depth": 0, "present": false, "zero": true, "data": false}, +{ "start": 8079736832, "length": 2686976, "depth": 0, "present": true, "zero": false, "data": true, "offset": 70189056}, +{ "start": 8082423808, "length": 24129830912, "depth": 0, "present": false, "zero": true, "data": false}] +``` + +The missing `offset` key in the output of the `qemu-img map` command for encrypted qcow2 images disrupts downstream processes that rely on this metadata. 
diff --git a/results/classifier/105/vnc/327 b/results/classifier/105/vnc/327 new file mode 100644 index 000000000..013185d68 --- /dev/null +++ b/results/classifier/105/vnc/327 @@ -0,0 +1,14 @@ +vnc: 0.368 +graphic: 0.320 +device: 0.281 +boot: 0.280 +KVM: 0.149 +mistranslation: 0.099 +semantic: 0.055 +other: 0.054 +instruction: 0.019 +assembly: 0.006 +socket: 0.005 +network: 0.001 + +Storage | Two decimal digits precision diff --git a/results/classifier/105/vnc/33802194 b/results/classifier/105/vnc/33802194 new file mode 100644 index 000000000..30fb4c3d1 --- /dev/null +++ b/results/classifier/105/vnc/33802194 @@ -0,0 +1,4947 @@ +vnc: 0.728 +KVM: 0.725 +instruction: 0.693 +device: 0.691 +mistranslation: 0.687 +assembly: 0.657 +semantic: 0.656 +socket: 0.655 +network: 0.644 +graphic: 0.640 +other: 0.637 +boot: 0.631 + +[BUG] cxl can not create region + +Hi list + +I want to test cxl functions in arm64, and found some problems I can't +figure out. + +My test environment: + +1. build latest bios from +https://github.com/tianocore/edk2.git +master +branch(cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2) +2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git +master branch(846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm +support patch: +https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/ +3. build Linux kernel from +https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git +preview +branch(65fc1c3d26b96002a5aa1f4012fae4dc98fd5683) +4. 
build latest ndctl tools from +https://github.com/pmem/ndctl +create_region branch(8558b394e449779e3a4f3ae90fae77ede0bca159) + +And my qemu test commands: +sudo $QEMU_BIN -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 \ + -cpu max -smp 8 -nographic -no-reboot \ + -kernel $KERNEL -bios $BIOS_BIN \ + -drive if=none,file=$ROOTFS,format=qcow2,id=hd \ + -device virtio-blk-pci,drive=hd -append 'root=/dev/vda1 +nokaslr dyndbg="module cxl* +p"' \ + -object memory-backend-ram,size=4G,id=mem0 \ + -numa node,nodeid=0,cpus=0-7,memdev=mem0 \ + -net nic -net user,hostfwd=tcp::2222-:22 -enable-kvm \ + -object +memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M +\ + -object +memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M +\ + -object +memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M +\ + -object +memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M +\ + -object +memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M +\ + -object +memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M +\ + -object +memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M +\ + -object +memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M +\ + -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ + -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ + -device cxl-upstream,bus=root_port0,id=us0 \ + -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ + -device +cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \ + -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ + -device +cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \ + -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ + -device +cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \ + -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ + 
-device +cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \ + -M +cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k + +And I have got two problems. +1. When I want to create x1 region with command: "cxl create-region -d +decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer +reference. Crash log: + +[ 534.697324] cxl_region region0: config state: 0 +[ 534.697346] cxl_region region0: probe: -6 +[ 534.697368] cxl_acpi ACPI0017:00: decoder0.0: created region0 +[ 534.699115] cxl region0: mem0:endpoint3 decoder3.0 add: +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +[ 534.699149] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +[ 534.699167] cxl region0: ACPI0016:00:port1 decoder1.0 add: +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +[ 534.699176] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +[ 534.699182] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +for mem0:decoder3.0 @ 0 +[ 534.699189] cxl region0: 0000:0d:00.0:port2 iw: 1 ig: 256 +[ 534.699193] cxl region0: 0000:0d:00.0:port2 target[0] = +0000:0e:00.0 for mem0:decoder3.0 @ 0 +[ 534.699405] Unable to handle kernel NULL pointer dereference at +virtual address 0000000000000000 +[ 534.701474] Mem abort info: +[ 534.701994] ESR = 0x0000000086000004 +[ 534.702653] EC = 0x21: IABT (current EL), IL = 32 bits +[ 534.703616] SET = 0, FnV = 0 +[ 534.704174] EA = 0, S1PTW = 0 +[ 534.704803] FSC = 0x04: level 0 translation fault +[ 534.705694] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010144a000 +[ 534.706875] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 +[ 534.709855] Internal error: Oops: 86000004 [#1] PREEMPT SMP +[ 534.710301] Modules linked in: +[ 534.710546] CPU: 7 PID: 331 Comm: cxl Not tainted +5.19.0-rc3-00064-g65fc1c3d26b9-dirty #11 +[ 534.715393] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 +[ 534.717179] pstate: 60400005 (nZCv 
daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +[ 534.719190] pc : 0x0 +[ 534.719928] lr : commit_store+0x118/0x2cc +[ 534.721007] sp : ffff80000aec3c30 +[ 534.721793] x29: ffff80000aec3c30 x28: ffff0000da62e740 x27: ffff0000c0c06b30 +[ 534.723875] x26: 0000000000000000 x25: ffff0000c0a2a400 x24: ffff0000c0a29400 +[ 534.725440] x23: 0000000000000003 x22: 0000000000000000 x21: ffff0000c0c06800 +[ 534.727312] x20: 0000000000000000 x19: ffff0000c1559800 x18: 0000000000000000 +[ 534.729138] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffd41fe838 +[ 534.731046] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 +[ 534.732402] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 +[ 534.734432] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff0000c0906e80 +[ 534.735921] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff80000aec3bf0 +[ 534.737437] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c155a000 +[ 534.738878] Call trace: +[ 534.739368] 0x0 +[ 534.739713] dev_attr_store+0x1c/0x30 +[ 534.740186] sysfs_kf_write+0x48/0x58 +[ 534.740961] kernfs_fop_write_iter+0x128/0x184 +[ 534.741872] new_sync_write+0xdc/0x158 +[ 534.742706] vfs_write+0x1ac/0x2a8 +[ 534.743440] ksys_write+0x68/0xf0 +[ 534.744328] __arm64_sys_write+0x1c/0x28 +[ 534.745180] invoke_syscall+0x44/0xf0 +[ 534.745989] el0_svc_common+0x4c/0xfc +[ 534.746661] do_el0_svc+0x60/0xa8 +[ 534.747378] el0_svc+0x2c/0x78 +[ 534.748066] el0t_64_sync_handler+0xb8/0x12c +[ 534.748919] el0t_64_sync+0x18c/0x190 +[ 534.749629] Code: bad PC value +[ 534.750169] ---[ end trace 0000000000000000 ]--- + +2. When I want to create x4 region with command: "cxl create-region -d +decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". 
I got below errors: + +cxl region: create_region: region0: failed to set target3 to mem3 +cxl region: cmd_create_region: created 0 regions + +And kernel log as below: +[ 60.536663] cxl_region region0: config state: 0 +[ 60.536675] cxl_region region0: probe: -6 +[ 60.536696] cxl_acpi ACPI0017:00: decoder0.0: created region0 +[ 60.538251] cxl region0: mem0:endpoint3 decoder3.0 add: +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +[ 60.538278] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +[ 60.538295] cxl region0: ACPI0016:00:port1 decoder1.0 add: +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +[ 60.538647] cxl region0: mem1:endpoint4 decoder4.0 add: +mem1:decoder4.0 @ 1 next: none nr_eps: 1 nr_targets: 1 +[ 60.538663] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +mem1:decoder4.0 @ 1 next: mem1 nr_eps: 2 nr_targets: 2 +[ 60.538675] cxl region0: ACPI0016:00:port1 decoder1.0 add: +mem1:decoder4.0 @ 1 next: 0000:0d:00.0 nr_eps: 2 nr_targets: 1 +[ 60.539311] cxl region0: mem2:endpoint5 decoder5.0 add: +mem2:decoder5.0 @ 2 next: none nr_eps: 1 nr_targets: 1 +[ 60.539332] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +mem2:decoder5.0 @ 2 next: mem2 nr_eps: 3 nr_targets: 3 +[ 60.539343] cxl region0: ACPI0016:00:port1 decoder1.0 add: +mem2:decoder5.0 @ 2 next: 0000:0d:00.0 nr_eps: 3 nr_targets: 1 +[ 60.539711] cxl region0: mem3:endpoint6 decoder6.0 add: +mem3:decoder6.0 @ 3 next: none nr_eps: 1 nr_targets: 1 +[ 60.539723] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +mem3:decoder6.0 @ 3 next: mem3 nr_eps: 4 nr_targets: 4 +[ 60.539735] cxl region0: ACPI0016:00:port1 decoder1.0 add: +mem3:decoder6.0 @ 3 next: 0000:0d:00.0 nr_eps: 4 nr_targets: 1 +[ 60.539742] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +[ 60.539747] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +for mem0:decoder3.0 @ 0 +[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512 +[ 60.539758] cxl region0: 
0000:0d:00.0:port2 target[0] = +0000:0e:00.0 for mem0:decoder3.0 @ 0 +[ 60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at 1 + +I have tried to write sysfs node manually, got same errors. + +Hope I can get some helps here. + +Bob + +On Fri, 5 Aug 2022 10:20:23 +0800 +Bobo WL <lmw.bobo@gmail.com> wrote: + +> +Hi list +> +> +I want to test cxl functions in arm64, and found some problems I can't +> +figure out. +Hi Bob, + +Glad to see people testing this code. + +> +> +My test environment: +> +> +1. build latest bios from +https://github.com/tianocore/edk2.git +master +> +branch(cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2) +> +2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git +> +master branch(846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm +> +support patch: +> +https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/ +> +3. build Linux kernel from +> +https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git +preview +> +branch(65fc1c3d26b96002a5aa1f4012fae4dc98fd5683) +> +4. 
build latest ndctl tools from +https://github.com/pmem/ndctl +> +create_region branch(8558b394e449779e3a4f3ae90fae77ede0bca159) +> +> +And my qemu test commands: +> +sudo $QEMU_BIN -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 \ +> +-cpu max -smp 8 -nographic -no-reboot \ +> +-kernel $KERNEL -bios $BIOS_BIN \ +> +-drive if=none,file=$ROOTFS,format=qcow2,id=hd \ +> +-device virtio-blk-pci,drive=hd -append 'root=/dev/vda1 +> +nokaslr dyndbg="module cxl* +p"' \ +> +-object memory-backend-ram,size=4G,id=mem0 \ +> +-numa node,nodeid=0,cpus=0-7,memdev=mem0 \ +> +-net nic -net user,hostfwd=tcp::2222-:22 -enable-kvm \ +> +-object +> +memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M +> +\ +> +-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ +> +-device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ +Probably not related to your problem, but there is a disconnect in QEMU / +kernel assumptionsaround the presence of an HDM decoder when a HB only +has a single root port. Spec allows it to be provided or not as an +implementation choice. +Kernel assumes it isn't provide. Qemu assumes it is. + +The temporary solution is to throw in a second root port on the HB and not +connect anything to it. 
Longer term I may special case this so that the +particular +decoder defaults to pass through settings in QEMU if there is only one root +port. + +> +-device cxl-upstream,bus=root_port0,id=us0 \ +> +-device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ +> +-device +> +cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \ +> +-device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ +> +-device +> +cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \ +> +-device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ +> +-device +> +cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \ +> +-device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ +> +-device +> +cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \ +> +-M +> +cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k +> +> +And I have got two problems. +> +1. When I want to create x1 region with command: "cxl create-region -d +> +decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer +> +reference. Crash log: +> +> +[ 534.697324] cxl_region region0: config state: 0 +> +[ 534.697346] cxl_region region0: probe: -6 +Seems odd this is up here. But maybe fine. 
+ +> +[ 534.697368] cxl_acpi ACPI0017:00: decoder0.0: created region0 +> +[ 534.699115] cxl region0: mem0:endpoint3 decoder3.0 add: +> +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +> +[ 534.699149] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +> +[ 534.699167] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +> +[ 534.699176] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +> +[ 534.699182] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +> +for mem0:decoder3.0 @ 0 +> +[ 534.699189] cxl region0: 0000:0d:00.0:port2 iw: 1 ig: 256 +> +[ 534.699193] cxl region0: 0000:0d:00.0:port2 target[0] = +> +0000:0e:00.0 for mem0:decoder3.0 @ 0 +> +[ 534.699405] Unable to handle kernel NULL pointer dereference at +> +virtual address 0000000000000000 +> +[ 534.701474] Mem abort info: +> +[ 534.701994] ESR = 0x0000000086000004 +> +[ 534.702653] EC = 0x21: IABT (current EL), IL = 32 bits +> +[ 534.703616] SET = 0, FnV = 0 +> +[ 534.704174] EA = 0, S1PTW = 0 +> +[ 534.704803] FSC = 0x04: level 0 translation fault +> +[ 534.705694] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010144a000 +> +[ 534.706875] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 +> +[ 534.709855] Internal error: Oops: 86000004 [#1] PREEMPT SMP +> +[ 534.710301] Modules linked in: +> +[ 534.710546] CPU: 7 PID: 331 Comm: cxl Not tainted +> +5.19.0-rc3-00064-g65fc1c3d26b9-dirty #11 +> +[ 534.715393] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 +> +[ 534.717179] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +> +[ 534.719190] pc : 0x0 +> +[ 534.719928] lr : commit_store+0x118/0x2cc +> +[ 534.721007] sp : ffff80000aec3c30 +> +[ 534.721793] x29: ffff80000aec3c30 x28: ffff0000da62e740 x27: +> +ffff0000c0c06b30 +> +[ 534.723875] x26: 0000000000000000 x25: ffff0000c0a2a400 x24: +> +ffff0000c0a29400 +> +[ 534.725440] x23: 
0000000000000003 x22: 0000000000000000 x21: +> +ffff0000c0c06800 +> +[ 534.727312] x20: 0000000000000000 x19: ffff0000c1559800 x18: +> +0000000000000000 +> +[ 534.729138] x17: 0000000000000000 x16: 0000000000000000 x15: +> +0000ffffd41fe838 +> +[ 534.731046] x14: 0000000000000000 x13: 0000000000000000 x12: +> +0000000000000000 +> +[ 534.732402] x11: 0000000000000000 x10: 0000000000000000 x9 : +> +0000000000000000 +> +[ 534.734432] x8 : 0000000000000000 x7 : 0000000000000000 x6 : +> +ffff0000c0906e80 +> +[ 534.735921] x5 : 0000000000000000 x4 : 0000000000000000 x3 : +> +ffff80000aec3bf0 +> +[ 534.737437] x2 : 0000000000000000 x1 : 0000000000000000 x0 : +> +ffff0000c155a000 +> +[ 534.738878] Call trace: +> +[ 534.739368] 0x0 +> +[ 534.739713] dev_attr_store+0x1c/0x30 +> +[ 534.740186] sysfs_kf_write+0x48/0x58 +> +[ 534.740961] kernfs_fop_write_iter+0x128/0x184 +> +[ 534.741872] new_sync_write+0xdc/0x158 +> +[ 534.742706] vfs_write+0x1ac/0x2a8 +> +[ 534.743440] ksys_write+0x68/0xf0 +> +[ 534.744328] __arm64_sys_write+0x1c/0x28 +> +[ 534.745180] invoke_syscall+0x44/0xf0 +> +[ 534.745989] el0_svc_common+0x4c/0xfc +> +[ 534.746661] do_el0_svc+0x60/0xa8 +> +[ 534.747378] el0_svc+0x2c/0x78 +> +[ 534.748066] el0t_64_sync_handler+0xb8/0x12c +> +[ 534.748919] el0t_64_sync+0x18c/0x190 +> +[ 534.749629] Code: bad PC value +> +[ 534.750169] ---[ end trace 0000000000000000 ]--- +> +> +2. When I want to create x4 region with command: "cxl create-region -d +> +decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". 
I got below errors: +> +> +cxl region: create_region: region0: failed to set target3 to mem3 +> +cxl region: cmd_create_region: created 0 regions +> +> +And kernel log as below: +> +[ 60.536663] cxl_region region0: config state: 0 +> +[ 60.536675] cxl_region region0: probe: -6 +> +[ 60.536696] cxl_acpi ACPI0017:00: decoder0.0: created region0 +> +[ 60.538251] cxl region0: mem0:endpoint3 decoder3.0 add: +> +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.538278] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +> +[ 60.538295] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +> +[ 60.538647] cxl region0: mem1:endpoint4 decoder4.0 add: +> +mem1:decoder4.0 @ 1 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.538663] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem1:decoder4.0 @ 1 next: mem1 nr_eps: 2 nr_targets: 2 +> +[ 60.538675] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem1:decoder4.0 @ 1 next: 0000:0d:00.0 nr_eps: 2 nr_targets: 1 +> +[ 60.539311] cxl region0: mem2:endpoint5 decoder5.0 add: +> +mem2:decoder5.0 @ 2 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.539332] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem2:decoder5.0 @ 2 next: mem2 nr_eps: 3 nr_targets: 3 +> +[ 60.539343] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem2:decoder5.0 @ 2 next: 0000:0d:00.0 nr_eps: 3 nr_targets: 1 +> +[ 60.539711] cxl region0: mem3:endpoint6 decoder6.0 add: +> +mem3:decoder6.0 @ 3 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.539723] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem3:decoder6.0 @ 3 next: mem3 nr_eps: 4 nr_targets: 4 +> +[ 60.539735] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem3:decoder6.0 @ 3 next: 0000:0d:00.0 nr_eps: 4 nr_targets: 1 +> +[ 60.539742] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +> +[ 60.539747] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +> +for 
mem0:decoder3.0 @ 0 +> +[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512 +This looks like off by 1 that should be fixed in the below mentioned +cxl/pending branch. That ig should be 256. Note the fix was +for a test case with a fat HB and no switch, but certainly looks +like this is the same issue. + +> +[ 60.539758] cxl region0: 0000:0d:00.0:port2 target[0] = +> +0000:0e:00.0 for mem0:decoder3.0 @ 0 +> +[ 60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at +> +1 +> +> +I have tried to write sysfs node manually, got same errors. +When stepping through by hand, which sysfs write triggers the crash above? + +Not sure it's related, but I've just sent out a fix to the +target register handling in QEMU. +https://lore.kernel.org/linux-cxl/20220808122051.14822-1-Jonathan.Cameron@huawei.com/T/#m47ff985412ce44559e6b04d677c302f8cd371330 +I did have one instance last week of triggering what looked to be a race +condition but +the stack trace doesn't look related to what you've hit. + +It will probably be a few days before I have time to take a look at replicating +what you have seen. + +If you have time, try using the kernel.org cxl/pending branch as there are +a few additional fixes on there since you sent this email. Optimistic to hope +this is covered by one of those, but at least it will mean we are trying to +replicate +on same branch. + +Jonathan + + +> +> +Hope I can get some helps here. +> +> +Bob + +Hi Jonathan + +Thanks for your reply! + +On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +<Jonathan.Cameron@huawei.com> wrote: +> +> +Probably not related to your problem, but there is a disconnect in QEMU / +> +kernel assumptionsaround the presence of an HDM decoder when a HB only +> +has a single root port. Spec allows it to be provided or not as an +> +implementation choice. +> +Kernel assumes it isn't provide. Qemu assumes it is. 
+> +> +The temporary solution is to throw in a second root port on the HB and not +> +connect anything to it. Longer term I may special case this so that the +> +particular +> +decoder defaults to pass through settings in QEMU if there is only one root +> +port. +> +You are right! After adding an extra HB in qemu, I can create a x1 +region successfully. +But have some errors in Nvdimm: + +[ 74.925838] Unknown online node for memory at 0x10000000000, assuming node 0 +[ 74.925846] Unknown target node for memory at 0x10000000000, assuming node 0 +[ 74.927470] nd_region region0: nmem0: is disabled, failing probe + +And x4 region still failed with same errors, using latest cxl/preview +branch don't work. +I have picked "Two CXL emulation fixes" patches in qemu, still not working. + +Bob + +On Tue, 9 Aug 2022 21:07:06 +0800 +Bobo WL <lmw.bobo@gmail.com> wrote: + +> +Hi Jonathan +> +> +Thanks for your reply! +> +> +On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +<Jonathan.Cameron@huawei.com> wrote: +> +> +> +> Probably not related to your problem, but there is a disconnect in QEMU / +> +> kernel assumptionsaround the presence of an HDM decoder when a HB only +> +> has a single root port. Spec allows it to be provided or not as an +> +> implementation choice. +> +> Kernel assumes it isn't provide. Qemu assumes it is. +> +> +> +> The temporary solution is to throw in a second root port on the HB and not +> +> connect anything to it. Longer term I may special case this so that the +> +> particular +> +> decoder defaults to pass through settings in QEMU if there is only one root +> +> port. +> +> +> +> +You are right! After adding an extra HB in qemu, I can create a x1 +> +region successfully. +> +But have some errors in Nvdimm: +> +> +[ 74.925838] Unknown online node for memory at 0x10000000000, assuming node > 0 +> +[ 74.925846] Unknown target node for memory at 0x10000000000, assuming node > 0 +> +[ 74.927470] nd_region region0: nmem0: is disabled, failing probe +Ah. 
I've seen this one, but not chased it down yet. Was on my todo list to +chase +down. Once I reach this state I can verify the HDM Decode is correct which is +what +I've been using to test (Which wasn't true until earlier this week). +I'm currently testing via devmem, more for historical reasons than because it +makes +that much sense anymore. + +> +> +And x4 region still failed with same errors, using latest cxl/preview +> +branch don't work. +> +I have picked "Two CXL emulation fixes" patches in qemu, still not working. +> +> +Bob + +On Tue, 9 Aug 2022 17:08:25 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Tue, 9 Aug 2022 21:07:06 +0800 +> +Bobo WL <lmw.bobo@gmail.com> wrote: +> +> +> Hi Jonathan +> +> +> +> Thanks for your reply! +> +> +> +> On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> <Jonathan.Cameron@huawei.com> wrote: +> +> > +> +> > Probably not related to your problem, but there is a disconnect in QEMU / +> +> > kernel assumptionsaround the presence of an HDM decoder when a HB only +> +> > has a single root port. Spec allows it to be provided or not as an +> +> > implementation choice. +> +> > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > +> +> > The temporary solution is to throw in a second root port on the HB and not +> +> > connect anything to it. Longer term I may special case this so that the +> +> > particular +> +> > decoder defaults to pass through settings in QEMU if there is only one +> +> > root port. +> +> > +> +> +> +> You are right! After adding an extra HB in qemu, I can create a x1 +> +> region successfully. +> +> But have some errors in Nvdimm: +> +> +> +> [ 74.925838] Unknown online node for memory at 0x10000000000, assuming +> +> node 0 +> +> [ 74.925846] Unknown target node for memory at 0x10000000000, assuming +> +> node 0 +> +> [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> +Ah. I've seen this one, but not chased it down yet. 
Was on my todo list to +> +chase +> +down. Once I reach this state I can verify the HDM Decode is correct which is +> +what +> +I've been using to test (Which wasn't true until earlier this week). +> +I'm currently testing via devmem, more for historical reasons than because it +> +makes +> +that much sense anymore. +*embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +I'd forgotten that was still on the todo list. I don't think it will +be particularly hard to do and will take a look in next few days. + +Very very indirectly this error is causing a driver probe fail that means that +we hit a code path that has a rather odd looking check on NDD_LABELING. +Should not have gotten near that path though - hence the problem is actually +when we call cxl_pmem_get_config_data() and it returns an error because +we haven't fully connected up the command in QEMU. + +Jonathan + + +> +> +> +> +> And x4 region still failed with same errors, using latest cxl/preview +> +> branch don't work. +> +> I have picked "Two CXL emulation fixes" patches in qemu, still not working. +> +> +> +> Bob + +On Thu, 11 Aug 2022 18:08:57 +0100 +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: + +> +On Tue, 9 Aug 2022 17:08:25 +0100 +> +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> On Tue, 9 Aug 2022 21:07:06 +0800 +> +> Bobo WL <lmw.bobo@gmail.com> wrote: +> +> +> +> > Hi Jonathan +> +> > +> +> > Thanks for your reply! +> +> > +> +> > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > <Jonathan.Cameron@huawei.com> wrote: +> +> > > +> +> > > Probably not related to your problem, but there is a disconnect in QEMU +> +> > > / +> +> > > kernel assumptionsaround the presence of an HDM decoder when a HB only +> +> > > has a single root port. Spec allows it to be provided or not as an +> +> > > implementation choice. +> +> > > Kernel assumes it isn't provide. Qemu assumes it is. 
+> +> > > +> +> > > The temporary solution is to throw in a second root port on the HB and +> +> > > not +> +> > > connect anything to it. Longer term I may special case this so that +> +> > > the particular +> +> > > decoder defaults to pass through settings in QEMU if there is only one +> +> > > root port. +> +> > > +> +> > +> +> > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > region successfully. +> +> > But have some errors in Nvdimm: +> +> > +> +> > [ 74.925838] Unknown online node for memory at 0x10000000000, assuming +> +> > node 0 +> +> > [ 74.925846] Unknown target node for memory at 0x10000000000, assuming +> +> > node 0 +> +> > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> +> +> Ah. I've seen this one, but not chased it down yet. Was on my todo list to +> +> chase +> +> down. Once I reach this state I can verify the HDM Decode is correct which +> +> is what +> +> I've been using to test (Which wasn't true until earlier this week). +> +> I'm currently testing via devmem, more for historical reasons than because +> +> it makes +> +> that much sense anymore. +> +> +*embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +I'd forgotten that was still on the todo list. I don't think it will +> +be particularly hard to do and will take a look in next few days. +> +> +Very very indirectly this error is causing a driver probe fail that means that +> +we hit a code path that has a rather odd looking check on NDD_LABELING. +> +Should not have gotten near that path though - hence the problem is actually +> +when we call cxl_pmem_get_config_data() and it returns an error because +> +we haven't fully connected up the command in QEMU. +So a least one bug in QEMU. We were not supporting variable length payloads on +mailbox +inputs (but were on outputs). That hasn't mattered until we get to LSA writes. +We just need to relax condition on the supplied length. 
+ +diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +index c352a935c4..fdda9529fe 100644 +--- a/hw/cxl/cxl-mailbox-utils.c ++++ b/hw/cxl/cxl-mailbox-utils.c +@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) + cxl_cmd = &cxl_cmd_set[set][cmd]; + h = cxl_cmd->handler; + if (h) { +- if (len == cxl_cmd->in) { ++ if (len == cxl_cmd->in || !cxl_cmd->in) { + cxl_cmd->payload = cxl_dstate->mbox_reg_state + + A_CXL_DEV_CMD_PAYLOAD; + ret = (*h)(cxl_cmd, cxl_dstate, &len); + + +This lets the nvdimm/region probe fine, but I'm getting some issues with +namespace capacity so I'll look at what is causing that next. +Unfortunately I'm not that familiar with the driver/nvdimm side of things +so it's take a while to figure out what kicks off what! + +Jonathan + +> +> +Jonathan +> +> +> +> +> +> > +> +> > And x4 region still failed with same errors, using latest cxl/preview +> +> > branch don't work. +> +> > I have picked "Two CXL emulation fixes" patches in qemu, still not +> +> > working. +> +> > +> +> > Bob +> +> + +Jonathan Cameron wrote: +> +On Thu, 11 Aug 2022 18:08:57 +0100 +> +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> On Tue, 9 Aug 2022 17:08:25 +0100 +> +> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> +> > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > +> +> > > Hi Jonathan +> +> > > +> +> > > Thanks for your reply! +> +> > > +> +> > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > +> +> > > > Probably not related to your problem, but there is a disconnect in +> +> > > > QEMU / +> +> > > > kernel assumptionsaround the presence of an HDM decoder when a HB only +> +> > > > has a single root port. Spec allows it to be provided or not as an +> +> > > > implementation choice. +> +> > > > Kernel assumes it isn't provide. Qemu assumes it is. 
+> +> > > > +> +> > > > The temporary solution is to throw in a second root port on the HB +> +> > > > and not +> +> > > > connect anything to it. Longer term I may special case this so that +> +> > > > the particular +> +> > > > decoder defaults to pass through settings in QEMU if there is only +> +> > > > one root port. +> +> > > > +> +> > > +> +> > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > region successfully. +> +> > > But have some errors in Nvdimm: +> +> > > +> +> > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > assuming node 0 +> +> > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > assuming node 0 +> +> > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > +> +> > Ah. I've seen this one, but not chased it down yet. Was on my todo list +> +> > to chase +> +> > down. Once I reach this state I can verify the HDM Decode is correct +> +> > which is what +> +> > I've been using to test (Which wasn't true until earlier this week). +> +> > I'm currently testing via devmem, more for historical reasons than +> +> > because it makes +> +> > that much sense anymore. +> +> +> +> *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> I'd forgotten that was still on the todo list. I don't think it will +> +> be particularly hard to do and will take a look in next few days. +> +> +> +> Very very indirectly this error is causing a driver probe fail that means +> +> that +> +> we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> Should not have gotten near that path though - hence the problem is actually +> +> when we call cxl_pmem_get_config_data() and it returns an error because +> +> we haven't fully connected up the command in QEMU. +> +> +So a least one bug in QEMU. We were not supporting variable length payloads +> +on mailbox +> +inputs (but were on outputs). That hasn't mattered until we get to LSA +> +writes. 
+> +We just need to relax condition on the supplied length. +> +> +diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +index c352a935c4..fdda9529fe 100644 +> +--- a/hw/cxl/cxl-mailbox-utils.c +> ++++ b/hw/cxl/cxl-mailbox-utils.c +> +@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +cxl_cmd = &cxl_cmd_set[set][cmd]; +> +h = cxl_cmd->handler; +> +if (h) { +> +- if (len == cxl_cmd->in) { +> ++ if (len == cxl_cmd->in || !cxl_cmd->in) { +> +cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +A_CXL_DEV_CMD_PAYLOAD; +> +ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> +> +This lets the nvdimm/region probe fine, but I'm getting some issues with +> +namespace capacity so I'll look at what is causing that next. +> +Unfortunately I'm not that familiar with the driver/nvdimm side of things +> +so it's take a while to figure out what kicks off what! +The whirlwind tour is that 'struct nd_region' instances that represent a +persistent memory address range are composed of one or more mappings of +'struct nvdimm' objects. The nvdimm object is driven by the dimm driver +in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking +the dimm (if locked) and interrogating the label area to look for +namespace labels. + +The label command calls are routed to the '->ndctl()' callback that was +registered when the CXL nvdimm_bus_descriptor was created. That callback +handles both 'bus' scope calls, currently none for CXL, and per nvdimm +calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands +to CXL commands. + +The 'struct nvdimm' objects that the CXL side registers have the +NDD_LABELING flag set which means that namespaces need to be explicitly +created / provisioned from region capacity. Otherwise, if +drivers/nvdimm/dimm.c does not find a namespace-label-index block then +the region reverts to label-less mode and a default namespace equal to +the size of the region is instantiated. 
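To make the labeled versus label-less distinction above concrete, here is a toy model in C. It is only a sketch of one reading of the paragraph, not kernel code: `struct region_model`, its fields, and `implicit_namespace_size()` are all hypothetical names invented for illustration, and the real logic in drivers/nvdimm/ is considerably more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the namespace fallback described above. Every name here is
 * hypothetical and exists only for illustration; none of this is kernel code.
 */
struct region_model {
    uint64_t size_bytes;     /* total region capacity */
    bool labeling;           /* stands in for the NDD_LABELING flag */
    bool label_index_found;  /* dimm driver found a namespace label index block */
};

/*
 * Size of the namespace that appears without any explicit provisioning.
 * Labeled mode: 0, because namespaces must be created from region capacity.
 * Label-less mode: one default namespace spanning the whole region.
 */
static uint64_t implicit_namespace_size(const struct region_model *r)
{
    if (r->labeling && r->label_index_found) {
        return 0;
    }
    return r->size_bytes;
}
```

Under this model, a region whose label index block is missing gets a full-size default namespace, while a labeled region starts with no namespace until one is provisioned explicitly.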
+ +If you are seeing small mismatches in namespace capacity then it may +just be the fact that by default 'ndctl create-namespace' results in an +'fsdax' mode namespace which just means that it is a block device where +1.5% of the capacity is reserved for 'struct page' metadata. You should +be able to see namespace capacity == region capacity by doing "ndctl +create-namespace -m raw", and disable DAX operation. + +Hope that helps. + +On Fri, 12 Aug 2022 09:03:02 -0700 +Dan Williams <dan.j.williams@intel.com> wrote: + +> +Jonathan Cameron wrote: +> +> On Thu, 11 Aug 2022 18:08:57 +0100 +> +> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> +> > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > +> +> > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > +> +> > > > Hi Jonathan +> +> > > > +> +> > > > Thanks for your reply! +> +> > > > +> +> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > +> +> > > > > Probably not related to your problem, but there is a disconnect in +> +> > > > > QEMU / +> +> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB +> +> > > > > only +> +> > > > > has a single root port. Spec allows it to be provided or not as an +> +> > > > > implementation choice. +> +> > > > > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > > > > +> +> > > > > The temporary solution is to throw in a second root port on the HB +> +> > > > > and not +> +> > > > > connect anything to it. Longer term I may special case this so +> +> > > > > that the particular +> +> > > > > decoder defaults to pass through settings in QEMU if there is only +> +> > > > > one root port. +> +> > > > > +> +> > > > +> +> > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > region successfully. 
+> +> > > > But have some errors in Nvdimm: +> +> > > > +> +> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > > +> +> > > +> +> > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > list to chase +> +> > > down. Once I reach this state I can verify the HDM Decode is correct +> +> > > which is what +> +> > > I've been using to test (Which wasn't true until earlier this week). +> +> > > I'm currently testing via devmem, more for historical reasons than +> +> > > because it makes +> +> > > that much sense anymore. +> +> > +> +> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > I'd forgotten that was still on the todo list. I don't think it will +> +> > be particularly hard to do and will take a look in next few days. +> +> > +> +> > Very very indirectly this error is causing a driver probe fail that means +> +> > that +> +> > we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> > Should not have gotten near that path though - hence the problem is +> +> > actually +> +> > when we call cxl_pmem_get_config_data() and it returns an error because +> +> > we haven't fully connected up the command in QEMU. +> +> +> +> So a least one bug in QEMU. We were not supporting variable length payloads +> +> on mailbox +> +> inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> writes. +> +> We just need to relax condition on the supplied length. 
+> +> +> +> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> index c352a935c4..fdda9529fe 100644 +> +> --- a/hw/cxl/cxl-mailbox-utils.c +> +> +++ b/hw/cxl/cxl-mailbox-utils.c +> +> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> h = cxl_cmd->handler; +> +> if (h) { +> +> - if (len == cxl_cmd->in) { +> +> + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +> cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +> A_CXL_DEV_CMD_PAYLOAD; +> +> ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> +> +> +> +> This lets the nvdimm/region probe fine, but I'm getting some issues with +> +> namespace capacity so I'll look at what is causing that next. +> +> Unfortunately I'm not that familiar with the driver/nvdimm side of things +> +> so it's take a while to figure out what kicks off what! +> +> +The whirlwind tour is that 'struct nd_region' instances that represent a +> +persitent memory address range are composed of one more mappings of +> +'struct nvdimm' objects. The nvdimm object is driven by the dimm driver +> +in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking +> +the dimm (if locked) and interrogating the label area to look for +> +namespace labels. +> +> +The label command calls are routed to the '->ndctl()' callback that was +> +registered when the CXL nvdimm_bus_descriptor was created. That callback +> +handles both 'bus' scope calls, currently none for CXL, and per nvdimm +> +calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands +> +to CXL commands. +> +> +The 'struct nvdimm' objects that the CXL side registers have the +> +NDD_LABELING flag set which means that namespaces need to be explicitly +> +created / provisioned from region capacity. 
Otherwise, if +> +drivers/nvdimm/dimm.c does not find a namespace-label-index block then +> +the region reverts to label-less mode and a default namespace equal to +> +the size of the region is instantiated. +> +> +If you are seeing small mismatches in namespace capacity then it may +> +just be the fact that by default 'ndctl create-namespace' results in an +> +'fsdax' mode namespace which just means that it is a block device where +> +1.5% of the capacity is reserved for 'struct page' metadata. You should +> +be able to see namespace capacity == region capacity by doing "ndctl +> +create-namespace -m raw", and disable DAX operation. +Currently ndctl create-namespace crashes qemu ;) +Which isn't ideal! + +> +> +Hope that helps. +Got me looking at the right code. Thanks! + +Jonathan + +On Fri, 12 Aug 2022 17:15:09 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Fri, 12 Aug 2022 09:03:02 -0700 +> +Dan Williams <dan.j.williams@intel.com> wrote: +> +> +> Jonathan Cameron wrote: +> +> > On Thu, 11 Aug 2022 18:08:57 +0100 +> +> > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> > +> +> > > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > > +> +> > > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > > +> +> > > > > Hi Jonathan +> +> > > > > +> +> > > > > Thanks for your reply! +> +> > > > > +> +> > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > > +> +> > > > > > Probably not related to your problem, but there is a disconnect +> +> > > > > > in QEMU / +> +> > > > > > kernel assumptionsaround the presence of an HDM decoder when a HB +> +> > > > > > only +> +> > > > > > has a single root port. Spec allows it to be provided or not as +> +> > > > > > an implementation choice. +> +> > > > > > Kernel assumes it isn't provide. Qemu assumes it is. 
+> +> > > > > > +> +> > > > > > The temporary solution is to throw in a second root port on the +> +> > > > > > HB and not +> +> > > > > > connect anything to it. Longer term I may special case this so +> +> > > > > > that the particular +> +> > > > > > decoder defaults to pass through settings in QEMU if there is +> +> > > > > > only one root port. +> +> > > > > > +> +> > > > > +> +> > > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > > region successfully. +> +> > > > > But have some errors in Nvdimm: +> +> > > > > +> +> > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > > assuming node 0 +> +> > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > > assuming node 0 +> +> > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > > > +> +> > > > +> +> > > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > > list to chase +> +> > > > down. Once I reach this state I can verify the HDM Decode is correct +> +> > > > which is what +> +> > > > I've been using to test (Which wasn't true until earlier this week). +> +> > > > I'm currently testing via devmem, more for historical reasons than +> +> > > > because it makes +> +> > > > that much sense anymore. +> +> > > +> +> > > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > > I'd forgotten that was still on the todo list. I don't think it will +> +> > > be particularly hard to do and will take a look in next few days. +> +> > > +> +> > > Very very indirectly this error is causing a driver probe fail that +> +> > > means that +> +> > > we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> > > Should not have gotten near that path though - hence the problem is +> +> > > actually +> +> > > when we call cxl_pmem_get_config_data() and it returns an error because +> +> > > we haven't fully connected up the command in QEMU. 
+> +> > +> +> > So a least one bug in QEMU. We were not supporting variable length +> +> > payloads on mailbox +> +> > inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> > writes. +> +> > We just need to relax condition on the supplied length. +> +> > +> +> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> > index c352a935c4..fdda9529fe 100644 +> +> > --- a/hw/cxl/cxl-mailbox-utils.c +> +> > +++ b/hw/cxl/cxl-mailbox-utils.c +> +> > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> > cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> > h = cxl_cmd->handler; +> +> > if (h) { +> +> > - if (len == cxl_cmd->in) { +> +> > + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +> > cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +> > A_CXL_DEV_CMD_PAYLOAD; +> +> > ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> > +> +> > +> +> > This lets the nvdimm/region probe fine, but I'm getting some issues with +> +> > namespace capacity so I'll look at what is causing that next. +> +> > Unfortunately I'm not that familiar with the driver/nvdimm side of things +> +> > so it's take a while to figure out what kicks off what! +> +> +> +> The whirlwind tour is that 'struct nd_region' instances that represent a +> +> persitent memory address range are composed of one more mappings of +> +> 'struct nvdimm' objects. The nvdimm object is driven by the dimm driver +> +> in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking +> +> the dimm (if locked) and interrogating the label area to look for +> +> namespace labels. +> +> +> +> The label command calls are routed to the '->ndctl()' callback that was +> +> registered when the CXL nvdimm_bus_descriptor was created. That callback +> +> handles both 'bus' scope calls, currently none for CXL, and per nvdimm +> +> calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands +> +> to CXL commands. 
+> +> +> +> The 'struct nvdimm' objects that the CXL side registers have the +> +> NDD_LABELING flag set which means that namespaces need to be explicitly +> +> created / provisioned from region capacity. Otherwise, if +> +> drivers/nvdimm/dimm.c does not find a namespace-label-index block then +> +> the region reverts to label-less mode and a default namespace equal to +> +> the size of the region is instantiated. +> +> +> +> If you are seeing small mismatches in namespace capacity then it may +> +> just be the fact that by default 'ndctl create-namespace' results in an +> +> 'fsdax' mode namespace which just means that it is a block device where +> +> 1.5% of the capacity is reserved for 'struct page' metadata. You should +> +> be able to see namespace capacity == region capacity by doing "ndctl +> +> create-namespace -m raw", and disable DAX operation. +> +> +Currently ndctl create-namespace crashes qemu ;) +> +Which isn't ideal! +> +Found a cause for this one. Mailbox payload may be as small as 256 bytes. +We have code in kernel sanity checking that output payload fits in the +mailbox, but nothing on the input payload. Symptom is that we write just +off the end whatever size the payload is. Note doing this shouldn't crash +qemu - so I need to fix a range check somewhere. + +I think this is because cxl_pmem_get_config_size() returns the mailbox +payload size as being the available LSA size, forgetting to remove the +size of the headers on the set_lsa side of things. +https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/pmem.c?h=next#n110 +I've hacked the max_payload to be -8 + +Now we still don't succeed in creating the namespace, but bonus is it doesn't +crash any more. + + +Jonathan + + + +> +> +> +> Hope that helps. +> +Got me looking at the right code. Thanks! 
+> +> +Jonathan +> +> + +On Mon, 15 Aug 2022 15:18:09 +0100 +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: + +> +On Fri, 12 Aug 2022 17:15:09 +0100 +> +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> On Fri, 12 Aug 2022 09:03:02 -0700 +> +> Dan Williams <dan.j.williams@intel.com> wrote: +> +> +> +> > Jonathan Cameron wrote: +> +> > > On Thu, 11 Aug 2022 18:08:57 +0100 +> +> > > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> > > +> +> > > > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > > > +> +> > > > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > > > +> +> > > > > > Hi Jonathan +> +> > > > > > +> +> > > > > > Thanks for your reply! +> +> > > > > > +> +> > > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > > > +> +> > > > > > > Probably not related to your problem, but there is a disconnect +> +> > > > > > > in QEMU / +> +> > > > > > > kernel assumptionsaround the presence of an HDM decoder when a +> +> > > > > > > HB only +> +> > > > > > > has a single root port. Spec allows it to be provided or not as +> +> > > > > > > an implementation choice. +> +> > > > > > > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > > > > > > +> +> > > > > > > The temporary solution is to throw in a second root port on the +> +> > > > > > > HB and not +> +> > > > > > > connect anything to it. Longer term I may special case this so +> +> > > > > > > that the particular +> +> > > > > > > decoder defaults to pass through settings in QEMU if there is +> +> > > > > > > only one root port. +> +> > > > > > > +> +> > > > > > +> +> > > > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > > > region successfully. 
+> +> > > > > > But have some errors in Nvdimm: +> +> > > > > > +> +> > > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > > > assuming node 0 +> +> > > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > > > assuming node 0 +> +> > > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing +> +> > > > > > probe +> +> > > > > +> +> > > > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > > > list to chase +> +> > > > > down. Once I reach this state I can verify the HDM Decode is +> +> > > > > correct which is what +> +> > > > > I've been using to test (Which wasn't true until earlier this +> +> > > > > week). +> +> > > > > I'm currently testing via devmem, more for historical reasons than +> +> > > > > because it makes +> +> > > > > that much sense anymore. +> +> > > > +> +> > > > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > > > I'd forgotten that was still on the todo list. I don't think it will +> +> > > > be particularly hard to do and will take a look in next few days. +> +> > > > +> +> > > > Very very indirectly this error is causing a driver probe fail that +> +> > > > means that +> +> > > > we hit a code path that has a rather odd looking check on +> +> > > > NDD_LABELING. +> +> > > > Should not have gotten near that path though - hence the problem is +> +> > > > actually +> +> > > > when we call cxl_pmem_get_config_data() and it returns an error +> +> > > > because +> +> > > > we haven't fully connected up the command in QEMU. +> +> > > +> +> > > So a least one bug in QEMU. We were not supporting variable length +> +> > > payloads on mailbox +> +> > > inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> > > writes. +> +> > > We just need to relax condition on the supplied length. 
+> +> > > +> +> > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> > > index c352a935c4..fdda9529fe 100644 +> +> > > --- a/hw/cxl/cxl-mailbox-utils.c +> +> > > +++ b/hw/cxl/cxl-mailbox-utils.c +> +> > > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> > > cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> > > h = cxl_cmd->handler; +> +> > > if (h) { +> +> > > - if (len == cxl_cmd->in) { +> +> > > + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +> > > cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +> > > A_CXL_DEV_CMD_PAYLOAD; +> +> > > ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> > > +> +> > > +> +> > > This lets the nvdimm/region probe fine, but I'm getting some issues with +> +> > > namespace capacity so I'll look at what is causing that next. +> +> > > Unfortunately I'm not that familiar with the driver/nvdimm side of +> +> > > things +> +> > > so it's take a while to figure out what kicks off what! +> +> > +> +> > The whirlwind tour is that 'struct nd_region' instances that represent a +> +> > persitent memory address range are composed of one more mappings of +> +> > 'struct nvdimm' objects. The nvdimm object is driven by the dimm driver +> +> > in drivers/nvdimm/dimm.c. That driver is mainly charged with unlocking +> +> > the dimm (if locked) and interrogating the label area to look for +> +> > namespace labels. +> +> > +> +> > The label command calls are routed to the '->ndctl()' callback that was +> +> > registered when the CXL nvdimm_bus_descriptor was created. That callback +> +> > handles both 'bus' scope calls, currently none for CXL, and per nvdimm +> +> > calls. cxl_pmem_nvdimm_ctl() translates those generic LIBNVDIMM commands +> +> > to CXL commands. +> +> > +> +> > The 'struct nvdimm' objects that the CXL side registers have the +> +> > NDD_LABELING flag set which means that namespaces need to be explicitly +> +> > created / provisioned from region capacity. 
Otherwise, if +> +> > drivers/nvdimm/dimm.c does not find a namespace-label-index block then +> +> > the region reverts to label-less mode and a default namespace equal to +> +> > the size of the region is instantiated. +> +> > +> +> > If you are seeing small mismatches in namespace capacity then it may +> +> > just be the fact that by default 'ndctl create-namespace' results in an +> +> > 'fsdax' mode namespace which just means that it is a block device where +> +> > 1.5% of the capacity is reserved for 'struct page' metadata. You should +> +> > be able to see namespace capacity == region capacity by doing "ndctl +> +> > create-namespace -m raw", and disable DAX operation. +> +> +> +> Currently ndctl create-namespace crashes qemu ;) +> +> Which isn't ideal! +> +> +> +> +Found a cause for this one. Mailbox payload may be as small as 256 bytes. +> +We have code in kernel sanity checking that output payload fits in the +> +mailbox, but nothing on the input payload. Symptom is that we write just +> +off the end whatever size the payload is. Note doing this shouldn't crash +> +qemu - so I need to fix a range check somewhere. +> +> +I think this is because cxl_pmem_get_config_size() returns the mailbox +> +payload size as being the available LSA size, forgetting to remove the +> +size of the headers on the set_lsa side of things. +> +https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/pmem.c?h=next#n110 +> +> +I've hacked the max_payload to be -8 +> +> +Now we still don't succeed in creating the namespace, but bonus is it doesn't +> +crash any more. +In the interests of defensive / correct handling from QEMU I took a +look into why it was crashing. Turns out that providing a NULL write callback +for +the memory device region (that the above overlarge write was spilling into) +isn't +a safe thing to do. Needs a stub. Oops. 
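The NULL write callback problem described above can be sketched in isolation. This is a minimal illustration only, not QEMU code: `sketch_region_ops`, `sketch_read`, `sketch_write_ignored` and `sketch_dispatch_write` are hypothetical stand-ins for a region ops table and its dispatcher, on the assumption that the dispatcher calls the write hook without checking it for NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory region ops table. */
struct sketch_region_ops {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t val, unsigned size);
};

/* Reads from this illustrative region always return zero. */
static uint64_t sketch_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)size;
    return 0;
}

/*
 * Stub write hook: the region is treated as read-only, so the value is
 * dropped. If opaque is non-NULL it is taken to point at an unsigned
 * counter of dropped writes (an assumption made for this sketch only).
 */
static void sketch_write_ignored(void *opaque, uint64_t addr, uint64_t val,
                                 unsigned size)
{
    (void)addr;
    (void)val;
    (void)size;
    if (opaque) {
        ++*(unsigned *)opaque;
    }
}

/* Both hooks are populated, so a stray guest write cannot hit a NULL pointer. */
static const struct sketch_region_ops sketch_ops = {
    .read = sketch_read,
    .write = sketch_write_ignored,
};

/* Dispatcher that, as assumed above, calls ->write unconditionally. */
static void sketch_dispatch_write(const struct sketch_region_ops *ops,
                                  void *opaque, uint64_t addr, uint64_t val,
                                  unsigned size)
{
    ops->write(opaque, addr, val, size);
}
```

With the stub in place, an overlarge write that spills into the region is dropped instead of dereferencing a NULL function pointer. Whether the stub should also log the access is a separate design choice.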
+ +On plus side we might never have noticed this was going wrong without the crash +*silver lining in every cloud* + +Fix to follow... + +Jonathan + + +> +> +> +Jonathan +> +> +> +> +> > +> +> > Hope that helps. +> +> Got me looking at the right code. Thanks! +> +> +> +> Jonathan +> +> +> +> +> +> + +On Mon, 15 Aug 2022 at 15:55, Jonathan Cameron via <qemu-arm@nongnu.org> wrote: +> +In the interests of defensive / correct handling from QEMU I took a +> +look into why it was crashing. Turns out that providing a NULL write +> +callback for +> +the memory device region (that the above overlarge write was spilling into) +> +isn't +> +a safe thing to do. Needs a stub. Oops. +Yeah. We've talked before about adding an assert so that that kind of +"missing function" bug is caught at device creation rather than only +if the guest tries to access the device, but we never quite got around +to it... + +-- PMM + +On Fri, 12 Aug 2022 16:44:03 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Thu, 11 Aug 2022 18:08:57 +0100 +> +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> On Tue, 9 Aug 2022 17:08:25 +0100 +> +> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> +> > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > +> +> > > Hi Jonathan +> +> > > +> +> > > Thanks for your reply! +> +> > > +> +> > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > +> +> > > > Probably not related to your problem, but there is a disconnect in +> +> > > > QEMU / +> +> > > > kernel assumptionsaround the presence of an HDM decoder when a HB only +> +> > > > has a single root port. Spec allows it to be provided or not as an +> +> > > > implementation choice. +> +> > > > Kernel assumes it isn't provide. Qemu assumes it is. 
+> +> > > > +> +> > > > The temporary solution is to throw in a second root port on the HB +> +> > > > and not +> +> > > > connect anything to it. Longer term I may special case this so that +> +> > > > the particular +> +> > > > decoder defaults to pass through settings in QEMU if there is only +> +> > > > one root port. +> +> > > > +> +> > > +> +> > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > region successfully. +> +> > > But have some errors in Nvdimm: +> +> > > +> +> > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > assuming node 0 +> +> > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > assuming node 0 +> +> > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > +> +> > +> +> > Ah. I've seen this one, but not chased it down yet. Was on my todo list +> +> > to chase +> +> > down. Once I reach this state I can verify the HDM Decode is correct +> +> > which is what +> +> > I've been using to test (Which wasn't true until earlier this week). +> +> > I'm currently testing via devmem, more for historical reasons than +> +> > because it makes +> +> > that much sense anymore. +> +> +> +> *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> I'd forgotten that was still on the todo list. I don't think it will +> +> be particularly hard to do and will take a look in next few days. +> +> +> +> Very very indirectly this error is causing a driver probe fail that means +> +> that +> +> we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> Should not have gotten near that path though - hence the problem is actually +> +> when we call cxl_pmem_get_config_data() and it returns an error because +> +> we haven't fully connected up the command in QEMU. +> +> +So a least one bug in QEMU. We were not supporting variable length payloads +> +on mailbox +> +inputs (but were on outputs). 
That hasn't mattered until we get to LSA +> +writes. +> +We just need to relax condition on the supplied length. +> +> +diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +index c352a935c4..fdda9529fe 100644 +> +--- a/hw/cxl/cxl-mailbox-utils.c +> ++++ b/hw/cxl/cxl-mailbox-utils.c +> +@@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +cxl_cmd = &cxl_cmd_set[set][cmd]; +> +h = cxl_cmd->handler; +> +if (h) { +> +- if (len == cxl_cmd->in) { +> ++ if (len == cxl_cmd->in || !cxl_cmd->in) { +Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. + +With that fixed we hit new fun paths - after some errors we get the +worrying - not totally sure but looks like a failure on an error cleanup. +I'll chase down the error source, but even then this is probably triggerable by +hardware problem or similar. Some bonus prints in here from me chasing +error paths, but it's otherwise just cxl/next + the fix I posted earlier today. + +[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) +[ 69.920108] nd_region_probe +[ 69.920623] ------------[ cut here ]------------ +[ 69.920675] refcount_t: addition on 0; use-after-free. 
+[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 +refcount_warn_saturate+0xa0/0x144 +[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi +cxl_core +[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 +[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 +[ 69.931482] Workqueue: events_unbound async_run_entry_fn +[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 +[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 +[ 69.936541] sp : ffff80000890b960 +[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: 0000000000000000 +[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: 0000000000000000 +[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: ffff0000c5254800 +[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: ffffffffffffffff +[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: 0000000000000000 +[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: 657466612d657375 +[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : ffffa54a8f63d288 +[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : 00000000fffff31e +[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : ffff5ab66e5ef000 +root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : +0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 +[ 69.957098] Call trace: +[ 69.957959] refcount_warn_saturate+0xa0/0x144 +[ 69.958773] get_ndd+0x5c/0x80 +[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 +[ 69.960253] nd_region_probe+0x100/0x290 +[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 +[ 69.962087] really_probe+0x19c/0x3f0 +[ 69.962620] __driver_probe_device+0x11c/0x190 +[ 69.963258] driver_probe_device+0x44/0xf4 +[ 69.963773] __device_attach_driver+0xa4/0x140 +[ 69.964471] bus_for_each_drv+0x84/0xe0 +[ 69.965068] __device_attach+0xb0/0x1f0 +[ 69.966101] 
device_initial_probe+0x20/0x30 +[ 69.967142] bus_probe_device+0xa4/0xb0 +[ 69.968104] device_add+0x3e8/0x910 +[ 69.969111] nd_async_device_register+0x24/0x74 +[ 69.969928] async_run_entry_fn+0x40/0x150 +[ 69.970725] process_one_work+0x1dc/0x450 +[ 69.971796] worker_thread+0x154/0x450 +[ 69.972700] kthread+0x118/0x120 +[ 69.974141] ret_from_fork+0x10/0x20 +[ 69.975141] ---[ end trace 0000000000000000 ]--- +[ 70.117887] Into nd_namespace_pmem_set_resource() + +> +cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +A_CXL_DEV_CMD_PAYLOAD; +> +ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> +> +This lets the nvdimm/region probe fine, but I'm getting some issues with +> +namespace capacity so I'll look at what is causing that next. +> +Unfortunately I'm not that familiar with the driver/nvdimm side of things +> +so it's take a while to figure out what kicks off what! +> +> +Jonathan +> +> +> +> +> Jonathan +> +> +> +> +> +> > +> +> > > +> +> > > And x4 region still failed with same errors, using latest cxl/preview +> +> > > branch don't work. +> +> > > I have picked "Two CXL emulation fixes" patches in qemu, still not +> +> > > working. +> +> > > +> +> > > Bob +> +> +> +> +> + +On Mon, 15 Aug 2022 18:04:44 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Fri, 12 Aug 2022 16:44:03 +0100 +> +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> On Thu, 11 Aug 2022 18:08:57 +0100 +> +> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> +> > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > +> +> > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > +> +> > > > Hi Jonathan +> +> > > > +> +> > > > Thanks for your reply! 
+> +> > > > +> +> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > +> +> > > > > Probably not related to your problem, but there is a disconnect in +> +> > > > > QEMU / +> +> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB +> +> > > > > only +> +> > > > > has a single root port. Spec allows it to be provided or not as an +> +> > > > > implementation choice. +> +> > > > > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > > > > +> +> > > > > The temporary solution is to throw in a second root port on the HB +> +> > > > > and not +> +> > > > > connect anything to it. Longer term I may special case this so +> +> > > > > that the particular +> +> > > > > decoder defaults to pass through settings in QEMU if there is only +> +> > > > > one root port. +> +> > > > > +> +> > > > +> +> > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > region successfully. +> +> > > > But have some errors in Nvdimm: +> +> > > > +> +> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > > +> +> > > +> +> > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > list to chase +> +> > > down. Once I reach this state I can verify the HDM Decode is correct +> +> > > which is what +> +> > > I've been using to test (Which wasn't true until earlier this week). +> +> > > I'm currently testing via devmem, more for historical reasons than +> +> > > because it makes +> +> > > that much sense anymore. +> +> > +> +> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > I'd forgotten that was still on the todo list. 
I don't think it will +> +> > be particularly hard to do and will take a look in next few days. +> +> > +> +> > Very very indirectly this error is causing a driver probe fail that means +> +> > that +> +> > we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> > Should not have gotten near that path though - hence the problem is +> +> > actually +> +> > when we call cxl_pmem_get_config_data() and it returns an error because +> +> > we haven't fully connected up the command in QEMU. +> +> +> +> So a least one bug in QEMU. We were not supporting variable length payloads +> +> on mailbox +> +> inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> writes. +> +> We just need to relax condition on the supplied length. +> +> +> +> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> index c352a935c4..fdda9529fe 100644 +> +> --- a/hw/cxl/cxl-mailbox-utils.c +> +> +++ b/hw/cxl/cxl-mailbox-utils.c +> +> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> h = cxl_cmd->handler; +> +> if (h) { +> +> - if (len == cxl_cmd->in) { +> +> + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. +Cause of the error is a failure in GET_LSA. +Reason, payload length is wrong in QEMU but was hidden previously by my wrong +fix here. Probably still a good idea to inject an error in GET_LSA and chase +down the refcount issue. 
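The difference between the two length checks can be sketched on its own. This is a minimal illustration under stated assumptions, not the QEMU implementation: `sketch_cmd`, `SKETCH_VARIABLE_LEN`, `sketch_len_ok` and `sketch_len_ok_wrong` are hypothetical names, and the field type is chosen for the sketch rather than taken from QEMU. Only the use of ~0 as the variable-length placeholder comes from the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel: the command accepts an input payload of any length. */
#define SKETCH_VARIABLE_LEN ((size_t)~0)

/* Hypothetical, trimmed-down stand-in for a mailbox command descriptor. */
struct sketch_cmd {
    const char *name;
    size_t in; /* expected input payload length, or SKETCH_VARIABLE_LEN */
};

/* Intended check: exact match, or the command is marked variable-length. */
static bool sketch_len_ok(const struct sketch_cmd *cmd, size_t len)
{
    return len == cmd->in || cmd->in == SKETCH_VARIABLE_LEN;
}

/*
 * The earlier, wrong relaxation: it treats in == 0 as "anything goes",
 * so a command that expects no input accepts any length.
 */
static bool sketch_len_ok_wrong(const struct sketch_cmd *cmd, size_t len)
{
    return len == cmd->in || !cmd->in;
}
```

Under the wrong check, a command declared with an input length of 0 accepts a payload of any size, which is how an incorrect length entry can go unnoticed. The sentinel check rejects that case while still allowing variable-length commands.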
+ + +diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +index fdda9529fe..e8565fbd6e 100644 +--- a/hw/cxl/cxl-mailbox-utils.c ++++ b/hw/cxl/cxl-mailbox-utils.c +@@ -489,7 +489,7 @@ static struct cxl_cmd cxl_cmd_set[256][256] = { + cmd_identify_memory_device, 0, 0 }, + [CCLS][GET_PARTITION_INFO] = { "CCLS_GET_PARTITION_INFO", + cmd_ccls_get_partition_info, 0, 0 }, +- [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 0, 0 }, ++ [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 }, + [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa, + ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE }, + [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST", +@@ -510,12 +510,13 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) + cxl_cmd = &cxl_cmd_set[set][cmd]; + h = cxl_cmd->handler; + if (h) { +- if (len == cxl_cmd->in || !cxl_cmd->in) { ++ if (len == cxl_cmd->in || cxl_cmd->in == ~0) { + cxl_cmd->payload = cxl_dstate->mbox_reg_state + + A_CXL_DEV_CMD_PAYLOAD; + +And woot, we get a namespace in the LSA :) + +I'll post QEMU fixes in next day or two. Kernel side now seems more or less +fine be it with suspicious refcount underflow. + +> +> +With that fixed we hit new fun paths - after some errors we get the +> +worrying - not totally sure but looks like a failure on an error cleanup. +> +I'll chase down the error source, but even then this is probably triggerable +> +by +> +hardware problem or similar. Some bonus prints in here from me chasing +> +error paths, but it's otherwise just cxl/next + the fix I posted earlier +> +today. +> +> +[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) +> +[ 69.920108] nd_region_probe +> +[ 69.920623] ------------[ cut here ]------------ +> +[ 69.920675] refcount_t: addition on 0; use-after-free. 
+> +[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 +> +refcount_warn_saturate+0xa0/0x144 +> +[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi +> +cxl_core +> +[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 +> +[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 +> +[ 69.931482] Workqueue: events_unbound async_run_entry_fn +> +[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +> +[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 +> +[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 +> +[ 69.936541] sp : ffff80000890b960 +> +[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: +> +0000000000000000 +> +[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: +> +0000000000000000 +> +[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: +> +ffff0000c5254800 +> +[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: +> +ffffffffffffffff +> +[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: +> +0000000000000000 +> +[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: +> +657466612d657375 +> +[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : +> +ffffa54a8f63d288 +> +[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : +> +00000000fffff31e +> +[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : +> +ffff5ab66e5ef000 +> +root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : +> +0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 +> +[ 69.957098] Call trace: +> +[ 69.957959] refcount_warn_saturate+0xa0/0x144 +> +[ 69.958773] get_ndd+0x5c/0x80 +> +[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 +> +[ 69.960253] nd_region_probe+0x100/0x290 +> +[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 +> +[ 69.962087] really_probe+0x19c/0x3f0 +> +[ 69.962620] __driver_probe_device+0x11c/0x190 +> +[ 69.963258] driver_probe_device+0x44/0xf4 +> +[ 69.963773] 
__device_attach_driver+0xa4/0x140 +> +[ 69.964471] bus_for_each_drv+0x84/0xe0 +> +[ 69.965068] __device_attach+0xb0/0x1f0 +> +[ 69.966101] device_initial_probe+0x20/0x30 +> +[ 69.967142] bus_probe_device+0xa4/0xb0 +> +[ 69.968104] device_add+0x3e8/0x910 +> +[ 69.969111] nd_async_device_register+0x24/0x74 +> +[ 69.969928] async_run_entry_fn+0x40/0x150 +> +[ 69.970725] process_one_work+0x1dc/0x450 +> +[ 69.971796] worker_thread+0x154/0x450 +> +[ 69.972700] kthread+0x118/0x120 +> +[ 69.974141] ret_from_fork+0x10/0x20 +> +[ 69.975141] ---[ end trace 0000000000000000 ]--- +> +[ 70.117887] Into nd_namespace_pmem_set_resource() +> +> +> cxl_cmd->payload = cxl_dstate->mbox_reg_state + +> +> A_CXL_DEV_CMD_PAYLOAD; +> +> ret = (*h)(cxl_cmd, cxl_dstate, &len); +> +> +> +> +> +> This lets the nvdimm/region probe fine, but I'm getting some issues with +> +> namespace capacity so I'll look at what is causing that next. +> +> Unfortunately I'm not that familiar with the driver/nvdimm side of things +> +> so it's take a while to figure out what kicks off what! +> +> +> +> Jonathan +> +> +> +> > +> +> > Jonathan +> +> > +> +> > +> +> > > +> +> > > > +> +> > > > And x4 region still failed with same errors, using latest cxl/preview +> +> > > > branch don't work. +> +> > > > I have picked "Two CXL emulation fixes" patches in qemu, still not +> +> > > > working. +> +> > > > +> +> > > > Bob +> +> > +> +> > +> +> +> + +Jonathan Cameron wrote: +> +On Fri, 12 Aug 2022 16:44:03 +0100 +> +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> On Thu, 11 Aug 2022 18:08:57 +0100 +> +> Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> +> > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > +> +> > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > +> +> > > > Hi Jonathan +> +> > > > +> +> > > > Thanks for your reply! 
+> +> > > > +> +> > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > +> +> > > > > Probably not related to your problem, but there is a disconnect in +> +> > > > > QEMU / +> +> > > > > kernel assumptionsaround the presence of an HDM decoder when a HB +> +> > > > > only +> +> > > > > has a single root port. Spec allows it to be provided or not as an +> +> > > > > implementation choice. +> +> > > > > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > > > > +> +> > > > > The temporary solution is to throw in a second root port on the HB +> +> > > > > and not +> +> > > > > connect anything to it. Longer term I may special case this so +> +> > > > > that the particular +> +> > > > > decoder defaults to pass through settings in QEMU if there is only +> +> > > > > one root port. +> +> > > > > +> +> > > > +> +> > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > region successfully. +> +> > > > But have some errors in Nvdimm: +> +> > > > +> +> > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > assuming node 0 +> +> > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > > +> +> > > +> +> > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > list to chase +> +> > > down. Once I reach this state I can verify the HDM Decode is correct +> +> > > which is what +> +> > > I've been using to test (Which wasn't true until earlier this week). +> +> > > I'm currently testing via devmem, more for historical reasons than +> +> > > because it makes +> +> > > that much sense anymore. +> +> > +> +> > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > I'd forgotten that was still on the todo list. 
I don't think it will +> +> > be particularly hard to do and will take a look in next few days. +> +> > +> +> > Very very indirectly this error is causing a driver probe fail that means +> +> > that +> +> > we hit a code path that has a rather odd looking check on NDD_LABELING. +> +> > Should not have gotten near that path though - hence the problem is +> +> > actually +> +> > when we call cxl_pmem_get_config_data() and it returns an error because +> +> > we haven't fully connected up the command in QEMU. +> +> +> +> So a least one bug in QEMU. We were not supporting variable length payloads +> +> on mailbox +> +> inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> writes. +> +> We just need to relax condition on the supplied length. +> +> +> +> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> index c352a935c4..fdda9529fe 100644 +> +> --- a/hw/cxl/cxl-mailbox-utils.c +> +> +++ b/hw/cxl/cxl-mailbox-utils.c +> +> @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> h = cxl_cmd->handler; +> +> if (h) { +> +> - if (len == cxl_cmd->in) { +> +> + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. +> +> +With that fixed we hit new fun paths - after some errors we get the +> +worrying - not totally sure but looks like a failure on an error cleanup. +> +I'll chase down the error source, but even then this is probably triggerable +> +by +> +hardware problem or similar. Some bonus prints in here from me chasing +> +error paths, but it's otherwise just cxl/next + the fix I posted earlier +> +today. +One of the scenarios that I cannot rule out is nvdimm_probe() racing +nd_region_probe(), but given all the work it takes to create a region I +suspect all the nvdimm_probe() work to have completed... + +It is at least one potentially wrong hypothesis that needs to be chased +down. 
+ +> +> +[ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) +> +[ 69.920108] nd_region_probe +> +[ 69.920623] ------------[ cut here ]------------ +> +[ 69.920675] refcount_t: addition on 0; use-after-free. +> +[ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 +> +refcount_warn_saturate+0xa0/0x144 +> +[ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port cxl_acpi +> +cxl_core +> +[ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ #399 +> +[ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 +> +[ 69.931482] Workqueue: events_unbound async_run_entry_fn +> +[ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +> +[ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 +> +[ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 +> +[ 69.936541] sp : ffff80000890b960 +> +[ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: +> +0000000000000000 +> +[ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: +> +0000000000000000 +> +[ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: +> +ffff0000c5254800 +> +[ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: +> +ffffffffffffffff +> +[ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: +> +0000000000000000 +> +[ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: +> +657466612d657375 +> +[ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : +> +ffffa54a8f63d288 +> +[ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : +> +00000000fffff31e +> +[ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : +> +ffff5ab66e5ef000 +> +root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : +> +0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 +> +[ 69.957098] Call trace: +> +[ 69.957959] refcount_warn_saturate+0xa0/0x144 +> +[ 69.958773] get_ndd+0x5c/0x80 +> +[ 69.959294] nd_region_register_namespaces+0xe4/0xe90 +> +[ 69.960253] 
nd_region_probe+0x100/0x290 +> +[ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 +> +[ 69.962087] really_probe+0x19c/0x3f0 +> +[ 69.962620] __driver_probe_device+0x11c/0x190 +> +[ 69.963258] driver_probe_device+0x44/0xf4 +> +[ 69.963773] __device_attach_driver+0xa4/0x140 +> +[ 69.964471] bus_for_each_drv+0x84/0xe0 +> +[ 69.965068] __device_attach+0xb0/0x1f0 +> +[ 69.966101] device_initial_probe+0x20/0x30 +> +[ 69.967142] bus_probe_device+0xa4/0xb0 +> +[ 69.968104] device_add+0x3e8/0x910 +> +[ 69.969111] nd_async_device_register+0x24/0x74 +> +[ 69.969928] async_run_entry_fn+0x40/0x150 +> +[ 69.970725] process_one_work+0x1dc/0x450 +> +[ 69.971796] worker_thread+0x154/0x450 +> +[ 69.972700] kthread+0x118/0x120 +> +[ 69.974141] ret_from_fork+0x10/0x20 +> +[ 69.975141] ---[ end trace 0000000000000000 ]--- +> +[ 70.117887] Into nd_namespace_pmem_set_resource() + +On Mon, 15 Aug 2022 15:55:15 -0700 +Dan Williams <dan.j.williams@intel.com> wrote: + +> +Jonathan Cameron wrote: +> +> On Fri, 12 Aug 2022 16:44:03 +0100 +> +> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> +> > On Thu, 11 Aug 2022 18:08:57 +0100 +> +> > Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> > +> +> > > On Tue, 9 Aug 2022 17:08:25 +0100 +> +> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> > > +> +> > > > On Tue, 9 Aug 2022 21:07:06 +0800 +> +> > > > Bobo WL <lmw.bobo@gmail.com> wrote: +> +> > > > +> +> > > > > Hi Jonathan +> +> > > > > +> +> > > > > Thanks for your reply! +> +> > > > > +> +> > > > > On Mon, Aug 8, 2022 at 8:37 PM Jonathan Cameron +> +> > > > > <Jonathan.Cameron@huawei.com> wrote: +> +> > > > > > +> +> > > > > > Probably not related to your problem, but there is a disconnect +> +> > > > > > in QEMU / +> +> > > > > > kernel assumptionsaround the presence of an HDM decoder when a HB +> +> > > > > > only +> +> > > > > > has a single root port. Spec allows it to be provided or not as +> +> > > > > > an implementation choice. 
+> +> > > > > > Kernel assumes it isn't provide. Qemu assumes it is. +> +> > > > > > +> +> > > > > > The temporary solution is to throw in a second root port on the +> +> > > > > > HB and not +> +> > > > > > connect anything to it. Longer term I may special case this so +> +> > > > > > that the particular +> +> > > > > > decoder defaults to pass through settings in QEMU if there is +> +> > > > > > only one root port. +> +> > > > > > +> +> > > > > +> +> > > > > You are right! After adding an extra HB in qemu, I can create a x1 +> +> > > > > region successfully. +> +> > > > > But have some errors in Nvdimm: +> +> > > > > +> +> > > > > [ 74.925838] Unknown online node for memory at 0x10000000000, +> +> > > > > assuming node 0 +> +> > > > > [ 74.925846] Unknown target node for memory at 0x10000000000, +> +> > > > > assuming node 0 +> +> > > > > [ 74.927470] nd_region region0: nmem0: is disabled, failing probe +> +> > > > > +> +> > > > +> +> > > > Ah. I've seen this one, but not chased it down yet. Was on my todo +> +> > > > list to chase +> +> > > > down. Once I reach this state I can verify the HDM Decode is correct +> +> > > > which is what +> +> > > > I've been using to test (Which wasn't true until earlier this week). +> +> > > > I'm currently testing via devmem, more for historical reasons than +> +> > > > because it makes +> +> > > > that much sense anymore. +> +> > > +> +> > > *embarassed cough*. We haven't fully hooked the LSA up in qemu yet. +> +> > > I'd forgotten that was still on the todo list. I don't think it will +> +> > > be particularly hard to do and will take a look in next few days. +> +> > > +> +> > > Very very indirectly this error is causing a driver probe fail that +> +> > > means that +> +> > > we hit a code path that has a rather odd looking check on NDD_LABELING. 
+> +> > > Should not have gotten near that path though - hence the problem is +> +> > > actually +> +> > > when we call cxl_pmem_get_config_data() and it returns an error because +> +> > > we haven't fully connected up the command in QEMU. +> +> > +> +> > So a least one bug in QEMU. We were not supporting variable length +> +> > payloads on mailbox +> +> > inputs (but were on outputs). That hasn't mattered until we get to LSA +> +> > writes. +> +> > We just need to relax condition on the supplied length. +> +> > +> +> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c +> +> > index c352a935c4..fdda9529fe 100644 +> +> > --- a/hw/cxl/cxl-mailbox-utils.c +> +> > +++ b/hw/cxl/cxl-mailbox-utils.c +> +> > @@ -510,7 +510,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate) +> +> > cxl_cmd = &cxl_cmd_set[set][cmd]; +> +> > h = cxl_cmd->handler; +> +> > if (h) { +> +> > - if (len == cxl_cmd->in) { +> +> > + if (len == cxl_cmd->in || !cxl_cmd->in) { +> +> Fix is wrong as we use ~0 as the placeholder for variable payload, not 0. +> +> +> +> With that fixed we hit new fun paths - after some errors we get the +> +> worrying - not totally sure but looks like a failure on an error cleanup. +> +> I'll chase down the error source, but even then this is probably +> +> triggerable by +> +> hardware problem or similar. Some bonus prints in here from me chasing +> +> error paths, but it's otherwise just cxl/next + the fix I posted earlier +> +> today. +> +> +One of the scenarios that I cannot rule out is nvdimm_probe() racing +> +nd_region_probe(), but given all the work it takes to create a region I +> +suspect all the nvdimm_probe() work to have completed... +> +> +It is at least one potentially wrong hypothesis that needs to be chased +> +down. +Maybe there should be a special award for the non-intuitive +ndctl create-namespace command (modifies existing namespace and might create +a different empty one...) 
I'm sure there is some interesting history behind +that one :) + +Upshot is I just threw a filesystem on fsdax and wrote some text files on it +to allow easy grepping. The right data ends up in the memory and a plausible +namespace description is stored in the LSA. + +So to some degree at least it's 'working' on an 8 way direct connected +set of emulated devices. + +One snag is that serial number support isn't yet upstream in QEMU. +(I have had it in my tree for a while but not posted it yet because of + QEMU feature freeze) +https://gitlab.com/jic23/qemu/-/commit/144c783ea8a5fbe169f46ea1ba92940157f42733 +That's needed for meaningful cookie generation. Otherwise you can build the +namespace once, but it won't work on next probe as the cookie is 0 and you +hit some error paths. + +Maybe sensible to add a sanity check and fail namespace creation if +cookie is 0? (Silly side question, but is there a theoretical risk of +a serial number / other data combination leading to a fletcher64() +checksum that happens to be 0 - that would give a very odd bug report!) + +So to make it work the following is needed: + +1) The kernel fix for mailbox buffer overflow. +2) QEMU fix for size of arguments for get_lsa +3) QEMU fix to allow variable size input arguments (for set_lsa) +4) Serial number patch above + command lines to QEMU to set appropriate + serial numbers. + +I'll send out the QEMU fixes shortly and post the Serial number patch, +though that almost certainly won't go in until next QEMU development +cycle starts in a few weeks. + +Next up, run through same tests on some other topologies. + +Jonathan + +> +> +> +> +> [ 69.919877] nd_bus ndbus0: START: nd_region.probe(region0) +> +> [ 69.920108] nd_region_probe +> +> [ 69.920623] ------------[ cut here ]------------ +> +> [ 69.920675] refcount_t: addition on 0; use-after-free.
+> +> [ 69.921314] WARNING: CPU: 3 PID: 710 at lib/refcount.c:25 +> +> refcount_warn_saturate+0xa0/0x144 +> +> [ 69.926949] Modules linked in: cxl_pmem cxl_mem cxl_pci cxl_port +> +> cxl_acpi cxl_core +> +> [ 69.928830] CPU: 3 PID: 710 Comm: kworker/u8:9 Not tainted 5.19.0-rc3+ +> +> #399 +> +> [ 69.930596] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 +> +> 02/06/2015 +> +> [ 69.931482] Workqueue: events_unbound async_run_entry_fn +> +> [ 69.932403] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS +> +> BTYPE=--) +> +> [ 69.934023] pc : refcount_warn_saturate+0xa0/0x144 +> +> [ 69.935161] lr : refcount_warn_saturate+0xa0/0x144 +> +> [ 69.936541] sp : ffff80000890b960 +> +> [ 69.937921] x29: ffff80000890b960 x28: 0000000000000000 x27: +> +> 0000000000000000 +> +> [ 69.940917] x26: ffffa54a90d5cb10 x25: ffffa54a90809e98 x24: +> +> 0000000000000000 +> +> [ 69.942537] x23: ffffa54a91a3d8d8 x22: ffff0000c5254800 x21: +> +> ffff0000c5254800 +> +> [ 69.944013] x20: ffff0000ce924180 x19: ffff0000c5254800 x18: +> +> ffffffffffffffff +> +> [ 69.946100] x17: ffff5ab66e5ef000 x16: ffff80000801c000 x15: +> +> 0000000000000000 +> +> [ 69.947585] x14: 0000000000000001 x13: 0a2e656572662d72 x12: +> +> 657466612d657375 +> +> [ 69.948670] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : +> +> ffffa54a8f63d288 +> +> [ 69.950679] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : +> +> 00000000fffff31e +> +> [ 69.952113] x5 : ffff0000ff61ba08 x4 : 00000000fffff31e x3 : +> +> ffff5ab66e5ef000 +> +> root@debian:/sys/bus/cxl/devices/decoder0.0/region0# [ 69.954752] x2 : +> +> 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c512e740 +> +> [ 69.957098] Call trace: +> +> [ 69.957959] refcount_warn_saturate+0xa0/0x144 +> +> [ 69.958773] get_ndd+0x5c/0x80 +> +> [ 69.959294] nd_region_register_namespaces+0xe4/0xe90 +> +> [ 69.960253] nd_region_probe+0x100/0x290 +> +> [ 69.960796] nvdimm_bus_probe+0xf4/0x1c0 +> +> [ 69.962087] really_probe+0x19c/0x3f0 +> +> [ 69.962620] 
__driver_probe_device+0x11c/0x190 +> +> [ 69.963258] driver_probe_device+0x44/0xf4 +> +> [ 69.963773] __device_attach_driver+0xa4/0x140 +> +> [ 69.964471] bus_for_each_drv+0x84/0xe0 +> +> [ 69.965068] __device_attach+0xb0/0x1f0 +> +> [ 69.966101] device_initial_probe+0x20/0x30 +> +> [ 69.967142] bus_probe_device+0xa4/0xb0 +> +> [ 69.968104] device_add+0x3e8/0x910 +> +> [ 69.969111] nd_async_device_register+0x24/0x74 +> +> [ 69.969928] async_run_entry_fn+0x40/0x150 +> +> [ 69.970725] process_one_work+0x1dc/0x450 +> +> [ 69.971796] worker_thread+0x154/0x450 +> +> [ 69.972700] kthread+0x118/0x120 +> +> [ 69.974141] ret_from_fork+0x10/0x20 +> +> [ 69.975141] ---[ end trace 0000000000000000 ]--- +> +> [ 70.117887] Into nd_namespace_pmem_set_resource() + +Bobo WL wrote: +> +Hi list +> +> +I want to test cxl functions in arm64, and found some problems I can't +> +figure out. +> +> +My test environment: +> +> +1. build latest bios from +https://github.com/tianocore/edk2.git +master +> +branch(cc2db6ebfb6d9d85ba4c7b35fba1fa37fffc0bc2) +> +2. build latest qemu-system-aarch64 from git://git.qemu.org/qemu.git +> +master branch(846dcf0ba4eff824c295f06550b8673ff3f31314). With cxl arm +> +support patch: +> +https://patchwork.kernel.org/project/cxl/cover/20220616141950.23374-1-Jonathan.Cameron@huawei.com/ +> +3. build Linux kernel from +> +https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git +preview +> +branch(65fc1c3d26b96002a5aa1f4012fae4dc98fd5683) +> +4. 
build latest ndctl tools from +https://github.com/pmem/ndctl +> +create_region branch(8558b394e449779e3a4f3ae90fae77ede0bca159) +> +> +And my qemu test commands: +> +sudo $QEMU_BIN -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 \ +> +-cpu max -smp 8 -nographic -no-reboot \ +> +-kernel $KERNEL -bios $BIOS_BIN \ +> +-drive if=none,file=$ROOTFS,format=qcow2,id=hd \ +> +-device virtio-blk-pci,drive=hd -append 'root=/dev/vda1 +> +nokaslr dyndbg="module cxl* +p"' \ +> +-object memory-backend-ram,size=4G,id=mem0 \ +> +-numa node,nodeid=0,cpus=0-7,memdev=mem0 \ +> +-net nic -net user,hostfwd=tcp::2222-:22 -enable-kvm \ +> +-object +> +memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M +> +\ +> +-object +> +memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M +> +\ +> +-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ +> +-device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ +> +-device cxl-upstream,bus=root_port0,id=us0 \ +> +-device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ +> +-device +> +cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \ +> +-device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ +> +-device +> +cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \ +> +-device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ +> +-device +> 
+cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \ +> +-device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ +> +-device +> +cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \ +> +-M +> +cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k +> +> +And I have got two problems. +> +1. When I want to create x1 region with command: "cxl create-region -d +> +decoder0.0 -w 1 -g 4096 mem0", kernel crashed with null pointer +> +reference. Crash log: +> +> +[ 534.697324] cxl_region region0: config state: 0 +> +[ 534.697346] cxl_region region0: probe: -6 +> +[ 534.697368] cxl_acpi ACPI0017:00: decoder0.0: created region0 +> +[ 534.699115] cxl region0: mem0:endpoint3 decoder3.0 add: +> +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +> +[ 534.699149] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +> +[ 534.699167] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +> +[ 534.699176] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +> +[ 534.699182] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +> +for mem0:decoder3.0 @ 0 +> +[ 534.699189] cxl region0: 0000:0d:00.0:port2 iw: 1 ig: 256 +> +[ 534.699193] cxl region0: 0000:0d:00.0:port2 target[0] = +> +0000:0e:00.0 for mem0:decoder3.0 @ 0 +> +[ 534.699405] Unable to handle kernel NULL pointer dereference at +> +virtual address 0000000000000000 +> +[ 534.701474] Mem abort info: +> +[ 534.701994] ESR = 0x0000000086000004 +> +[ 534.702653] EC = 0x21: IABT (current EL), IL = 32 bits +> +[ 534.703616] SET = 0, FnV = 0 +> +[ 534.704174] EA = 0, S1PTW = 0 +> +[ 534.704803] FSC = 0x04: level 0 translation fault +> +[ 534.705694] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010144a000 +> +[ 534.706875] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 +> +[ 534.709855] Internal error: Oops: 86000004 [#1] PREEMPT SMP +> +[ 
534.710301] Modules linked in: +> +[ 534.710546] CPU: 7 PID: 331 Comm: cxl Not tainted +> +5.19.0-rc3-00064-g65fc1c3d26b9-dirty #11 +> +[ 534.715393] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 +> +[ 534.717179] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +> +[ 534.719190] pc : 0x0 +> +[ 534.719928] lr : commit_store+0x118/0x2cc +> +[ 534.721007] sp : ffff80000aec3c30 +> +[ 534.721793] x29: ffff80000aec3c30 x28: ffff0000da62e740 x27: +> +ffff0000c0c06b30 +> +[ 534.723875] x26: 0000000000000000 x25: ffff0000c0a2a400 x24: +> +ffff0000c0a29400 +> +[ 534.725440] x23: 0000000000000003 x22: 0000000000000000 x21: +> +ffff0000c0c06800 +> +[ 534.727312] x20: 0000000000000000 x19: ffff0000c1559800 x18: +> +0000000000000000 +> +[ 534.729138] x17: 0000000000000000 x16: 0000000000000000 x15: +> +0000ffffd41fe838 +> +[ 534.731046] x14: 0000000000000000 x13: 0000000000000000 x12: +> +0000000000000000 +> +[ 534.732402] x11: 0000000000000000 x10: 0000000000000000 x9 : +> +0000000000000000 +> +[ 534.734432] x8 : 0000000000000000 x7 : 0000000000000000 x6 : +> +ffff0000c0906e80 +> +[ 534.735921] x5 : 0000000000000000 x4 : 0000000000000000 x3 : +> +ffff80000aec3bf0 +> +[ 534.737437] x2 : 0000000000000000 x1 : 0000000000000000 x0 : +> +ffff0000c155a000 +> +[ 534.738878] Call trace: +> +[ 534.739368] 0x0 +> +[ 534.739713] dev_attr_store+0x1c/0x30 +> +[ 534.740186] sysfs_kf_write+0x48/0x58 +> +[ 534.740961] kernfs_fop_write_iter+0x128/0x184 +> +[ 534.741872] new_sync_write+0xdc/0x158 +> +[ 534.742706] vfs_write+0x1ac/0x2a8 +> +[ 534.743440] ksys_write+0x68/0xf0 +> +[ 534.744328] __arm64_sys_write+0x1c/0x28 +> +[ 534.745180] invoke_syscall+0x44/0xf0 +> +[ 534.745989] el0_svc_common+0x4c/0xfc +> +[ 534.746661] do_el0_svc+0x60/0xa8 +> +[ 534.747378] el0_svc+0x2c/0x78 +> +[ 534.748066] el0t_64_sync_handler+0xb8/0x12c +> +[ 534.748919] el0t_64_sync+0x18c/0x190 +> +[ 534.749629] Code: bad PC value +> +[ 534.750169] ---[ end trace 0000000000000000 ]--- 
+What was the top kernel commit when you ran this test? What is the line +number of "commit_store+0x118"? + +> +2. When I want to create x4 region with command: "cxl create-region -d +> +decoder0.0 -w 4 -g 4096 -m mem0 mem1 mem2 mem3". I got below errors: +> +> +cxl region: create_region: region0: failed to set target3 to mem3 +> +cxl region: cmd_create_region: created 0 regions +> +> +And kernel log as below: +> +[ 60.536663] cxl_region region0: config state: 0 +> +[ 60.536675] cxl_region region0: probe: -6 +> +[ 60.536696] cxl_acpi ACPI0017:00: decoder0.0: created region0 +> +[ 60.538251] cxl region0: mem0:endpoint3 decoder3.0 add: +> +mem0:decoder3.0 @ 0 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.538278] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem0:decoder3.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1 +> +[ 60.538295] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem0:decoder3.0 @ 0 next: 0000:0d:00.0 nr_eps: 1 nr_targets: 1 +> +[ 60.538647] cxl region0: mem1:endpoint4 decoder4.0 add: +> +mem1:decoder4.0 @ 1 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.538663] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem1:decoder4.0 @ 1 next: mem1 nr_eps: 2 nr_targets: 2 +> +[ 60.538675] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem1:decoder4.0 @ 1 next: 0000:0d:00.0 nr_eps: 2 nr_targets: 1 +> +[ 60.539311] cxl region0: mem2:endpoint5 decoder5.0 add: +> +mem2:decoder5.0 @ 2 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.539332] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem2:decoder5.0 @ 2 next: mem2 nr_eps: 3 nr_targets: 3 +> +[ 60.539343] cxl region0: ACPI0016:00:port1 decoder1.0 add: +> +mem2:decoder5.0 @ 2 next: 0000:0d:00.0 nr_eps: 3 nr_targets: 1 +> +[ 60.539711] cxl region0: mem3:endpoint6 decoder6.0 add: +> +mem3:decoder6.0 @ 3 next: none nr_eps: 1 nr_targets: 1 +> +[ 60.539723] cxl region0: 0000:0d:00.0:port2 decoder2.0 add: +> +mem3:decoder6.0 @ 3 next: mem3 nr_eps: 4 nr_targets: 4 +> +[ 60.539735] cxl region0: ACPI0016:00:port1 
decoder1.0 add: +> +mem3:decoder6.0 @ 3 next: 0000:0d:00.0 nr_eps: 4 nr_targets: 1 +> +[ 60.539742] cxl region0: ACPI0016:00:port1 iw: 1 ig: 256 +> +[ 60.539747] cxl region0: ACPI0016:00:port1 target[0] = 0000:0c:00.0 +> +for mem0:decoder3.0 @ 0 +> +[ 60.539754] cxl region0: 0000:0d:00.0:port2 iw: 4 ig: 512 +> +[ 60.539758] cxl region0: 0000:0d:00.0:port2 target[0] = +> +0000:0e:00.0 for mem0:decoder3.0 @ 0 +> +[ 60.539764] cxl region0: ACPI0016:00:port1: cannot host mem1:decoder4.0 at +> +1 +> +> +I have tried to write sysfs node manually, got same errors. +> +> +Hope I can get some helps here. +What is the output of: + + cxl list -MDTu -d decoder0.0 + +...? It might be the case that mem1 cannot be mapped by decoder0.0, or +at least not in the specified order, or that validation check is broken. + +Hi Dan, + +Thanks for your reply! + +On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> wrote: +> +> +What is the output of: +> +> +cxl list -MDTu -d decoder0.0 +> +> +...? It might be the case that mem1 cannot be mapped by decoder0.0, or +> +at least not in the specified order, or that validation check is broken. 
+Command "cxl list -MDTu -d decoder0.0" output: + +[ + { + "memdevs":[ + { + "memdev":"mem2", + "pmem_size":"256.00 MiB (268.44 MB)", + "ram_size":0, + "serial":"0", + "host":"0000:11:00.0" + }, + { + "memdev":"mem1", + "pmem_size":"256.00 MiB (268.44 MB)", + "ram_size":0, + "serial":"0", + "host":"0000:10:00.0" + }, + { + "memdev":"mem0", + "pmem_size":"256.00 MiB (268.44 MB)", + "ram_size":0, + "serial":"0", + "host":"0000:0f:00.0" + }, + { + "memdev":"mem3", + "pmem_size":"256.00 MiB (268.44 MB)", + "ram_size":0, + "serial":"0", + "host":"0000:12:00.0" + } + ] + }, + { + "root decoders":[ + { + "decoder":"decoder0.0", + "resource":"0x10000000000", + "size":"4.00 GiB (4.29 GB)", + "pmem_capable":true, + "volatile_capable":true, + "accelmem_capable":true, + "nr_targets":1, + "targets":[ + { + "target":"ACPI0016:01", + "alias":"pci0000:0c", + "position":0, + "id":"0xc" + } + ] + } + ] + } +] + +Bobo WL wrote: +> +Hi Dan, +> +> +Thanks for your reply! +> +> +On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> wrote: +> +> +> +> What is the output of: +> +> +> +> cxl list -MDTu -d decoder0.0 +> +> +> +> ...? It might be the case that mem1 cannot be mapped by decoder0.0, or +> +> at least not in the specified order, or that validation check is broken. +> +> +Command "cxl list -MDTu -d decoder0.0" output: +Thanks for this, I think I know the problem, but will try some +experiments with cxl_test first. + +Did the commit_store() crash stop reproducing with latest cxl/preview +branch? + +On Tue, Aug 9, 2022 at 11:17 PM Dan Williams <dan.j.williams@intel.com> wrote: +> +> +Bobo WL wrote: +> +> Hi Dan, +> +> +> +> Thanks for your reply! +> +> +> +> On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> +> +> wrote: +> +> > +> +> > What is the output of: +> +> > +> +> > cxl list -MDTu -d decoder0.0 +> +> > +> +> > ...? 
It might be the case that mem1 cannot be mapped by decoder0.0, or +> +> > at least not in the specified order, or that validation check is broken. +> +> +> +> Command "cxl list -MDTu -d decoder0.0" output: +> +> +Thanks for this, I think I know the problem, but will try some +> +experiments with cxl_test first. +> +> +Did the commit_store() crash stop reproducing with latest cxl/preview +> +branch? +No, still hitting this bug if don't add extra HB device in qemu + +Dan Williams wrote: +> +Bobo WL wrote: +> +> Hi Dan, +> +> +> +> Thanks for your reply! +> +> +> +> On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> +> +> wrote: +> +> > +> +> > What is the output of: +> +> > +> +> > cxl list -MDTu -d decoder0.0 +> +> > +> +> > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or +> +> > at least not in the specified order, or that validation check is broken. +> +> +> +> Command "cxl list -MDTu -d decoder0.0" output: +> +> +Thanks for this, I think I know the problem, but will try some +> +experiments with cxl_test first. +Hmm, so my cxl_test experiment unfortunately passed so I'm not +reproducing the failure mode. This is the result of creating x4 region +with devices directly attached to a single host-bridge: + +# cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s $((1<<30)) +{ + "region":"region8", + "resource":"0xf1f0000000", + "size":"1024.00 MiB (1073.74 MB)", + "interleave_ways":4, + "interleave_granularity":256, + "decode_state":"commit", + "mappings":[ + { + "position":3, + "memdev":"mem11", + "decoder":"decoder21.0" + }, + { + "position":2, + "memdev":"mem9", + "decoder":"decoder19.0" + }, + { + "position":1, + "memdev":"mem10", + "decoder":"decoder20.0" + }, + { + "position":0, + "memdev":"mem12", + "decoder":"decoder22.0" + } + ] +} +cxl region: cmd_create_region: created 1 region + +> +Did the commit_store() crash stop reproducing with latest cxl/preview +> +branch? 
+I missed the answer to this question. + +All of these changes are now in Linus' tree perhaps give that a try and +post the debug log again? + +On Thu, 11 Aug 2022 17:46:55 -0700 +Dan Williams <dan.j.williams@intel.com> wrote: + +> +Dan Williams wrote: +> +> Bobo WL wrote: +> +> > Hi Dan, +> +> > +> +> > Thanks for your reply! +> +> > +> +> > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> +> +> > wrote: +> +> > > +> +> > > What is the output of: +> +> > > +> +> > > cxl list -MDTu -d decoder0.0 +> +> > > +> +> > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or +> +> > > at least not in the specified order, or that validation check is +> +> > > broken. +> +> > +> +> > Command "cxl list -MDTu -d decoder0.0" output: +> +> +> +> Thanks for this, I think I know the problem, but will try some +> +> experiments with cxl_test first. +> +> +Hmm, so my cxl_test experiment unfortunately passed so I'm not +> +reproducing the failure mode. This is the result of creating x4 region +> +with devices directly attached to a single host-bridge: +> +> +# cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s $((1<<30)) +> +{ +> +"region":"region8", +> +"resource":"0xf1f0000000", +> +"size":"1024.00 MiB (1073.74 MB)", +> +"interleave_ways":4, +> +"interleave_granularity":256, +> +"decode_state":"commit", +> +"mappings":[ +> +{ +> +"position":3, +> +"memdev":"mem11", +> +"decoder":"decoder21.0" +> +}, +> +{ +> +"position":2, +> +"memdev":"mem9", +> +"decoder":"decoder19.0" +> +}, +> +{ +> +"position":1, +> +"memdev":"mem10", +> +"decoder":"decoder20.0" +> +}, +> +{ +> +"position":0, +> +"memdev":"mem12", +> +"decoder":"decoder22.0" +> +} +> +] +> +} +> +cxl region: cmd_create_region: created 1 region +> +> +> Did the commit_store() crash stop reproducing with latest cxl/preview +> +> branch? +> +> +I missed the answer to this question. 
+> +> +All of these changes are now in Linus' tree perhaps give that a try and +> +post the debug log again? +Hi Dan, + +I've moved onto looking at this one. +1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy that +up +at some stage), 1 switch, 4 downstream switch ports each with a type 3 + +I'm not getting a crash, but can't successfully set up a region. +Upon adding the final target +It's failing in check_last_peer() as pos < distance. +Seems distance is 4 which makes me think it's using the wrong level of the +hierarchy for +some reason or that distance check is wrong. +Wasn't a good idea to just skip that step though as it goes boom - though +stack trace is not useful. + +Jonathan + +On Wed, 17 Aug 2022 17:16:19 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Thu, 11 Aug 2022 17:46:55 -0700 +> +Dan Williams <dan.j.williams@intel.com> wrote: +> +> +> Dan Williams wrote: +> +> > Bobo WL wrote: +> +> > > Hi Dan, +> +> > > +> +> > > Thanks for your reply! +> +> > > +> +> > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams <dan.j.williams@intel.com> +> +> > > wrote: +> +> > > > +> +> > > > What is the output of: +> +> > > > +> +> > > > cxl list -MDTu -d decoder0.0 +> +> > > > +> +> > > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, or +> +> > > > at least not in the specified order, or that validation check is +> +> > > > broken. +> +> > > +> +> > > Command "cxl list -MDTu -d decoder0.0" output: +> +> > +> +> > Thanks for this, I think I know the problem, but will try some +> +> > experiments with cxl_test first. +> +> +> +> Hmm, so my cxl_test experiment unfortunately passed so I'm not +> +> reproducing the failure mode. 
This is the result of creating x4 region +> +> with devices directly attached to a single host-bridge: +> +> +> +> # cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s +> +> $((1<<30)) +> +> { +> +> "region":"region8", +> +> "resource":"0xf1f0000000", +> +> "size":"1024.00 MiB (1073.74 MB)", +> +> "interleave_ways":4, +> +> "interleave_granularity":256, +> +> "decode_state":"commit", +> +> "mappings":[ +> +> { +> +> "position":3, +> +> "memdev":"mem11", +> +> "decoder":"decoder21.0" +> +> }, +> +> { +> +> "position":2, +> +> "memdev":"mem9", +> +> "decoder":"decoder19.0" +> +> }, +> +> { +> +> "position":1, +> +> "memdev":"mem10", +> +> "decoder":"decoder20.0" +> +> }, +> +> { +> +> "position":0, +> +> "memdev":"mem12", +> +> "decoder":"decoder22.0" +> +> } +> +> ] +> +> } +> +> cxl region: cmd_create_region: created 1 region +> +> +> +> > Did the commit_store() crash stop reproducing with latest cxl/preview +> +> > branch? +> +> +> +> I missed the answer to this question. +> +> +> +> All of these changes are now in Linus' tree perhaps give that a try and +> +> post the debug log again? +> +> +Hi Dan, +> +> +I've moved onto looking at this one. +> +1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy +> +that up +> +at some stage), 1 switch, 4 downstream switch ports each with a type 3 +> +> +I'm not getting a crash, but can't successfully setup a region. +> +Upon adding the final target +> +It's failing in check_last_peer() as pos < distance. +> +Seems distance is 4 which makes me think it's using the wrong level of the +> +heirarchy for +> +some reason or that distance check is wrong. +> +Wasn't a good idea to just skip that step though as it goes boom - though +> +stack trace is not useful. +Turns out really weird corruption happens if you accidentally back two type3 +devices +with the same memory device. 
Who would have thought it :) + +That aside ignoring the check_last_peer() failure seems to make everything work +for this +topology. I'm not seeing the crash, so my guess is we fixed it somewhere along +the way. + +Now for the fun one. I've replicated the crash if we have + +1HB 1*RP 1SW, 4SW-DSP, 4Type3 + +Now, I'd expect to see it not 'work' because the QEMU HDM decoder won't be +programmed +but the null pointer dereference isn't related to that. + +The bug is straight forward. Not all decoders have commit callbacks... Will +send out +a possible fix shortly. + +Jonathan + + + +> +> +Jonathan +> +> +> +> +> +> + +On Thu, 18 Aug 2022 17:37:40 +0100 +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: + +> +On Wed, 17 Aug 2022 17:16:19 +0100 +> +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> On Thu, 11 Aug 2022 17:46:55 -0700 +> +> Dan Williams <dan.j.williams@intel.com> wrote: +> +> +> +> > Dan Williams wrote: +> +> > > Bobo WL wrote: +> +> > > > Hi Dan, +> +> > > > +> +> > > > Thanks for your reply! +> +> > > > +> +> > > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams +> +> > > > <dan.j.williams@intel.com> wrote: +> +> > > > > +> +> > > > > What is the output of: +> +> > > > > +> +> > > > > cxl list -MDTu -d decoder0.0 +> +> > > > > +> +> > > > > ...? It might be the case that mem1 cannot be mapped by decoder0.0, +> +> > > > > or +> +> > > > > at least not in the specified order, or that validation check is +> +> > > > > broken. +> +> > > > +> +> > > > Command "cxl list -MDTu -d decoder0.0" output: +> +> > > +> +> > > Thanks for this, I think I know the problem, but will try some +> +> > > experiments with cxl_test first. +> +> > +> +> > Hmm, so my cxl_test experiment unfortunately passed so I'm not +> +> > reproducing the failure mode. 
This is the result of creating x4 region +> +> > with devices directly attached to a single host-bridge: +> +> > +> +> > # cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s +> +> > $((1<<30)) +> +> > { +> +> > "region":"region8", +> +> > "resource":"0xf1f0000000", +> +> > "size":"1024.00 MiB (1073.74 MB)", +> +> > "interleave_ways":4, +> +> > "interleave_granularity":256, +> +> > "decode_state":"commit", +> +> > "mappings":[ +> +> > { +> +> > "position":3, +> +> > "memdev":"mem11", +> +> > "decoder":"decoder21.0" +> +> > }, +> +> > { +> +> > "position":2, +> +> > "memdev":"mem9", +> +> > "decoder":"decoder19.0" +> +> > }, +> +> > { +> +> > "position":1, +> +> > "memdev":"mem10", +> +> > "decoder":"decoder20.0" +> +> > }, +> +> > { +> +> > "position":0, +> +> > "memdev":"mem12", +> +> > "decoder":"decoder22.0" +> +> > } +> +> > ] +> +> > } +> +> > cxl region: cmd_create_region: created 1 region +> +> > +> +> > > Did the commit_store() crash stop reproducing with latest cxl/preview +> +> > > branch? +> +> > +> +> > I missed the answer to this question. +> +> > +> +> > All of these changes are now in Linus' tree perhaps give that a try and +> +> > post the debug log again? +> +> +> +> Hi Dan, +> +> +> +> I've moved onto looking at this one. +> +> 1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy +> +> that up +> +> at some stage), 1 switch, 4 downstream switch ports each with a type 3 +> +> +> +> I'm not getting a crash, but can't successfully setup a region. +> +> Upon adding the final target +> +> It's failing in check_last_peer() as pos < distance. +> +> Seems distance is 4 which makes me think it's using the wrong level of the +> +> heirarchy for +> +> some reason or that distance check is wrong. +> +> Wasn't a good idea to just skip that step though as it goes boom - though +> +> stack trace is not useful. 
+> +> +Turns out really weird corruption happens if you accidentally back two type3 +> +devices +> +with the same memory device. Who would have thought it :) +> +> +That aside ignoring the check_last_peer() failure seems to make everything +> +work for this +> +topology. I'm not seeing the crash, so my guess is we fixed it somewhere +> +along the way. +> +> +Now for the fun one. I've replicated the crash if we have +> +> +1HB 1*RP 1SW, 4SW-DSP, 4Type3 +> +> +Now, I'd expect to see it not 'work' because the QEMU HDM decoder won't be +> +programmed +> +but the null pointer dereference isn't related to that. +> +> +The bug is straight forward. Not all decoders have commit callbacks... Will +> +send out +> +a possible fix shortly. +> +For completeness I'm carrying this hack because I haven't gotten my head +around the right fix for check_last_peer() failing on this test topology. + +diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c +index c49d9a5f1091..275e143bd748 100644 +--- a/drivers/cxl/core/region.c ++++ b/drivers/cxl/core/region.c +@@ -978,7 +978,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, + rc = check_last_peer(cxled, ep, cxl_rr, + distance); + if (rc) +- return rc; ++ // return rc; + goto out_target_set; + } + goto add_target; +-- + +I might find more bugs with more testing, but these are all the ones I've +seen so far + in Bobo's reports. QEMU fixes are now in upstream so +will be there in the release. + +As a reminder, testing on QEMU has a few corners... + +Need a patch to add serial number ECAP support. It is on list for review, +but will have to wait for after QEMU 7.1 release (which may be next week) + +QEMU still assumes HDM decoder on the host bridge will be programmed. +So if you want anything to work there should be at least +2 RP below the HB (no need to plug anything in to one of them). 
+ +I don't want to add a command-line parameter to hide the decoder in QEMU +and detecting there is only one RP would require moving a bunch of static +stuff into runtime code (I think). + +I still think we should make the kernel check to see if there is a decoder, +but if not I might see how bad a hack it is to have QEMU ignore that decoder +if not committed in this one special case (HB HDM decoder with only one place +it can send stuff). Obviously that would be a break from specification +so less than ideal! + +Thanks, + +Jonathan + +On Fri, 19 Aug 2022 09:46:55 +0100 +Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: + +> +On Thu, 18 Aug 2022 17:37:40 +0100 +> +Jonathan Cameron via <qemu-devel@nongnu.org> wrote: +> +> +> On Wed, 17 Aug 2022 17:16:19 +0100 +> +> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: +> +> +> +> > On Thu, 11 Aug 2022 17:46:55 -0700 +> +> > Dan Williams <dan.j.williams@intel.com> wrote: +> +> > +> +> > > Dan Williams wrote: +> +> > > > Bobo WL wrote: +> +> > > > > Hi Dan, +> +> > > > > +> +> > > > > Thanks for your reply! +> +> > > > > +> +> > > > > On Mon, Aug 8, 2022 at 11:58 PM Dan Williams +> +> > > > > <dan.j.williams@intel.com> wrote: +> +> > > > > > +> +> > > > > > What is the output of: +> +> > > > > > +> +> > > > > > cxl list -MDTu -d decoder0.0 +> +> > > > > > +> +> > > > > > ...? It might be the case that mem1 cannot be mapped by +> +> > > > > > decoder0.0, or +> +> > > > > > at least not in the specified order, or that validation check is +> +> > > > > > broken. +> +> > > > > +> +> > > > > Command "cxl list -MDTu -d decoder0.0" output: +> +> > > > +> +> > > > Thanks for this, I think I know the problem, but will try some +> +> > > > experiments with cxl_test first. +> +> > > +> +> > > Hmm, so my cxl_test experiment unfortunately passed so I'm not +> +> > > reproducing the failure mode. 
This is the result of creating x4 region +> +> > > with devices directly attached to a single host-bridge: +> +> > > +> +> > > # cxl create-region -d decoder3.5 -w 4 -m -g 256 mem{12,10,9,11} -s +> +> > > $((1<<30)) +> +> > > { +> +> > > "region":"region8", +> +> > > "resource":"0xf1f0000000", +> +> > > "size":"1024.00 MiB (1073.74 MB)", +> +> > > "interleave_ways":4, +> +> > > "interleave_granularity":256, +> +> > > "decode_state":"commit", +> +> > > "mappings":[ +> +> > > { +> +> > > "position":3, +> +> > > "memdev":"mem11", +> +> > > "decoder":"decoder21.0" +> +> > > }, +> +> > > { +> +> > > "position":2, +> +> > > "memdev":"mem9", +> +> > > "decoder":"decoder19.0" +> +> > > }, +> +> > > { +> +> > > "position":1, +> +> > > "memdev":"mem10", +> +> > > "decoder":"decoder20.0" +> +> > > }, +> +> > > { +> +> > > "position":0, +> +> > > "memdev":"mem12", +> +> > > "decoder":"decoder22.0" +> +> > > } +> +> > > ] +> +> > > } +> +> > > cxl region: cmd_create_region: created 1 region +> +> > > +> +> > > > Did the commit_store() crash stop reproducing with latest cxl/preview +> +> > > > branch? +> +> > > +> +> > > I missed the answer to this question. +> +> > > +> +> > > All of these changes are now in Linus' tree perhaps give that a try and +> +> > > post the debug log again? +> +> > +> +> > Hi Dan, +> +> > +> +> > I've moved onto looking at this one. +> +> > 1 HB, 2RP (to make it configure the HDM decoder in the QEMU HB, I'll tidy +> +> > that up +> +> > at some stage), 1 switch, 4 downstream switch ports each with a type 3 +> +> > +> +> > I'm not getting a crash, but can't successfully setup a region. +> +> > Upon adding the final target +> +> > It's failing in check_last_peer() as pos < distance. +> +> > Seems distance is 4 which makes me think it's using the wrong level of +> +> > the heirarchy for +> +> > some reason or that distance check is wrong. +> +> > Wasn't a good idea to just skip that step though as it goes boom - though +> +> > stack trace is not useful. 
+> +> +> +> Turns out really weird corruption happens if you accidentally back two +> +> type3 devices +> +> with the same memory device. Who would have thought it :) +> +> +> +> That aside ignoring the check_last_peer() failure seems to make everything +> +> work for this +> +> topology. I'm not seeing the crash, so my guess is we fixed it somewhere +> +> along the way. +> +> +> +> Now for the fun one. I've replicated the crash if we have +> +> +> +> 1HB 1*RP 1SW, 4SW-DSP, 4Type3 +> +> +> +> Now, I'd expect to see it not 'work' because the QEMU HDM decoder won't be +> +> programmed +> +> but the null pointer dereference isn't related to that. +> +> +> +> The bug is straight forward. Not all decoders have commit callbacks... +> +> Will send out +> +> a possible fix shortly. +> +> +> +For completeness I'm carrying this hack because I haven't gotten my head +> +around the right fix for check_last_peer() failing on this test topology. +> +> +diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c +> +index c49d9a5f1091..275e143bd748 100644 +> +--- a/drivers/cxl/core/region.c +> ++++ b/drivers/cxl/core/region.c +> +@@ -978,7 +978,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, +> +rc = check_last_peer(cxled, ep, cxl_rr, +> +distance); +> +if (rc) +> +- return rc; +> ++ // return rc; +> +goto out_target_set; +> +} +> +goto add_target; +I'm still carrying this hack and still haven't worked out the right fix. + +Suggestions welcome! If not I'll hopefully get some time on this +towards the end of the week. 
+ +Jonathan + diff --git a/results/classifier/105/vnc/351 b/results/classifier/105/vnc/351 new file mode 100644 index 000000000..c185eaef9 --- /dev/null +++ b/results/classifier/105/vnc/351 @@ -0,0 +1,14 @@ +vnc: 0.949 +mistranslation: 0.935 +device: 0.685 +network: 0.450 +graphic: 0.266 +semantic: 0.219 +boot: 0.168 +other: 0.136 +KVM: 0.063 +instruction: 0.039 +assembly: 0.019 +socket: 0.005 + +German keyboard vnc issue diff --git a/results/classifier/105/vnc/42613410 b/results/classifier/105/vnc/42613410 new file mode 100644 index 000000000..c04a83e7c --- /dev/null +++ b/results/classifier/105/vnc/42613410 @@ -0,0 +1,157 @@ +vnc: 0.400 +KVM: 0.381 +device: 0.342 +other: 0.332 +graphic: 0.330 +semantic: 0.327 +mistranslation: 0.314 +instruction: 0.307 +network: 0.284 +assembly: 0.283 +socket: 0.190 +boot: 0.187 + +[Qemu-devel] [PATCH, Bug 1612908] scripts: Add TCP endpoints for qom-* scripts + +From: Carl Allendorph <address@hidden> + +I've created a patch for bug #1612908. The current docs for the scripts +in the "scripts/qmp/" directory suggest that both unix sockets and +tcp endpoints can be used. The TCP endpoints don't work for most of the +scripts, with notable exception of 'qmp-shell'. This patch attempts to +refactor the process of distinguishing between unix path endpoints and +tcp endpoints to work for all of these scripts. + +Carl Allendorph (1): + scripts: Add ability for qom-* python scripts to target tcp endpoints + + scripts/qmp/qmp-shell | 22 ++-------------------- + scripts/qmp/qmp.py | 23 ++++++++++++++++++++--- + 2 files changed, 22 insertions(+), 23 deletions(-) + +-- +2.7.4 + +From: Carl Allendorph <address@hidden> + +The current code for QEMUMonitorProtocol accepts both a unix socket +endpoint as a string and a tcp endpoint as a tuple. Most of the scripts +that use this class don't massage the command line argument to generate +a tuple. 
This patch refactors qmp-shell slightly to reuse the existing +parsing of the "host:port" string for all the qom-* scripts. + +Signed-off-by: Carl Allendorph <address@hidden> +--- + scripts/qmp/qmp-shell | 22 ++-------------------- + scripts/qmp/qmp.py | 23 ++++++++++++++++++++--- + 2 files changed, 22 insertions(+), 23 deletions(-) + +diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell +index 0373b24..8a2a437 100755 +--- a/scripts/qmp/qmp-shell ++++ b/scripts/qmp/qmp-shell +@@ -83,9 +83,6 @@ class QMPCompleter(list): + class QMPShellError(Exception): + pass + +-class QMPShellBadPort(QMPShellError): +- pass +- + class FuzzyJSON(ast.NodeTransformer): + '''This extension of ast.NodeTransformer filters literal "true/false/null" + values in an AST and replaces them by proper "True/False/None" values that +@@ -103,28 +100,13 @@ class FuzzyJSON(ast.NodeTransformer): + # _execute_cmd()). Let's design a better one. + class QMPShell(qmp.QEMUMonitorProtocol): + def __init__(self, address, pretty=False): +- qmp.QEMUMonitorProtocol.__init__(self, self.__get_address(address)) ++ qmp.QEMUMonitorProtocol.__init__(self, address) + self._greeting = None + self._completer = None + self._pretty = pretty + self._transmode = False + self._actions = list() + +- def __get_address(self, arg): +- """ +- Figure out if the argument is in the port:host form, if it's not it's +- probably a file path. 
+- """ +- addr = arg.split(':') +- if len(addr) == 2: +- try: +- port = int(addr[1]) +- except ValueError: +- raise QMPShellBadPort +- return ( addr[0], port ) +- # socket path +- return arg +- + def _fill_completion(self): + for cmd in self.cmd('query-commands')['return']: + self._completer.append(cmd['name']) +@@ -400,7 +382,7 @@ def main(): + + if qemu is None: + fail_cmdline() +- except QMPShellBadPort: ++ except qmp.QMPShellBadPort: + die('bad port number in command-line') + + try: +diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py +index 62d3651..261ece8 100644 +--- a/scripts/qmp/qmp.py ++++ b/scripts/qmp/qmp.py +@@ -25,21 +25,23 @@ class QMPCapabilitiesError(QMPError): + class QMPTimeoutError(QMPError): + pass + ++class QMPShellBadPort(QMPError): ++ pass ++ + class QEMUMonitorProtocol: + def __init__(self, address, server=False, debug=False): + """ + Create a QEMUMonitorProtocol class. + + @param address: QEMU address, can be either a unix socket path (string) +- or a tuple in the form ( address, port ) for a TCP +- connection ++ or a TCP endpoint (string in the format "host:port") + @param server: server mode listens on the socket (bool) + @raise socket.error on socket connection errors + @note No connection is established, this is done by the connect() or + accept() methods + """ + self.__events = [] +- self.__address = address ++ self.__address = self.__get_address(address) + self._debug = debug + self.__sock = self.__get_sock() + if server: +@@ -47,6 +49,21 @@ class QEMUMonitorProtocol: + self.__sock.bind(self.__address) + self.__sock.listen(1) + ++ def __get_address(self, arg): ++ """ ++ Figure out if the argument is in the host:port form, if it's not it's ++ probably a file path. 
++ """ ++ addr = arg.split(':') ++ if len(addr) == 2: ++ try: ++ port = int(addr[1]) ++ except ValueError: ++ raise QMPShellBadPort ++ return ( addr[0], port ) ++ # socket path ++ return arg ++ + def __get_sock(self): + if isinstance(self.__address, tuple): + family = socket.AF_INET +-- +2.7.4 + diff --git a/results/classifier/105/vnc/659 b/results/classifier/105/vnc/659 new file mode 100644 index 000000000..604941934 --- /dev/null +++ b/results/classifier/105/vnc/659 @@ -0,0 +1,54 @@ +vnc: 0.687 +socket: 0.686 +device: 0.677 +instruction: 0.664 +graphic: 0.663 +network: 0.638 +boot: 0.634 +other: 0.618 +KVM: 0.590 +semantic: 0.519 +mistranslation: 0.500 +assembly: 0.498 + +Qemu6 regression causing disabled usb controller upon usbredir device_add +Description of problem: +I'm encountering a nagging issue with usbredir and a Windows guest, but although I did pinpoint the commit that caused the issue, I have a hard time understanding it. + +The issue occurs when two usbredir devices are added to a guest Windows VM (any VM installed from the official ISO will reproduce the issue). When the second device is added, the UHCI USB controller is disabled by Windows with an error code 43 (can be seen in the USB adapters section of the device manager). +Steps to reproduce: +1. Take or create an installed Windows image and run it with `qemu-system-x86_64 -M pc -cpu host,hv_time,hv_synic,hv_stimer,hv_vpindex -enable-kvm -m 4096 -device piix3-usb-uhci,id=uhci -qmp tcp:127.0.0.1:4444,server=on,wait=off,ipv4 -drive <disk-parameters> --snapshot` (snapshot is not necessary but is useful for repeated testing to avoid side effects, as the USB status sometimes lingers after a shutdown, not sure why) +2. Open the Windows device manager +3. 
Add devices via [this qmp python script](/uploads/5f2f9240dce1b55ceb148b32f3d6073c/qmp-usb-adds.py) +Additional information: +The commit causing the issue (everything works well when reverting it) is 7bed89958bfbf40df9ca681cefbdca63abdde39d : device_core: use `drain_call_rcu` in in `qmp_device_add`. + +I narrowed the problem to the unlock of the iothread: the minimum `drain_call_rcu` code that still reproduces the issue is: + +```c +void drain_call_rcu(void) +{ + bool locked = qemu_mutex_iothread_locked(); + if (locked) { + qemu_mutex_unlock_iothread(); + } + usleep(50000); // time spent draining the rcu on a few slow cases. + + if (locked) { + qemu_mutex_lock_iothread(); + } +} +``` + +About the qemu command line: the hv parameters are needed to trigger the issue; I do not know why. + +I tried to find what was able to take advantage of the free iothread lock, but the only thing I got so far is that the iothread lock is not taken during the first drain (from the first device add), but is taken many times during the second drain by physmem's IOs (from kvm-accel, but at this point, I'm a bit lost). + +I'm looking for pointers as to what could trigger the issue in order to narrow it down, as, so far, I do not understand exactly what causes the regression. +I am unsure of how this would even translate to a Linux VM, so I didn't try to reproduce the issue with one. + +With the attached [reproduction python script](/uploads/5f2f9240dce1b55ceb148b32f3d6073c/qmp-usb-adds.py), the issue triggers nearly 100% of the time. + +Note 1: Related to #650 as the commit causing the regression is the same, although the cause is probably different since the RCU is not involved. + +Note 2: This is a retranscription of [this ml report](https://lore.kernel.org/qemu-devel/20210930134844.f4kh72vpeknr2vmk@gmail.com/), as I wasn't aware that the correct way to report issues is now through GitLab. 
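For readers who cannot fetch the attached script, the hot-add step in this report can be approximated as follows. This is a minimal sketch and not the reporter's script: it assumes the QMP endpoint from the command line in the reproduction steps (`127.0.0.1:4444`) and the `uhci` controller id given there, while the chardev names (`redir0`, `redir1`) and device ids (`usbredir0`, `usbredir1`) are hypothetical placeholders for chardevs that would have to exist already.

```python
import json
import socket


def qmp_message(command, arguments=None):
    """Encode a single QMP command as one JSON line."""
    message = {"execute": command}
    if arguments is not None:
        message["arguments"] = arguments
    return (json.dumps(message) + "\n").encode("utf-8")


def usbredir_device_add(index, bus="uhci.0"):
    """Build a device_add command for one usb-redir device.

    The chardev id "redir<index>" and device id "usbredir<index>" are
    hypothetical names chosen for this sketch.
    """
    return qmp_message("device_add", {
        "driver": "usb-redir",
        "chardev": "redir%d" % index,
        "id": "usbredir%d" % index,
        "bus": bus,
    })


def add_two_devices(host="127.0.0.1", port=4444):
    """Negotiate QMP capabilities, then hot-add two usb-redir devices."""
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("r", encoding="utf-8")
        reader.readline()                      # greeting sent by QEMU
        sock.sendall(qmp_message("qmp_capabilities"))
        reader.readline()                      # reply to qmp_capabilities
        for index in range(2):
            sock.sendall(usbredir_device_add(index))
            # The next line may be the command reply or an async event.
            print(reader.readline().strip())
```

Calling `add_two_devices()` against a guest started as in step 1 would issue the two `device_add` commands back to back, which is the sequence the report associates with the second `drain_call_rcu` call.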
diff --git a/results/classifier/105/vnc/685 b/results/classifier/105/vnc/685 new file mode 100644 index 000000000..953b4b5fe --- /dev/null +++ b/results/classifier/105/vnc/685 @@ -0,0 +1,82 @@ +vnc: 0.871 +other: 0.844 +semantic: 0.811 +socket: 0.810 +device: 0.787 +graphic: 0.758 +assembly: 0.757 +KVM: 0.733 +instruction: 0.721 +boot: 0.701 +network: 0.490 +mistranslation: 0.468 + +QEMU Segmentation fault - Xen / Ubuntu 18.04 +Description of problem: +See notes below. +Steps to reproduce: +See notes below. +Additional information: +* The error is very rare. +* The VMs have been created with `xl create` (Xen utility). +* The error has been found with _coredump_ ([core.qemu-system-i38.0.abb1047980ee4143937dcce7b8da9e60.16892.1634806267000000.lz4](/uploads/a90e21a2e14c9ebba07585034de25b1a/core.qemu-system-i38.0.abb1047980ee4143937dcce7b8da9e60.16892.1634806267000000.lz4)): +```bash +$ sudo coredumpctl info 16892 + PID: 16892 (qemu-system-i38) + UID: 0 (root) + GID: 0 (root) + Signal: 11 (SEGV) + Timestamp: Thu 2021-10-21 11:51:07 MSK (17min ago) + Command Line: /usr/bin/qemu-system-i386 -xen-domid 2679 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-2679,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp + Executable: /usr/bin/qemu-system-i386 + Control Group: /system.slice/ptms.sandbox.sandbox-creator.service + Unit: ptms.sandbox.sandbox-creator.service + Slice: system.slice + Boot ID: abb1047980ee4143937dcce7b8da9e60 + Machine ID: bdce82649a9d4d9db192a692b330943f + Hostname: ptms-7 + Storage: /var/lib/systemd/coredump/core.qemu-system-i38.0.abb1047980ee4143937dcce7b8da9e60.16892.1634806267000000.lz4 + Message: Process 16892 (qemu-system-i38) of user 0 dumped core. 
+ + Stack trace of thread 16892: + #0 0x00007f1c6d33ca5f __memmove_avx_unaligned_erms (libc.so.6) + #1 0x00005586abeae8bf iov_from_buf_full (qemu-system-i386) + #2 0x00005586abe03d46 n/a (qemu-system-i386) + #3 0x00005586abdd17ad n/a (qemu-system-i386) + #4 0x00005586abeac93c n/a (qemu-system-i386) + #5 0x00007f1c6d2067b0 n/a (libc.so.6) + #6 0x00005586abeb89bd n/a (qemu-system-i386) + #7 0x00005586abeaaf87 aio_bh_poll (qemu-system-i386) + #8 0x00005586abe9a45e aio_dispatch (qemu-system-i386) + #9 0x00005586abeaad9e n/a (qemu-system-i386) + #10 0x00007f1c6fd7f537 g_main_context_dispatch (libglib-2.0.so.0) + #11 0x00005586abeb5caa main_loop_wait (qemu-system-i386) + #12 0x00005586abca092d qemu_main_loop (qemu-system-i386) + #13 0x00005586ab9f508e main (qemu-system-i386) + #14 0x00007f1c6d1cfbf7 __libc_start_main (libc.so.6) + #15 0x00005586ab9f97fa _start (qemu-system-i386) + + Stack trace of thread 16932: + #0 0x00007f1c6d2c9639 syscall (libc.so.6) + #1 0x00005586abe9de1b qemu_event_wait (qemu-system-i386) + #2 0x00005586abea5e28 n/a (qemu-system-i386) + #3 0x00005586abe9d0b6 n/a (qemu-system-i386) + #4 0x00007f1c6d5a66db start_thread (libpthread.so.0) + #5 0x00007f1c6d2cf71f __clone (libc.so.6) + + Stack trace of thread 16957: + #0 0x00007f1c6d5b0474 __libc_read (libpthread.so.0) + #1 0x00007f1c71f67777 n/a (libxenstore.so.3.0) + #2 0x00007f1c71f6784d n/a (libxenstore.so.3.0) + #3 0x00007f1c71f67b61 n/a (libxenstore.so.3.0) + #4 0x00007f1c6d5a66db start_thread (libpthread.so.0) + #5 0x00007f1c6d2cf71f __clone (libc.so.6) + + Stack trace of thread 16958: + #0 0x00007f1c6d5b0474 __libc_read (libpthread.so.0) + #1 0x00007f1c71f67777 n/a (libxenstore.so.3.0) + #2 0x00007f1c71f6784d n/a (libxenstore.so.3.0) + #3 0x00007f1c71f67b61 n/a (libxenstore.so.3.0) + #4 0x00007f1c6d5a66db start_thread (libpthread.so.0) + #5 0x00007f1c6d2cf71f __clone (libc.so.6) +``` diff --git a/results/classifier/105/vnc/697197 b/results/classifier/105/vnc/697197 new file mode 100644 index 
000000000..53975f513 --- /dev/null +++ b/results/classifier/105/vnc/697197 @@ -0,0 +1,265 @@ +vnc: 0.688 +mistranslation: 0.646 +KVM: 0.578 +graphic: 0.440 +instruction: 0.405 +network: 0.394 +semantic: 0.388 +device: 0.381 +assembly: 0.358 +other: 0.351 +boot: 0.310 +socket: 0.290 + +Empty password allows access to VNC in libvirt + +The help in the /etc/libvirt/qemu.conf states + +"To allow access without passwords, leave this commented out. An empty +string will still enable passwords, but be rejected by QEMU +effectively preventing any use of VNC." + +yet setting: + +vnc_password="" + +allows access to the vnc console without any password prompt just as if it is hashed out completely. + +ProblemType: Bug +DistroRelease: Ubuntu 10.10 +Package: libvirt-bin 0.8.3-1ubuntu14 +ProcVersionSignature: Ubuntu 2.6.35-24.42-server 2.6.35.8 +Uname: Linux 2.6.35-24-server x86_64 +Architecture: amd64 +Date: Tue Jan 4 12:18:35 2011 +InstallationMedia: Ubuntu-Server 10.04.1 LTS "Lucid Lynx" - Release amd64 (20100816.2) +ProcEnviron: + LANG=en_GB.UTF-8 + SHELL=/bin/bash +SourcePackage: libvirt + + + +Thanks for taking the time to report this bug and helping to make Ubuntu better. + +The feature itself may be low priority, but getting the comment in the qemu.conf file fixed so that no admins get caught by surprise seems like high priority. I see no activity in the upstream bug yet, though, so will wait to see what feedback happens there. + +From the libvirt list + +"The behaviour you're seeing is a bug recently introduced in +> the QEMU monitor password command handling by QEMU GIT repo +> changeset 52c18be9e99dabe295321153fda7fce9f76647ac. +> " + + + +On 7 January 2011 14:41, Serge Hallyn <email address hidden> wrote: +> ** Changed in: libvirt (Ubuntu) +> Assignee: (unassigned) => Serge Hallyn (serge-hallyn) +> +> -- +> You received this bug notification because you are a direct subscriber +> of the bug. 
+> https://bugs.launchpad.net/bugs/697197 +> +> Title: +> Empty password allows access to VNC in libvirt +> + + + +-- +Neil Wilson + +Libvirt is in the clear on this one. It is a mild security issue introduced into QEMU. + +When I say in the clear, the libvirt guys think they're in the clear. + +Checked the qemu source and there is no fix for this problem. Could be a change of behaviour. + +CVE issued putting the onus squarely on qemu's shoulders. + +The solution to this problem is to revert commit 52c18be9e99dabe295321153fda7fce9f76647ac in the main Qemu archive. + + + + + +Installed patched build onto Maverick server. vnc_listen set to 0.0.0.0 in /etc/libvirt/qemu.conf + +Set vnc_password="" with vnc_tls=1 in /etc/libvirt/qemu.conf and confirmed that the launched server now rejects authentication for any password, whereas it turned off authentication and encryption completely before. + +Hashed out vnc_password and left vnc_tls=1 in /etc/libvirt/qemu.conf. Confirmed that the server uses anonymous auth with TLS. Allows the user on without a password. qemu-kvm launched with -vnc 0.0.0.0:0,tls,x509=/etc/pki/libvirt-vnc + +Hashed out vnc_tls=1. Confirmed server allows direct access to VNC. qemu-kvm launched with -vnc 0.0.0.0:0 + +Set vnc_password="". Confirmed server rejects authentication for any password, with no encryption. Again previously it had just let the user on. qemu-kvm launched with -vnc 0.0.0.0:0,password + +Set vnc_password="password". Confirmed server accepts authentication with that password. qemu-kvm launched with -vnc 0.0.0.0:0,password + + + +Please sponsor for upload + + + +This fault probably affects all the current versions of qemu-kvm. It's present in 0.11 and the current qemu master branch. + +Looks good, thanks for doing this, Neil. + +I'm going to update it just slightly, as this debdiff will need to go through the security queue, since there's an associated CVE. 
I'll prep that upload and the security team will sponsor it into maverick-security. + +I'll get it uploaded to natty now. + +The last thing I need you to do is to email your patch to the qemu-devel mailing list. The maintainers do not accept patches solely attached to bugs in Launchpad. Their processes require that you email the patch to the mailing list. Sorry for the run-around. Cheers! + +@security team, + +Could you please sponsor this to the maverick-security queue? Thanks! + +The patch needs to go into Lucid as well. + +Marking the libvirt tasks "invalid", as upstream libvirt has correctly pointed out that this bug is in qemu, and not libvirt: + * https://bugzilla.redhat.com/show_bug.cgi?id=667097 + +Uploading to Natty now... + +Confirmed that the affected code is also in Lucid. Adding a task for that, and attaching a debdiff for lucid-security too. + +This bug was fixed in the package qemu-kvm - 0.13.0+noroms-0ubuntu13 + +--------------- +qemu-kvm (0.13.0+noroms-0ubuntu13) natty; urgency=low + + [ Neil Wilson <email address hidden> ] + * SECURITY UPDATE: Setting VNC password to empty string silently + disables all authentication (LP: #697197) + - debian/patches/697197-fix-vnc-password-semantics.patch: Reverses the + change introduced in Qemu by git commit 52c18be9 + - CVE: 2011-0011 + + [ Dustin Kirkland ] + * Updated patch to reflect the move of vnc.c to ui/vnc.c + -- Dustin Kirkland <email address hidden> Fri, 11 Feb 2011 09:53:19 -0600 + +Attaching Lucid debdiff. + +Thanks for preparing the debdiffs! It looks like karmic is vulnerable too, so we'll need that as well. I'll update the debdiffs to use proper DEP-3 and fix up the formatting of the changelogs a bit ("CVE-" vs "CVE: "), and get these building. + +Attaching debdiff for karmic. 
+ +This bug was fixed in the package qemu-kvm - 0.12.5+noroms-0ubuntu7.2 + +--------------- +qemu-kvm (0.12.5+noroms-0ubuntu7.2) maverick-security; urgency=low + + [ Dustin Kirkland ] + * SECURITY UPDATE: Setting VNC password to empty string silently + disables all authentication (LP: #697197). + - debian/patches/697197-fix-vnc-password-semantics.patch: Reverses the + change introduced in Qemu by git commit 52c18be9, thanks to Neil Wilson. + - CVE-2011-0011 + + [ Kees Cook ] + * debian/rules: disable parallel build; fix FTBFS. + -- Kees Cook <email address hidden> Fri, 11 Feb 2011 15:52:12 -0800 + +This bug was fixed in the package qemu-kvm - 0.12.3+noroms-0ubuntu9.4 + +--------------- +qemu-kvm (0.12.3+noroms-0ubuntu9.4) lucid-security; urgency=low + + * SECURITY UPDATE: Setting VNC password to empty string silently + disables all authentication (LP: #697197) + - debian/patches/697197-fix-vnc-password-semantics.patch: Reverses the + change introduced in Qemu by git commit 52c18be9, thanks to Neil Wilson. + - CVE-2011-0011 + -- Dustin Kirkland <email address hidden> Fri, 11 Feb 2011 09:57:30 -0600 + +This bug was fixed in the package qemu-kvm - 0.11.0-0ubuntu6.4 + +--------------- +qemu-kvm (0.11.0-0ubuntu6.4) karmic-security; urgency=low + + * SECURITY UPDATE: Setting VNC password to empty string silently + disables all authentication (LP: #697197) + - debian/patches/697197-fix-vnc-password-semantics.patch: Reverses the + change introduced in Qemu by git commit 52c18be9, thanks to Neil Wilson. + - CVE-2011-0011 + -- Dustin Kirkland <email address hidden> Fri, 11 Feb 2011 17:46:26 -0600 + +Nothing left to do, unsubscribing ubuntu-security-sponsors. + +Ubuntu 12.04 is also affected + +Description of problem: + +The help for 'vnc_password' in qemu.conf states "An empty string will still enable passwords, but be rejected by QEMU effectively preventing any use of VNC.". 
+ +Yet if you set vnc_password="" then you can access the VNC console without any password prompt at all - just as you can if the entry is hashed out. + +Version-Release number of selected component (if applicable): + +libvirtd (libvirt) 0.8.3 + + +How reproducible: + +Every time by configuration + +Steps to Reproduce: +1. Create a VNC console without a password. +2. Set vnc_password="" in /etc/libvirt/qemu.conf +3. Start up a guest and access the VNC console with a client. + +Actual results: + +You get straight into the console with no prompts. + + +Expected results: + +Should have come up with a prompt and rejected the access. Or the instructions in the qemu.conf file need changing to take account of the current behaviour. + +Additional info: + +Similarly if you set the passwd attribute to '' in the vnc graphics XML stanza. + +This is not a libvirt bug. This is caused by a flaw in particular QEMU version you are using, which silently disables auth when the password is set to "". This bug was introduced in QEMU in this bogus commit + +commit 52c18be9e99dabe295321153fda7fce9f76647ac +Author: Zachary Amsden <email address hidden> +Date: Thu Jul 30 00:15:01 2009 -1000 + + When using stdio monitor and VNC display, one can set or clear a VNC password; this should set or turn off VNC authentication as well. + +Description of problem: +The semantics of the ',password' option to -vnc are that it enables the VNC auth scheme. If the VNC server password is unset or empty string, all attempts to authenticate with the server will be explicitly blocked. + +This allows applications to enable and selectively allow access for a period of time, before clearing the password again to prevent further access. + +Upstream changes have introduced a flaw by disabling all authentication when the password was cleared with upstream commit [1]. 
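The intended `,password` semantics described above (the option enables the VNC auth scheme, and an unset or empty password blocks every attempt instead of turning authentication off) can be modelled in a few lines. This is only a sketch in Python, not QEMU's C implementation: the names `VncServer`, `set_password` and `authenticate` are hypothetical, and the plain string comparison stands in for the real VNC challenge-response exchange.

```python
import hmac
from typing import Optional


class VncServer:
    """Toy model of the ',password' semantics; hypothetical, not QEMU code."""

    def __init__(self, password_option: bool) -> None:
        # True when the display was started with '-vnc ...,password'.
        self.auth_enabled = password_option
        self.password: Optional[str] = None

    def set_password(self, password: Optional[str]) -> None:
        # Setting or clearing the password never switches the auth scheme off;
        # that is the behaviour the reverted commit changed.
        self.password = password

    def authenticate(self, attempt: str) -> bool:
        if not self.auth_enabled:
            # No ',password' option: the console is open to everyone.
            return True
        if not self.password:
            # Unset or empty password: every attempt is explicitly rejected.
            return False
        # Simplification: real VNC auth is challenge-response, not a
        # plaintext comparison.
        return hmac.compare_digest(attempt.encode(), self.password.encode())
```

Under this model, an application can call `set_password("secret")` to open a window of access and `set_password("")` to close it again, without the server ever falling back to unauthenticated access.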
+ +[1] http://www.qemu.com/qemu.git/commit/?id=52c18be9e99dabe295321153fda7fce9f76647ac + +Created attachment 475841 +Fix to vnc password semantics + +This patch corrects the flaw in qemu-kvm + +Please see http://launchpad.net/bugs/697197 for testing performed. + +Created qemu tracking bugs for this issue + +Affects: fedora-all [bug 680886] + +This issue has been addressed in following products: + + Red Hat Enterprise Linux 6 + +Via RHSA-2011:0345 https://rhn.redhat.com/errata/RHSA-2011-0345.html + +Statement: + +This issue does not affect versions of kvm package as shipped with Red Hat Enterprise Linux 5. + diff --git a/results/classifier/105/vnc/697510 b/results/classifier/105/vnc/697510 new file mode 100644 index 000000000..a0c3cbc6d --- /dev/null +++ b/results/classifier/105/vnc/697510 @@ -0,0 +1,165 @@ +vnc: 0.977 +instruction: 0.975 +semantic: 0.973 +device: 0.966 +assembly: 0.965 +socket: 0.963 +other: 0.960 +boot: 0.955 +mistranslation: 0.952 +graphic: 0.952 +KVM: 0.939 +network: 0.924 + +Machine shut off after tons of lsi_scsi: error: MSG IN data too long + +The problem mostly happens during our backup, syslog does not have any problematic messages. + +Host is Ubuntu 10.10 x86 64 bits +Guest is Windows 2003 Server 32 bits +Using Virtio and Red Hat driver I get a STOP screen 0x100000d1 and machine either reboot, stay frozen or shut off. 
+Using SCSI the machine shuts off and I get tons of message on stdout; + +LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 3500 -smp 4,sockets=4,cores=1,threads=1 -name BMSRV0101 -uuid 6ccbb5fa-5880-a1ab-3b7a-fb7ccc7c8ccf -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/BMSRV0101.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=localtime -boot c -device lsi,id=scsi0,bus=pci.0,addr=0x4 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/dev/vgUbuntu/vmBMSRV0101,if=none,id=drive-scsi0-0-0,boot=on,format=raw -device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0 -device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:5d:7b:07,bus=pci.0,addr=0x3 -net tap,fd=63,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device usb-host,hostbus=002,hostaddr=005,id=hostdev0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 +char device redirected to /dev/pts/0 +pci_add_option_rom: failed to find romfile "pxe-virtio.bin" +husb: open device 2.5 +husb: config #1 need -1 +husb: 1 interfaces claimed for configuration 1 +husb: grabbed usb device 2.5 +husb: config #1 need 1 +husb: 1 interfaces claimed for configuration 1 + +lsi_scsi: error: Unimplemented message 0x00 +(x8) + +lsi_scsi: error: MSG IN data too long +lsi_scsi: error: Unimplemented message 0x00 +(x760) + +lsi_scsi: error: MSG IN data too long +lsi_scsi: error: MSG IN data too long +kvm: /build/buildd/qemu-kvm-0.12.5+noroms/hw/lsi53c895a.c:512: lsi_do_dma: Assertion `s->current' failed. + + +I can include minidump file if needed. +I am currently using IDE and it seems OK. 
+ +On Wed, Jan 5, 2011 at 5:05 AM, TiCPU <email address hidden> wrote: +> Using Virtio and Red Hat driver I get a STOP screen 0x100000d1 and machine either reboot, stay frozen or shut off. + +Here the minidump would be useful and we should get in touch with the +person that maintains the virtio-blk Windows driver. + +> Using SCSI the machine shuts off and I get tons of message on stdout; +[...] +> lsi_scsi: error: Unimplemented message 0x00 +> (x8) +> +> lsi_scsi: error: MSG IN data too long +> lsi_scsi: error: Unimplemented message 0x00 +> (x760) +> +> lsi_scsi: error: MSG IN data too long +> lsi_scsi: error: MSG IN data too long +> kvm: /build/buildd/qemu-kvm-0.12.5+noroms/hw/lsi53c895a.c:512: lsi_do_dma: Assertion `s->current' failed. + +Looks like the LSI SCSI device emulation is getting out of sync with +the guest's device driver. + +Can you give more details or a test case that reproduces these +problems? Which backup software are you using and is it known to do +special purpose SCSI commands? + +Stefan + +On Thu, Jan 6, 2011 at 9:43 AM, Stefan Hajnoczi <email address hidden> wrote: +> On Wed, Jan 5, 2011 at 5:05 AM, TiCPU <email address hidden> wrote: +>> Using Virtio and Red Hat driver I get a STOP screen 0x100000d1 and machine either reboot, stay frozen or shut off. +> +> Here the minidump would be useful and we should get in touch with the +> person that maintains the virtio-blk Windows driver. + +Vadim, do you maintain the virtio-blk Windows driver? Perhaps you +have some ideas on how to debug this? + +Stefan + +Does "Red Hat driver" mean the driver from rhel virtio-win.prm? + +Some new information: the IDE bus works for a while, then the PC slows down, and finally the backup fails and most of the I/O freezes. + +The underlying device is an Iomega REV 120 USB appearing as a CD-ROM, /dev/sr0 and using FAT32 with 32k clusters. 
+The backup program is Symantec Backup Exec 11d + +I worked around the problem by formatting the REV 120 back to UDF and sharing it via Samba, BackupExec now backups to the REV via network. + +I tried iSCSI and had problems too. + +I attached all the minidumps, they all look the same. + +Can you try viostor with sptd (scsi pass through direct) disabled? + +Triaging old bug tickets ... do you still have this problem with the latest version of QEMU, or could we close this bug nowadays? + +[Expired for QEMU because there has been no activity for 60 days.] + +We found a reproducer during fuzzing: + +``` +qemu-system-x86_64: hw/scsi/lsi53c895a.c:624: lsi_do_dma: Assertion `s->current' failed. +``` + +To reproduce run the QEMU with the following command line: +``` +qemu-system-x86_64 -cdrom hypertrash.iso -nographic -m 100 -enable-kvm -net none -device ich9-usb-ehci1 -device usb-tablet -device lsi53c810,id=scsi0 -drive file=hda.img,if=none,format=raw,discard=unmap,cache=none,id=someid -device scsi-hd,drive=someid,bus=scsi0.0 +``` + +QEMU Version: +``` +# qemu-5.0.0 +$ ./configure --target-list=x86_64-softmmu --enable-sanitizers; make +$ x86_64-softmmu/qemu-system-x86_64 --version +QEMU emulator version 5.0.0 +Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers +``` + +To create disk image run: +``` +dd if=/dev/zero of=hda.img bs=1024 count=1024 +``` + +Here is a qtest reproducer: + +cat << EOF | ./i386-softmmu/qemu-system-i386 -nographic -M q35,accel=qtest -qtest stdio -drive if=none,id=drive0,file=null-co://,file.read-zeroes=on,format=raw -device lsi53c895a,id=scsi0 -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 -monitor none -serial none +outl 0xcf8 0x80001814 +outl 0xcfc 0xe1068000 +outl 0xcf8 0x80001818 +outl 0xcf8 0x80001804 +outw 0xcfc 0x7 +outl 0xcf8 0x80002010 +write 0xe106802e 0x1 0xff +write 0xe106802f 0x1 0xff +EOF + +With -trace lsi\*: +... 
+[R +0.037396] write 0xe106802e 0x1 0xff +15257@1594419708.889733:lsi_reg_write Write reg DSP2 0x2e = 0xff +OK +[S +0.037420] OK +[R +0.037434] write 0xe106802f 0x1 0xff +15257@1594419708.889814:lsi_reg_write Write reg DSP3 0x2f = 0xff +15257@1594419708.889862:lsi_execute_script SCRIPTS dsp=0xffff0000 opcode 0x105e8b06 arg 0x89084e8b +qemu-system-i386: /home/alxndr/Development/qemu/hw/scsi/lsi53c895a.c:624: void lsi_do_dma(LSIState *, int): Assertion `s->current' failed. + + + + +This is an automated cleanup. This bug report has been moved to QEMU's +new bug tracker on gitlab.com and thus gets marked as 'expired' now. +Please continue with the discussion here: + + https://gitlab.com/qemu-project/qemu/-/issues/84 + + diff --git a/results/classifier/105/vnc/70 b/results/classifier/105/vnc/70 new file mode 100644 index 000000000..f7806ec42 --- /dev/null +++ b/results/classifier/105/vnc/70 @@ -0,0 +1,14 @@ +vnc: 0.870 +network: 0.819 +device: 0.778 +mistranslation: 0.767 +graphic: 0.256 +other: 0.210 +boot: 0.160 +semantic: 0.089 +socket: 0.068 +assembly: 0.014 +instruction: 0.011 +KVM: 0.009 + +hda sound capture broken with VNC diff --git a/results/classifier/105/vnc/723 b/results/classifier/105/vnc/723 new file mode 100644 index 000000000..b373e5825 --- /dev/null +++ b/results/classifier/105/vnc/723 @@ -0,0 +1,44 @@ +vnc: 0.818 +mistranslation: 0.804 +boot: 0.791 +other: 0.783 +device: 0.778 +instruction: 0.767 +KVM: 0.765 +graphic: 0.763 +semantic: 0.754 +assembly: 0.752 +network: 0.744 +socket: 0.743 + +multiple displays VGA + qxl forces Spice mouse-mode=server and breaks usb-tablet/seamless mode +Description of problem: +qxl causes a totally unexpected mouse conflict with the default VGA in OSX Catalina and newer guests using AppleVirtualGraphics.kext + +usb-tablet is unusable - only clicks are received +usb-mouse works but grabs focus +Steps to reproduce: +1. install and run OSX guest +2. connect to Spice port +3. can't move mouse if usb-tablet is used. 
usb-mouse pointer but is grabbed +4. removing qxl fixed the issue for me. Mouse is seamless/not grabbed now +5. added -spice agent-mouse=on just in case +Additional information: +qmp from broken shows mouse-mode server. Working guests show mouse-mode client + +``` +{ "execute": "query-spice" } +... "mouse-mode": "server"}} +``` +- spice works with multiple displays in OSX if both are VGA but I had the same focus problem, will need to recheck because Qemu 6.1 seems stuck on mouse-mode=server. + + +Working VGA +``` +/usr/bin/qemu-system-x86_64 -name macos-big-sur,process=macos-big-sur -pidfile macos-big-sur/macos-big-sur.pid -enable-kvm -machine q35,smm=off,vmport=off -device isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal\(c\)AppleComputerInc -no-hpet -global kvm-pit.lost_tick_policy=discard -cpu host,kvm=on,vendor=GenuineIntel,+hypervisor,+invtsc,+kvm_pv_eoi,+kvm_pv_unhalt -smp cores=2,threads=1,sockets=1 -m 8G -device virtio-balloon -smbios type=2,manufacturer="Wimpys World",product=Quickemu,version=2.3.1,serial=jvzclfjbeyq.pbz,location=wimpysworld.com,asset=macos-big-sur -device VGA,vgamem_mb=128 -display none -device usb-ehci,id=input -device usb-kbd,bus=input.0 -device usb-tablet,bus=input.0 -rtc base=localtime,clock=host,driftfix=slew -spice disable-ticketing=on,agent-mouse=on,port=5930 -device virtio-serial-pci -chardev socket,id=agent0,path=macos-big-sur/macos-big-sur-agent.sock,server=on,wait=off -device virtserialport,chardev=agent0,name=org.qemu.guest_agent.0 -device virtio-rng-pci,rng=rng0 -object rng-random,id=rng0,filename=/dev/urandom -chardev socket,id=monitor0,path=macos-big-sur/macos-big-sur-monitor.sock,server=on,wait=off -mon chardev=monitor0,id=monitor,mode=control -monitor none -serial mon:stdio -audiodev spice,id=audio0 -device ich9-intel-hda -device hda-duplex,audiodev=audio0 -device virtio-net,netdev=nic -netdev user,hostname=macos-big-sur,hostfwd=tcp::22220-:22,id=nic -global driver=cfi.pflash01,property=secure,value=on 
-drive if=pflash,format=raw,unit=0,file=macos-big-sur/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=macos-big-sur/OVMF_VARS-1024x768.fd -device ahci,id=ahci -device ide-hd,bus=ahci.0,drive=BootLoader,bootindex=0 -drive id=BootLoader,if=none,format=qcow2,file=macos-big-sur/OpenCore.qcow2 -device virtio-blk-pci,drive=SystemDisk -drive id=SystemDisk,if=none,format=qcow2,file=macos-big-sur/disk.qcow2 -device qemu-xhci,id=spicepass -chardev spicevmc,id=usbredirchardev1,name=usbredir -device usb-redir,chardev=usbredirchardev1,id=usbredirdev1 -chardev spicevmc,id=usbredirchardev2,name=usbredir -device usb-redir,chardev=usbredirchardev2,id=usbredirdev2 -chardev spicevmc,id=usbredirchardev3,name=usbredir -device usb-redir,chardev=usbredirchardev3,id=usbredirdev3 -device usb-ccid -chardev spicevmc,id=ccid,name=smartcard -device ccid-card-passthru,chardev=ccid -device virtio-serial-pci -chardev spiceport,id=webdav0,name=org.spice-space.webdav.0 -device virtserialport,chardev=webdav0,name=org.spice-space.webdav.0 -fsdev local,id=fsdev0,path=/home/jmorrison/Public,security_model=mapped-xattr -device virtio-9p-pci,fsdev=fsdev0,mount_tag=Public-jmorrison +``` + +Broken usb-tablet qxl +``` +/usr/bin/qemu-system-x86_64 -name macos-big-sur,process=macos-big-sur -pidfile macos-big-sur/macos-big-sur.pid -enable-kvm -machine q35,smm=off,vmport=off -device isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal\(c\)AppleComputerInc -no-hpet -global kvm-pit.lost_tick_policy=discard -cpu host,kvm=on,vendor=GenuineIntel,+hypervisor,+invtsc,+kvm_pv_eoi,+kvm_pv_unhalt -smp cores=2,threads=1,sockets=1 -m 8G -device virtio-balloon -smbios type=2,manufacturer="Wimpys World",product=Quickemu,version=2.3.1,serial=jvzclfjbeyq.pbz,location=wimpysworld.com,asset=macos-big-sur -device qxl -display none -device usb-ehci,id=input -device usb-kbd,bus=input.0 -device usb-tablet,bus=input.0 -rtc base=localtime,clock=host,driftfix=slew -spice disable-ticketing=on,port=5930 -device 
virtio-serial-pci -chardev socket,id=agent0,path=macos-big-sur/macos-big-sur-agent.sock,server=on,wait=off -device virtserialport,chardev=agent0,name=org.qemu.guest_agent.0 -device virtio-rng-pci,rng=rng0 -object rng-random,id=rng0,filename=/dev/urandom -chardev socket,id=monitor0,path=macos-big-sur/macos-big-sur-monitor.sock,server=on,wait=off -mon chardev=monitor0,id=monitor,mode=control -monitor none -serial mon:stdio -audiodev spice,id=audio0 -device ich9-intel-hda -device hda-duplex,audiodev=audio0 -device virtio-net,netdev=nic -netdev user,hostname=macos-big-sur,hostfwd=tcp::22220-:22,id=nic -global driver=cfi.pflash01,property=secure,value=on -drive if=pflash,format=raw,unit=0,file=macos-big-sur/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=macos-big-sur/OVMF_VARS-1024x768.fd -device ahci,id=ahci -device ide-hd,bus=ahci.0,drive=BootLoader,bootindex=0 -drive id=BootLoader,if=none,format=qcow2,file=macos-big-sur/OpenCore.qcow2 -device virtio-blk-pci,drive=SystemDisk -drive id=SystemDisk,if=none,format=qcow2,file=macos-big-sur/disk.qcow2 -device qemu-xhci,id=spicepass -chardev spicevmc,id=usbredirchardev1,name=usbredir -device usb-redir,chardev=usbredirchardev1,id=usbredirdev1 -chardev spicevmc,id=usbredirchardev2,name=usbredir -device usb-redir,chardev=usbredirchardev2,id=usbredirdev2 -chardev spicevmc,id=usbredirchardev3,name=usbredir -device usb-redir,chardev=usbredirchardev3,id=usbredirdev3 -device usb-ccid -chardev spicevmc,id=ccid,name=smartcard -device ccid-card-passthru,chardev=ccid -device virtio-serial-pci -chardev spiceport,id=webdav0,name=org.spice-space.webdav.0 -device virtserialport,chardev=webdav0,name=org.spice-space.webdav.0 -fsdev local,id=fsdev0,path=/home/jmorrison/Public,security_model=mapped-xattr -device virtio-9p-pci,fsdev=fsdev0,mount_tag=Public-jmorrison +``` diff --git a/results/classifier/105/vnc/759 b/results/classifier/105/vnc/759 new file mode 100644 index 000000000..daed244bf --- /dev/null +++ 
b/results/classifier/105/vnc/759 @@ -0,0 +1,25 @@ +vnc: 0.947 +graphic: 0.889 +device: 0.874 +network: 0.863 +instruction: 0.745 +semantic: 0.683 +mistranslation: 0.524 +boot: 0.359 +socket: 0.286 +KVM: 0.254 +other: 0.219 +assembly: 0.126 + +Copy&Paste does not work on VNC +Description of problem: +Cannot copy&paste between host and guest when vnc is used (gtk works fine). +Steps to reproduce: +1. Build qemu 6.2-rc2 using the following `./configure` options: +``` +--prefix=$HOME/.bin --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-vte --enable-xkbcommon --enable-sdl --enable-spice --enable-spice-protocol --enable-virglrenderer --enable-opengl --enable-guest-agent --enable-avx2 --enable-hax --enable-system --enable-linux-user --enable-libssh --enable-linux-aio --enable-linux-io-uring --enable-modules --enable-fuse --enable-fuse-lseek +``` +2. Run the above qemu command using vnc server. Connect to the VM desktop using `vncviewer :5900` where vncviewer is downloaded from [here](https://www.realvnc.com/en/connect/download/viewer/). +3. Try to copy and paste something in the terminal between host and guest. It doesn't work. +Additional information: +I'm following [this article](https://www.kraxel.org/blog/2021/05/qemu-cut-paste/) which says copy&paste is supported on vnc. diff --git a/results/classifier/105/vnc/772 b/results/classifier/105/vnc/772 new file mode 100644 index 000000000..893568393 --- /dev/null +++ b/results/classifier/105/vnc/772 @@ -0,0 +1,25 @@ +vnc: 0.939 +graphic: 0.910 +device: 0.881 +KVM: 0.875 +instruction: 0.849 +semantic: 0.808 +boot: 0.680 +socket: 0.586 +network: 0.546 +assembly: 0.533 +mistranslation: 0.530 +other: 0.265 + +Pop!_OS 20.10 host + RHEL 8.5 guest = Oh no! Something has gone wrong. +Description of problem: +Whenever starting the Qemu VM, there is an error covering the whole desktop "Oh no! Something has gone wrong. A problem has occurred and the system can't recover. Please log out and try again." 
After clicking the "Log Out" button and waiting for hours, the guest RHEL may or may not recover, based on your luck and other qemu options used. +Steps to reproduce: +1. Build qemu using the following `./configure` options: +``` +--prefix=$HOME/.bin --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-vte --enable-xkbcommon --enable-sdl --enable-spice --enable-spice-protocol --enable-virglrenderer --enable-opengl --enable-guest-agent --enable-avx2 --enable-avx512f --enable-hax --enable-system --enable-linux-user --enable-libssh --enable-linux-aio --enable-linux-io-uring --enable-modules --enable-gio --enable-fuse --enable-fuse-lseek +``` +2. Install Red Hat Enterprise Linux 8.5 in qemu +3. Run qemu using the above command line. +Additional information: + diff --git a/results/classifier/105/vnc/779 b/results/classifier/105/vnc/779 new file mode 100644 index 000000000..5391f0d51 --- /dev/null +++ b/results/classifier/105/vnc/779 @@ -0,0 +1,26 @@ +vnc: 0.961 +socket: 0.953 +network: 0.919 +graphic: 0.868 +device: 0.841 +instruction: 0.775 +mistranslation: 0.667 +boot: 0.638 +semantic: 0.578 +KVM: 0.494 +other: 0.096 +assembly: 0.056 + +VNC server not work +Description of problem: +I've created a sandbox guest with kata containers. The VM started successfully, but the VNC server does not listen on the unix socket. 
+ +`root@bootstrap02:~# netstat -anp | grep 1989153` +`unix 3 [ ] STREAM CONNECTED 369610592 1989153/qemu-system /run/vc/vm/bash/qmp.sock` +`root@bootstrap02:~# lsof -p 1989153 | grep unix` +`qemu-syst 1989153 root 108u unix 0xffff912740d3b800 0t0 369610592 /run/vc/vm/bash/qmp.sock type=STREAM` +Steps to reproduce: +1.Create Linux sandbox guest VM +2.connect vnc server +Additional information: + diff --git a/results/classifier/105/vnc/80570214 b/results/classifier/105/vnc/80570214 new file mode 100644 index 000000000..8b1165d31 --- /dev/null +++ b/results/classifier/105/vnc/80570214 @@ -0,0 +1,408 @@ +vnc: 0.983 +assembly: 0.979 +semantic: 0.978 +instruction: 0.978 +other: 0.978 +graphic: 0.978 +network: 0.975 +socket: 0.975 +device: 0.974 +mistranslation: 0.973 +KVM: 0.971 +boot: 0.969 + +[Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user + +Hi, + +We catch a segfault in our project. + +Qemu version is 2.3.0 + +The Stack backtrace is: +(gdb) bt +#0 0x0000000000000000 in ?? 
() +#1 0x00007f7ad9280b2f in qemu_deliver_packet (sender=<optimized out>, flags=<optimized +out>, data=<optimized out>, size=100, opaque= + 0x7f7ad9d6db10) at net/net.c:510 +#2 0x00007f7ad92831fa in qemu_net_queue_deliver (size=<optimized out>, data=<optimized +out>, flags=<optimized out>, + sender=<optimized out>, queue=<optimized out>) at net/queue.c:157 +#3 qemu_net_queue_flush (queue=0x7f7ad9d39630) at net/queue.c:254 +#4 0x00007f7ad9280dac in qemu_flush_or_purge_queued_packets +(nc=0x7f7ad9d6db10, purge=true) at net/net.c:539 +#5 0x00007f7ad9280e76 in net_vm_change_state_handler (opaque=<optimized out>, +running=<optimized out>, state=100) at net/net.c:1214 +#6 0x00007f7ad915612f in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) +at vl.c:1820 +#7 0x00007f7ad906db1a in do_vm_stop (state=<optimized out>) at +/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:631 +#8 vm_stop (state=RUN_STATE_SHUTDOWN) at +/usr/src/packages/BUILD/qemu-kvm-2.3.0/cpus.c:1325 +#9 0x00007f7ad915e4a2 in main_loop_should_exit () at vl.c:2080 +#10 main_loop () at vl.c:2131 +#11 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at +vl.c:4721 +(gdb) p *(NetClientState *)0x7f7ad9d6db10 +$1 = {info = 0x7f7ad9824520, link_down = 0, next = {tqe_next = 0x7f7ad0f06d10, +tqe_prev = 0x7f7ad98b1cf0}, peer = 0x7f7ad0f06d10, + incoming_queue = 0x7f7ad9d39630, model = 0x7f7ad9d39590 "vhost_user", name = +0x7f7ad9d39570 "hostnet0", info_str = + "vhost-user to charnet0", '\000' <repeats 233 times>, receive_disabled = 0, +destructor = + 0x7f7ad92821f0 <qemu_net_client_destructor>, queue_index = 0, +rxfilter_notify_enabled = 0} +(gdb) p *(NetClientInfo *)0x7f7ad9824520 +$2 = {type = NET_CLIENT_OPTIONS_KIND_VHOST_USER, size = 360, receive = 0, +receive_raw = 0, receive_iov = 0, can_receive = 0, cleanup = + 0x7f7ad9288850 <vhost_user_cleanup>, link_status_changed = 0, +query_rx_filter = 0, poll = 0, has_ufo = + 0x7f7ad92886d0 <vhost_user_has_ufo>, has_vnet_hdr = 0x7f7ad9288670 
+<vhost_user_has_vnet_hdr>, has_vnet_hdr_len = 0, + using_vnet_hdr = 0, set_offload = 0, set_vnet_hdr_len = 0} +(gdb) + +The corresponding codes where gdb reports error are: (We have added some codes +in net.c) +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is 510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two callback +functions seem to be always NULL, +Why we can come here ? +Is it an error to add VM state change handler for vhost-user ? + +Thanks, +zhanghailiang + +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +> +The corresponding codes where gdb reports error are: (We have added some +> +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions to do so? + +> +ssize_t qemu_deliver_packet(NetClientState *sender, +> +unsigned flags, +> +const uint8_t *data, +> +size_t size, +> +void *opaque) +> +{ +> +NetClientState *nc = opaque; +> +ssize_t ret; +> +> +if (nc->link_down) { +> +return size; +> +} +> +> +if (nc->receive_disabled) { +> +return 0; +> +} +> +> +if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { +> +ret = nc->info->receive_raw(nc, data, size); +> +} else { +> +ret = nc->info->receive(nc, data, size); ----> Here is 510 line +> +} +> +> +I'm not quite familiar with vhost-user, but for vhost-user, these two +> +callback functions seem to be always NULL, +> +Why we can come here ? 
+You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) + +-- +Marc-André Lureau + +On 2015/11/3 22:54, Marc-André Lureau wrote: +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +The corresponding codes where gdb reports error are: (We have added some +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions to do so? +OK, i will try to do it. There is nothing special, we run iperf tool in VM, +and then shutdown or reboot it. There is change you can catch segfault. +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is 510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two +callback functions seem to be always NULL, +Why we can come here ? +You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) +I have looked at the newest codes, i think we can still have chance to +come here, since we will change nc->receive_disable to false temporarily in +qemu_flush_or_purge_queued_packets(), there is no difference between 2.3 and 2.5 +for this. +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +in qemu_net_queue_flush() for vhost-user ? + +i will try to reproduce it by using newest qemu. 
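The sequence discussed in this message (a client whose receive callback is NULL, protected only by `receive_disabled`, which the flush path clears before delivering queued packets) can be reproduced in a small model. This is an illustrative Python sketch, not QEMU's net.c: `NetClient`, `deliver_packet` and `flush_queued_packets` are hypothetical stand-ins, and the `None` check is one possible guard rather than the fix adopted upstream.

```python
from collections import deque
from typing import Callable, Deque, Optional


class NetClient:
    """Toy stand-in for NetClientState; hypothetical, not QEMU code."""

    def __init__(self, receive: Optional[Callable[[bytes], int]]) -> None:
        self.receive = receive           # None models a NULL info->receive
        self.receive_disabled = True     # the only protection in the report
        self.queue: Deque[bytes] = deque()


def deliver_packet(nc: NetClient, data: bytes) -> int:
    """Return the number of bytes consumed, or 0 if delivery is refused."""
    if nc.receive_disabled:
        return 0
    if nc.receive is None:
        # Hypothetical guard: without it, this call is the Python analogue
        # of jumping to address 0 in frame #0 of the backtrace.
        return len(data)                 # silently drop the packet
    return nc.receive(data)


def flush_queued_packets(nc: NetClient) -> int:
    """Deliver queued packets; returns how many left the queue."""
    nc.receive_disabled = False          # cleared before flushing, as the thread describes
    delivered = 0
    while nc.queue:
        if deliver_packet(nc, nc.queue[0]) == 0:
            break                        # receiver refused; keep the packet queued
        nc.queue.popleft()
        delivered += 1
    return delivered
```

With the guard removed, any packet that was queued for a callback-less client would make `flush_queued_packets` call `None`, which is why the open question in the thread is how a packet got queued for vhost-user in the first place.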
+ +Thanks, +zhanghailiang + +On 11/04/2015 10:24 AM, zhanghailiang wrote: +> +On 2015/11/3 22:54, Marc-André Lureau wrote: +> +> Hi +> +> +> +> On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +> +> <address@hidden> wrote: +> +>> The corresponding codes where gdb reports error are: (We have added +> +>> some +> +>> codes in net.c) +> +> +> +> Can you reproduce with unmodified qemu? Could you give instructions +> +> to do so? +> +> +> +> +OK, I will try to do it. There is nothing special: we run the iperf tool +> +in a VM, +> +and then shut down or reboot it. There is a chance you can catch the segfault. +> +> +>> ssize_t qemu_deliver_packet(NetClientState *sender, +> +>> unsigned flags, +> +>> const uint8_t *data, +> +>> size_t size, +> +>> void *opaque) +> +>> { +> +>> NetClientState *nc = opaque; +> +>> ssize_t ret; +> +>> +> +>> if (nc->link_down) { +> +>> return size; +> +>> } +> +>> +> +>> if (nc->receive_disabled) { +> +>> return 0; +> +>> } +> +>> +> +>> if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { +> +>> ret = nc->info->receive_raw(nc, data, size); +> +>> } else { +> +>> ret = nc->info->receive(nc, data, size); ----> Here is +> +>> 510 line +> +>> } +> +>> +> +>> I'm not quite familiar with vhost-user, but for vhost-user, these two +> +>> callback functions seem to be always NULL, +> +>> Why we can come here ? +> +> +> +> You should not come here, vhost-user has nc->receive_disabled (it +> +> changes in 2.5) +> +> +> +> +I have looked at the newest code, and I think we still have a chance to +> +come here, since we will change nc->receive_disabled to false +> +temporarily in +> +qemu_flush_or_purge_queued_packets(), and there is no difference between +> +2.3 and 2.5 +> +for this. +> +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +> +in qemu_net_queue_flush() for vhost-user ? +The only thing I can imagine is self-announcing. Are you trying to do +migration? 2.5 only supports sending RARP through this. 
+ +And it's better to have a breakpoint to see why a packet was queued for +vhost-user. The stack trace may also help in this case. + +> +> +I will try to reproduce it using the newest qemu. +> +> +Thanks, +> +zhanghailiang +> + +On 2015/11/4 11:19, Jason Wang wrote: +On 11/04/2015 10:24 AM, zhanghailiang wrote: +On 2015/11/3 22:54, Marc-André Lureau wrote: +Hi + +On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang +<address@hidden> wrote: +The corresponding codes where gdb reports error are: (We have added +some +codes in net.c) +Can you reproduce with unmodified qemu? Could you give instructions +to do so? +OK, I will try to do it. There is nothing special: we run the iperf tool +in a VM, +and then shut down or reboot it. There is a chance you can catch the segfault. +ssize_t qemu_deliver_packet(NetClientState *sender, + unsigned flags, + const uint8_t *data, + size_t size, + void *opaque) +{ + NetClientState *nc = opaque; + ssize_t ret; + + if (nc->link_down) { + return size; + } + + if (nc->receive_disabled) { + return 0; + } + + if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) { + ret = nc->info->receive_raw(nc, data, size); + } else { + ret = nc->info->receive(nc, data, size); ----> Here is +510 line + } + +I'm not quite familiar with vhost-user, but for vhost-user, these two +callback functions seem to be always NULL, +Why we can come here ? +You should not come here, vhost-user has nc->receive_disabled (it +changes in 2.5) +I have looked at the newest code, and I think we still have a chance to +come here, since we will change nc->receive_disabled to false +temporarily in +qemu_flush_or_purge_queued_packets(), and there is no difference between +2.3 and 2.5 +for this. +Besides, is it possible for !QTAILQ_EMPTY(&queue->packets) to be true +in qemu_net_queue_flush() for vhost-user ? +The only thing I can imagine is self-announcing. Are you trying to do +migration? 2.5 only supports sending RARP through this. 
+Hmm, it's not triggered by migration. For qemu-2.5, IMHO, it doesn't have such a +problem, +since the callback function 'receive' is not NULL: it is vhost_user_receive(). +And it's better to have a breakpoint to see why a packet was queued for +vhost-user. The stack trace may also help in this case. +OK, I'm trying to reproduce it. + +Thanks, +zhanghailiang +I will try to reproduce it using the newest qemu. + +Thanks, +zhanghailiang +. + diff --git a/results/classifier/105/vnc/824074 b/results/classifier/105/vnc/824074 new file mode 100644 index 000000000..814bb68fc --- /dev/null +++ b/results/classifier/105/vnc/824074 @@ -0,0 +1,24 @@ +vnc: 0.981 +network: 0.767 +graphic: 0.760 +device: 0.756 +mistranslation: 0.526 +other: 0.523 +socket: 0.484 +instruction: 0.434 +semantic: 0.413 +boot: 0.261 +KVM: 0.146 +assembly: 0.111 + +Provide runtime option to expose the supported list of keymaps for vnc + +As discussed in the ganeti group[1], I'm opening this bug to request that qemu provide a runtime command or switch to list the supported keymaps for vnc. + + [1] - http://groups.google.com/group/ganeti/browse_thread/thread/dd524f5311d8d79e + +The QEMU project is currently considering moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. +If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. + +[Expired for QEMU because there has been no activity for 60 days.] 
+ diff --git a/results/classifier/105/vnc/850 b/results/classifier/105/vnc/850 new file mode 100644 index 000000000..bf0929842 --- /dev/null +++ b/results/classifier/105/vnc/850 @@ -0,0 +1,72 @@ +vnc: 0.697 +KVM: 0.694 +graphic: 0.691 +other: 0.690 +device: 0.683 +mistranslation: 0.659 +network: 0.622 +socket: 0.607 +instruction: 0.599 +assembly: 0.587 +semantic: 0.584 +boot: 0.541 + +virtio-gpu: bogus descriptor or out of resources +Description of problem: +The guest I use has 1GB of memory and 8GB of swap. When I open a lot of applications in the guest, the guest kernel starts using swap, and after some time I get this error: + +<code> +qemu-system-x86_64: virtio: bogus descriptor or out of resources +</code> + +I tried to see which virtio device is causing this issue, and it seems to be happening in "virtio-gpu". I modified the sources and added this line to see the device name: + +virtio.c:1312: virtio_error(vdev, "virtio: %s: bogus descriptor or out of resources", vdev->name); +Steps to reproduce: +1. create a vm with 8GB swap +2. run that vm with the above-mentioned command line (memory = 1GB) +3. open huge applications which eat ram in the guest +Additional information: +It seems the condition "if (!memory_access_is_direct(mr, is_write))" [physmem.c:1385] suddenly becomes true. This is the stack trace when the line "if (qatomic_xchg(&bounce.in_use, true)) {" [physmem.c:1386] gets hit for the first time: + +<code> +#0 address_space_map (as=<optimized out>, addr=addr@entry=45251811299328, plen=plen@entry=0x7fffffff7e30, is_write=is_write@entry=false, attrs=..., attrs@entry=...) 
at ../qemu-6.2.0/softmmu/physmem.c:3186 +#1 0x0000555555cb8cf4 in dma_memory_map (dir=DMA_DIRECTION_TO_DEVICE, len=<synthetic pointer>, addr=45251811299328, as=<optimized out>) at /home/mohan/Downloads/qemu/src/qemu-6.2.0/include/sysemu/dma.h:202 +#2 virtqueue_map_desc + (vdev=vdev@entry=0x5555579d3bb0, p_num_sg=p_num_sg@entry=0x7fffffff7ed8, addr=addr@entry=0x7fffffff7f70, iov=0x7fffffff9f70, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=45251811299328, sz=65536) at ../qemu-6.2.0/hw/virtio/virtio.c:1307 +#3 0x0000555555cb8f9e in virtqueue_packed_pop (vq=<optimized out>, sz=<optimized out>) at ../qemu-6.2.0/hw/virtio/virtio.c:1624 +#4 0x00007fffec0b329e in virtio_gpu_gl_handle_ctrl (vdev=<optimized out>, vq=0x7fffdced6010) at ../qemu-6.2.0/hw/display/virtio-gpu-gl.c:77 +#5 0x0000555555f74134 in aio_bh_call (bh=0x555556d02bc0) at ../qemu-6.2.0/util/async.c:141 +#6 aio_bh_poll (ctx=ctx@entry=0x555556958750) at ../qemu-6.2.0/util/async.c:169 +#7 0x0000555555f5f784 in aio_dispatch (ctx=0x555556958750) at ../qemu-6.2.0/util/aio-posix.c:381 +#8 0x0000555555f73d63 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../qemu-6.2.0/util/async.c:311 +#9 0x00007ffff787dfd3 in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0 +#10 0x0000555555f80129 in glib_pollfds_poll () at ../qemu-6.2.0/util/main-loop.c:232 +#11 os_host_main_loop_wait (timeout=0) at ../qemu-6.2.0/util/main-loop.c:255 +#12 main_loop_wait (nonblocking=nonblocking@entry=0) at ../qemu-6.2.0/util/main-loop.c:531 +#13 0x0000555555c48fe5 in qemu_main_loop () at ../qemu-6.2.0/softmmu/runstate.c:726 +#14 0x000055555597b664 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../qemu-6.2.0/softmmu/main.c:50 +</code> +<br/> +address_space_map() returns valid pointer in the first hit, but it returns NULL on the second hit because qatomic_xchg(bounce.in_use, true) returns true, I think it should suppose to return false. 
this is the stack trace when it happens for the second time +<br/> +<code> +#0 address_space_map (as=<optimized out>, addr=addr@entry=45251811303424, plen=plen@entry=0x7fffffff7e30, is_write=is_write@entry=false, attrs=..., attrs@entry=...) at ../qemu-6.2.0/softmmu/physmem.c:3186 +#1 0x0000555555cb8cf4 in dma_memory_map (dir=DMA_DIRECTION_TO_DEVICE, len=<synthetic pointer>, addr=45251811303424, as=<optimized out>) at /home/mohan/Downloads/qemu/src/qemu-6.2.0/include/sysemu/dma.h:202 +#2 virtqueue_map_desc + (vdev=vdev@entry=0x5555579d3bb0, p_num_sg=p_num_sg@entry=0x7fffffff7ed8, addr=addr@entry=0x7fffffff7f70, iov=0x7fffffff9f70, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=45251811303424, sz=61440) at ../qemu-6.2.0/hw/virtio/virtio.c:1307 +#3 0x0000555555cb8f9e in virtqueue_packed_pop (vq=<optimized out>, sz=<optimized out>) at ../qemu-6.2.0/hw/virtio/virtio.c:1624 +#4 0x00007fffec0b329e in virtio_gpu_gl_handle_ctrl (vdev=<optimized out>, vq=0x7fffdced6010) at ../qemu-6.2.0/hw/display/virtio-gpu-gl.c:77 +#5 0x0000555555f74134 in aio_bh_call (bh=0x555556d02bc0) at ../qemu-6.2.0/util/async.c:141 +#6 aio_bh_poll (ctx=ctx@entry=0x555556958750) at ../qemu-6.2.0/util/async.c:169 +#7 0x0000555555f5f784 in aio_dispatch (ctx=0x555556958750) at ../qemu-6.2.0/util/aio-posix.c:381 +#8 0x0000555555f73d63 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../qemu-6.2.0/util/async.c:311 +#9 0x00007ffff787dfd3 in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0 +#10 0x0000555555f80129 in glib_pollfds_poll () at ../qemu-6.2.0/util/main-loop.c:232 +#11 os_host_main_loop_wait (timeout=0) at ../qemu-6.2.0/util/main-loop.c:255 +#12 main_loop_wait (nonblocking=nonblocking@entry=0) at ../qemu-6.2.0/util/main-loop.c:531 +#13 0x0000555555c48fe5 in qemu_main_loop () at ../qemu-6.2.0/softmmu/runstate.c:726 +#14 0x000055555597b664 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at 
../qemu-6.2.0/softmmu/main.c:50 +</code> +<br/> +It seems virtqueue_packed_pop() receives one desc with desc.len=65536 (or -1), which is not supposed to happen. I don't know why this is happening diff --git a/results/classifier/105/vnc/854 b/results/classifier/105/vnc/854 new file mode 100644 index 000000000..13ac6e23e --- /dev/null +++ b/results/classifier/105/vnc/854 @@ -0,0 +1,75 @@ +vnc: 0.861 +other: 0.858 +device: 0.851 +instruction: 0.847 +assembly: 0.843 +semantic: 0.837 +network: 0.832 +graphic: 0.829 +KVM: 0.829 +socket: 0.793 +mistranslation: 0.774 +boot: 0.746 + +rsync to ext4-fs on dynamic expanding qcow2 fails +Description of problem: +Firstly, this issue does not seem to happen when the virtual-disk is dd-raw-img or fixed qcow2 (preallocation=falloc). The guest-kernel has multiple tracebacks during rsync to dst folder on ext4-fs on qcow2. +I ctrl-C-ed the rsync process after the first traceback, which happened after copying around 52 GiB. +On a previous run, wherein I had let it continue, somewhere near the end, around 83 GiB, dmesg would bloat with a zillion trace-backs and stall. The sha256sum verify seems to have succeeded for all files copied so far and correctly gives error "Failed open or read" on subsequent files that were not copied. +In this test, the partial-rsync completed files were not corrupted. However, as qemu's disk emulation allocates blocks, qemu may be inducing paging-bugs into the guest-kernel. Paging issues like these may also lead to corruption. The guest-kernel should see the same full emulated disk regardless of whether qemu provided a fixed disk, dynamic disk, or even a different type of virtual-disk-format. The guest-vm should not detect/perceive any difference between them. + +There may be upcoming trouble round the 5.17 corner. 
+ +It is beyond me to figure out if this is due to +* qemu-6.2 block code +* guest-kernel ( kernel-5.17 folio/page management or ntfs3 driver or something else ) + +It may be necessary to ascertain if this is a new bug on account of qemu not being ready for folio type page-management or a bug in upstream kernel.org. My apologies in advance if it turns out that this is not a qemu bug. + +There does seem to be some problem with qemu dealing with expanding virtual disks, with bugs that show up only if the underlying virtual-disk is dynamic and expanding. + +I just think that storage/block-code should be made rock solid with a much higher priority than adding new features. +If storage code is undependable, then qemu/vm cannot be used, and there is no point in any other feature. qcow2 in particular is qemu's native virtual-disk format. + +I had to stop testing on Issue #727, Issue #814, on account of what I thought was a bug in 5.15 kernels. I filed the bug as "fs/ntfs3: page_cache_ra_unbounded on rsync from ntfs3 to ext4" https://bugzilla.kernel.org/show_bug.cgi?id=215460 . I assume that bug is different because it happens even on a raw image. 
+ +setup is as follows: +- Host: Fedora-35 with kernel-5.17.0-0.rc2.83.fc35.x86_64 self-built from srpm ( https://koji.fedoraproject.org/koji/buildinfo?buildID=1910212 ) +- Guest: Fedora-Workstation-Live-x86_64-Rawhide-20220201.n.0.iso with 5.17.0-0.rc2.83.fc36.x86_64 ( https://koji.fedoraproject.org/koji/buildinfo?buildID=1910892 ) +- qemu: 6.2.0 (qemu-6.2.0-2.fc35.1) self-built from srpm ( https://koji.fedoraproject.org/koji/buildinfo?buildID=1897713 ) +- hda: qcow2(dyn) with ext4 and also 4 combinations of raw_img/fixed_qcow2 with ext4/ntfs3 +- hdb: vhdx, ntfs3 (pre-prepared sgdata https://gitlab.com/qemu-project/qemu/-/issues/727#note_739930694 ) + +qcow2 image is created as follows: +``` +[root@sirius ~]# qemu-img create -f qcow2 /mnt/a16/gkpics01.qcow2 99723771904 +Formatting '/mnt/a16/gkpics01.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=99723771904 lazy_refcounts=off refcount_bits=16 +``` + +qemu command is as follows: +``` +[root@sirius ~]# qemu-system-x86_64 -cpu qemu64 -m 4096 -machine "type=q35" -accel "kvm" -smp "sockets=1,cores=8,threads=1" -boot "d" -cdrom "/vol/15KJ_Images/transcend/Fedora-Workstation-Live-x86_64-Rawhide-20220201.n.0.iso" -hda "/mnt/a16/gkpics01.raw" -hdb "/vol/15KJ_Images/test/sgdata.vhdx" -device "virtio-vga" -display "gtk,gl=on" -rtc "base=utc" -net "user" -device "virtio-net,netdev=vmnic" -netdev "user,id=vmnic,net=192.168.20.0/24,dns=192.168.20.3,dhcpstart=192.168.20.15" +``` +Steps to reproduce: +1. Inside booted vm, use gdisk to partition /dev/sda1 if necessary +2. ```dmesg -w (in another pty)``` +3. ```mkfs.ext4 /dev/sda1 -L fs_gkpics001``` +4. ```mkdir /mnt/a /mnt/b``` +5. ```mount -t ext4 /dev/sda1 /mnt/a``` +6. ```mount -t ntfs3 /dev/sdb2 /mnt/b``` +7. rsync testdata: ```(sdate=`date` ; echo "$sdate" ; cd /mnt/b ; rsync -avH ./photos001 /mnt/a | tee /tmp/rst.txt ; echo "$sdate" ; date )``` +8. ```umount /mnt/a ; ``` +9. ```mount -t ext4 /dev/sda1 /mnt/a``` +10. 
verify: ```(sdate=`date` ; echo "$sdate" ; cd /mnt/a/photos001 ; sha256sum -c ./find.CHECKSUM --quiet ; echo "$sdate" ; date )``` +11. ```umount /mnt/a ; umount /mnt/b;``` +Additional information: +**Test attempts** +- Bug does not happen with 5.17.0-0.rc2.83/qemu-6.2.0-2/5.17.0-0.rc2.83/ExFAT/rawimg/ext4 with vhdx/ntfs3/sgdata +- Bug does not happen with 5.17.0-0.rc2.83/qemu-6.2.0-2/5.17.0-0.rc2.83/ExFAT/rawimg/ntfs3 with vhdx/ntfs3/sgdata +- Bug does not happen with 5.17.0-0.rc2.83/qemu-6.2.0-2/5.17.0-0.rc2.83/ExFAT/qcow2(fixed)/ext4 with vhdx/ntfs3/sgdata +- Bug does not happen with 5.17.0-0.rc2.83/qemu-6.2.0-2/5.17.0-0.rc2.83/ExFAT/qcow2(fixed)/ntfs3 with vhdx/ntfs3/sgdata +- Bug **does** happen with 5.17.0-0.rc2.83/qemu-6.2.0-2/5.17.0-0.rc2.83/ExFAT/**qcow2(dyn)**/ext4 with vhdx/ntfs3/sgdata +- Bug does not happen directly on Host with 5.17.0-0.rc2.83/ExFat with ntfs3/sgdata +- Bug does not happen directly on Host with 5.17.0-0.rc2.83/ntfs3 with ntfs3/sgdata + +Also filed a linux-kernel bug titled "during rsync, vm guest kernel trace arising from memcg_kmem_charge_page alloc_pages" https://bugzilla.kernel.org/show_bug.cgi?id=215563 diff --git a/results/classifier/105/vnc/974229 b/results/classifier/105/vnc/974229 new file mode 100644 index 000000000..2a58bc2a8 --- /dev/null +++ b/results/classifier/105/vnc/974229 @@ -0,0 +1,140 @@ +vnc: 0.917 +graphic: 0.886 +semantic: 0.875 +KVM: 0.864 +instruction: 0.842 +device: 0.841 +assembly: 0.808 +other: 0.773 +mistranslation: 0.768 +network: 0.713 +socket: 0.654 +boot: 0.654 + +qemu-kvm-1.0: segfault using vnc-console => not threadsafe! + +after failure using qemu-kvm-0.14.1 I've tried v1.0, but there's a problem if compiled with vnc-thread-support: + +Program received signal SIGSEGV, Segmentation fault. +0x0000000000000000 in ?? () +(gdb) bt +#0 0x0000000000000000 in ?? 
() +#1 0x00007f3ac48ca10a in qemu_iohandler_poll (readfds=0x7fff12379ac0, writefds=0x7fff12379b40, xfds=0x7fff12379bc0, ret=3) + at iohandler.c:124 +#2 0x00007f3ac4964387 in main_loop_wait (nonblocking=0) at main-loop.c:463 +#3 0x00007f3ac4958fb1 in main_loop () at /opt/workspace/oneiric64/qemu-kvm-1.0/vl.c:1482 +#4 0x00007f3ac495e1ec in main (argc=68, argv=0x7fff1237a088, envp=0x7fff1237a2b0) + at /opt/workspace/oneiric64/qemu-kvm-1.0/vl.c:3523 +(gdb) up +#1 0x00007f3ac48ca10a in qemu_iohandler_poll (readfds=0x7fff12379ac0, writefds=0x7fff12379b40, xfds=0x7fff12379bc0, ret=3) + at iohandler.c:124 +124 ioh->fd_write(ioh->opaque); + +(gdb) print *ioh +$4 = {fd = 29, fd_read_poll = 0, fd_read = 0x7f3ac49de158 <vnc_client_read>, fd_write = 0, deleted = 0, + opaque = 0x7f3ac7978d50, next = {le_next = 0x7f3ac6add2e0, le_prev = 0x7f3ac52bde90}} + + +ok, how could that happen? +looking deeper at the code and backtraces shows that iohandler.c:124 is called within the main-loop, while iohandler.c:77 is called within the vnc-thread-loop + +mmmh, but where the hell is the threadsafe-locking of the ioh-structure???? + +I didn't find anything... + +the resetting in line 77 is called from vnc_client_write_plain(), where the following code can be found: + +=================== + if (vs->output.offset == 0) { + qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, NULL, vs); + } +=================== + +why should the function-ptrs be zeroed? + +further tracing shows that the vnc-thread sometimes seems to exit normally and a new one is started (I haven't verified that), but this would be a reason for zeroing function-ptrs, which may point to code inside the thread, which will exit... + +but why should this be done? and why is there no threadsafe-modification of the structure? + +well: disabling vnc-thread at configure-state leads to a normally running machine, but threading would be nice here... 
+ +so a quick fix could be to drop the call above and keep the vnc-thread alive for the whole session, but I don't know all the mechanisms of vnc-support within kvm. +but a better solution would be the use of clean locking-mechanisms + +Thanks for reporting this bug. + +Since vnc-thread-support is not compiled into the qemu-kvm package, the bug is invalid there. I will mark it as affecting the upstream QEMU project. Note you'll want to use git://git.qemu.org/qemu.git as the upstream code base. + +puh, it did take a while, but meanwhile another segfault has occurred, which has nothing to do with the one above. due to the long time it took to happen, it might not be as reproducible as needed for efficient debugging; at least I currently have no further time for this. I'll now try V0.15.1 and hope it will work well for me. + +some gdb-info of the current segfault, if there's someone who wants to have a look at it: + +Program received signal SIGSEGV, Segmentation fault. +[Switching to Thread 0x7fe086c46700 (LWP 30362)] +0x00007fe08c0639fc in ?? () from /lib/x86_64-linux-gnu/libc.so.6 + +(gdb) thread apply all bt + +Thread 6 (Thread 0x7fdfa2ecf700 (LWP 30793)): +#0 0x00007fe08c3963cb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0 +#1 0x00007fe08f581c79 in cond_timedwait (cond=0x7fe08fed6b20, mutex=0x7fe08fed6ae0, ts=0x7fdfa2ecee10) + at posix-aio-compat.c:104 +#2 0x00007fe08f5823f0 in aio_thread (unused=0x0) at posix-aio-compat.c:334 +#3 0x00007fe08c391efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 +#4 0x00007fe08c0cc59d in __cmsg_nxthdr () from /lib/x86_64-linux-gnu/libc.so.6 +#5 0x0000000000000000 in ?? 
() + +Thread 5 (Thread 0x7fe087648700 (LWP 30361)): +#0 0x00007fe08c399a73 in pwrite64 () from /lib/x86_64-linux-gnu/libpthread.so.0 +#1 0x00007fe08f58201f in handle_aiocb_rw_linear (aiocb=0x7fe093c98e50, + buf=0x7fe093e05600 "\004\063\377\211t$\b\213\064$\213\034\272G\205\333\017\204\246") at posix-aio-compat.c:216 +#2 0x00007fe08f58212d in handle_aiocb_rw (aiocb=0x7fe093c98e50) at posix-aio-compat.c:251 +#3 0x00007fe08f582573 in aio_thread (unused=0x0) at posix-aio-compat.c:362 +#4 0x00007fe08c391efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 +#5 0x00007fe08c0cc59d in __cmsg_nxthdr () from /lib/x86_64-linux-gnu/libc.so.6 +#6 0x0000000000000000 in ?? () + +Thread 4 (Thread 0x7fe086c46700 (LWP 30362)): +#0 0x00007fe08c0639fc in ?? () from /lib/x86_64-linux-gnu/libc.so.6 +#1 0x00007fe08c3851c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 +#2 0x00007fe086c45990 in ?? () +#3 0x00007fe08c39bc20 in ?? () from /lib/x86_64-linux-gnu/libpthread.so.0 +#4 0x00007fe086c469c0 in ?? () +#5 0x0000000000000000 in ?? () + +Thread 3 (Thread 0x7fe086445700 (LWP 30363)): +#0 0x00007fe08c0c4747 in getmntent_r () from /lib/x86_64-linux-gnu/libc.so.6 +#1 0x0000000000000000 in ?? () + +Thread 2 (Thread 0x7fdfa36d0700 (LWP 30388)): +#0 0x00007fe08c3963cb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0 +#1 0x00007fe08f581c79 in cond_timedwait (cond=0x7fe08fed6b20, mutex=0x7fe08fed6ae0, ts=0x7fdfa36cfe10) + at posix-aio-compat.c:104 +#2 0x00007fe08f5823f0 in aio_thread (unused=0x0) at posix-aio-compat.c:334 +#3 0x00007fe08c391efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 +#4 0x00007fe08c0cc59d in __cmsg_nxthdr () from /lib/x86_64-linux-gnu/libc.so.6 +#5 0x0000000000000000 in ?? () + +Thread 1 (Thread 0x7fe08f3cd7a0 (LWP 30158)): +#0 0x00007fe08c0c5613 in getttyent () from /lib/x86_64-linux-gnu/libc.so.6 +#1 0x00000000000f4140 in ?? () +#2 0x00007fff62c7e3d0 in ?? () +#3 0x0000001d8f675e5c in ?? 
() +#4 0x00000000000003e8 in ?? () +#5 0x0000000062c7e3c0 in ?? () +#6 0x0000000062c7e440 in ?? () +#7 0x0000000162c7e4c0 in ?? () +#8 0x000000003c080980 in ?? () +#9 0x0000000000000000 in ?? () +(gdb) + +all done on an ubuntu-11.10 64bit; the last configure-options were: +'./configure' '--target-list=x86_64-softmmu i386-softmmu x86_64-linux-user i386-linux-user' '--prefix=/usr' '--interp-prefix=/etc/qemu-binfmt/%M' '--disable-blobs' '--disable-strip' '--audio-drv-list=pa,alsa,sdl,oss' '--enable-vnc-sasl' '--enable-docs' '--enable-vhost-net' '--enable-attr' '--enable-linux-aio' '--enable-uuid' '--enable-nptl' '--enable-kvm-device-assignment' '--enable-kvm-pit' '--enable-kvm' '--enable-curses' '--enable-vnc-png' '--enable-vnc-tls' '--audio-card-list=ac97,es1370,sb16,cs4231a,adlib,gus,hda' '--enable-user' '--enable-system' '--enable-linux-user' '--enable-bsd-user' '--enable-guest-base' '--enable-darwin-user' --enable-debug + +the segfault occurs while installing a larger app within winxp+sp3 near the possible end of setup + + +The QEMU versions mentioned in this ticket are quite old already ... can you still reproduce this with the latest version of QEMU? If so, please also provide the exact command line parameters that you used to start QEMU, and the steps you took afterwards to get to the crash? Thanks! + +[Expired for QEMU because there has been no activity for 60 days.] + diff --git a/results/classifier/105/vnc/981 b/results/classifier/105/vnc/981 new file mode 100644 index 000000000..653fc3783 --- /dev/null +++ b/results/classifier/105/vnc/981 @@ -0,0 +1,23 @@ +vnc: 0.925 +graphic: 0.899 +socket: 0.878 +network: 0.797 +device: 0.770 +instruction: 0.757 +mistranslation: 0.714 +semantic: 0.500 +other: 0.213 +boot: 0.178 +assembly: 0.012 +KVM: 0.005 + +VNC UNIX sockets are not deleted +Description of problem: +After exiting QEMU, a unix VNC socket file is left behind. 
Upon termination I would expect it to remove the socket file, as it does, for example, with a monitor unix socket. +Steps to reproduce: +``` + rm -f foo.socket + qemu-system-x86_64 -vnc unix:foo.socket + # Exit QEMU + ls foo.socket + ``` diff --git a/results/classifier/105/vnc/994412 b/results/classifier/105/vnc/994412 new file mode 100644 index 000000000..c73c29ded --- /dev/null +++ b/results/classifier/105/vnc/994412 @@ -0,0 +1,24 @@ +vnc: 0.669 +graphic: 0.308 +mistranslation: 0.278 +network: 0.273 +semantic: 0.270 +device: 0.251 +other: 0.178 +instruction: 0.142 +socket: 0.141 +boot: 0.009 +assembly: 0.007 +KVM: 0.003 + +reverse vnc to unix domain sockets does not work + +I tried to connect to a unix domain socket, but failed. + +$ qemu -vnc unix:/tmp/my.sock,reverse +connect(unix:/tmp/my.sock,reverse): No such file or directory + +I guess it is because unix_connect does not remove the characters after the first comma. + +Looks like this should work nowadays (of course you need to start a listening program first), so closing this bug ticket now. + |
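For illustration only, the guess in the last report (that the text after the first comma ends up in the socket path) can be sketched as below. This is a minimal sketch and not QEMU code: the helper name unix_path_from_vnc_arg() and its return convention are hypothetical, and it only shows how the path part of a "unix:PATH[,option...]" argument could be separated from trailing options such as ",reverse".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from QEMU: copy the socket path out of a
 * "unix:PATH[,option...]" argument, stopping at the first comma.
 * Returns 0 on success, -1 if the "unix:" prefix is missing, the path is
 * empty, or the destination buffer is too small.
 */
int unix_path_from_vnc_arg(const char *arg, char *path, size_t path_len)
{
    static const char prefix[] = "unix:";
    const size_t prefix_len = sizeof(prefix) - 1;
    size_t len;

    if (strncmp(arg, prefix, prefix_len) != 0) {
        return -1;
    }
    arg += prefix_len;

    /* Length of PATH only; options after the first comma are ignored. */
    len = strcspn(arg, ",");
    if (len == 0 || len >= path_len) {
        return -1;
    }

    memcpy(path, arg, len);
    path[len] = '\0';
    return 0;
}
```

With the argument from the report, "unix:/tmp/my.sock,reverse", such a helper would hand only "/tmp/my.sock" to connect(), which is consistent with the reporter's reading of the "No such file or directory" error.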