Diffstat (limited to 'results/classifier/gemma3:12b/performance')
233 files changed, 6534 insertions, 0 deletions
diff --git a/results/classifier/gemma3:12b/performance/1004 b/results/classifier/gemma3:12b/performance/1004 new file mode 100644 index 00000000..0708a639 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1004 @@ -0,0 +1,10 @@ + +qemu-system-i386 pegs host CPU at 100% +Description of problem: +Before the guest OS even starts up, the host CPU pegs at 100%. +Steps to reproduce: +1. Start any VM using qemu-system-i386 +2. On Ubuntu use Virt Manager or the command line. +3. On macOS use UTM. +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/1018 b/results/classifier/gemma3:12b/performance/1018 new file mode 100644 index 00000000..8635b71f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1018 @@ -0,0 +1,24 @@ + +virtio-scsi-pci with iothread results in 100% CPU in qemu 7.0.0 +Description of problem: +Top reports constant 100% host CPU usage by `qemu-system-x86`. I have narrowed the issue down to the following section of the config: +``` + -object iothread,id=t0 \ + -device virtio-scsi-pci,iothread=t0,num_queues=4 \ +``` +If this is replaced by +``` + -device virtio-scsi-pci \ +``` +then CPU usage is normal (near 0%). + +This problem doesn't appear with qemu 6.2.0, where CPU usage is near 0% even with the iothread in the qemu options. +Steps to reproduce: +1. Download the Kubuntu 22.04 LTS ISO (https://cdimage.ubuntu.com/kubuntu/releases/22.04/release/kubuntu-22.04-desktop-amd64.iso), +2. Create a root virtual drive for the guest with 'qemu-img create -f qcow2 -o cluster_size=4k kubuntu.img 256G', +3. Start the guest with the config given above, +4. Connect to the guest (using spicy for example, password 'p'), select "try kubuntu" in the grub menu AND later in the GUI, let it boot to the Plasma desktop, and monitor host CPU usage using 'top'. 
+ +(there could be a faster way to reproduce it) +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/1026 b/results/classifier/gemma3:12b/performance/1026 new file mode 100644 index 00000000..0acc9dfa --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1026 @@ -0,0 +1,117 @@ + +Backup with large RBD disk is slow since QEMU 6.2.0 (since commit 0347a8fd) +Description of problem: +Since commit 0347a8fd4c ("block/rbd: implement bdrv_co_block_status"), there is a big slowdown for large RBD images for backup. +Steps to reproduce: +I used the following script +``` +root@pve701 ~ # cat rbdbackup.sh +#!/bin/bash +rbd create emptytestA -p rbdkvm --size $2 +rbd create emptytestB -p rbdkvm --size $2 +$1 \ + -qmp stdio \ + -drive file=rbd:rbdkvm/emptytestA:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbdkvm.keyring,if=none,id=driveA,format=raw \ + -drive file=rbd:rbdkvm/emptytestB:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbdkvm.keyring,if=none,id=driveB,format=raw \ +<<EOF +{"execute": "qmp_capabilities"} +{"execute": "blockdev-backup", + "arguments": { "device": "driveA", + "sync": "full", + "target": "driveB" } } +EOF +rbd -p rbdkvm rm emptytestA +rbd -p rbdkvm rm emptytestB +``` +with 200G and 500G images respectively and QEMU binaries built from current master (i.e. 10c2a0c5e7d48e590d945c017b5b8af5b4c89a3c) and from current master with fc176116cdea816ceb8dd969080b2b95f58edbc0, 9e302f64bb407a9bb097b626da97228c2654cfee and 0347a8fd4c3faaedf119be04c197804be40a384b reverted. + + +Timings: +``` +200G master: 92s +200G master+reverts: 57s +500G master: 526s +500G master+reverts: 142s +``` + +I checked how long a single call to `rbd_diff_iterate2()` in `block/rbd.c` takes, and it seems to take about linearly more time the bigger the image is. But it is also called linearly more often, resulting in about quadratic slowdown overall. 
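The quadratic behaviour described above — each `rbd_diff_iterate2()` call takes roughly linear time in the image size, and the number of calls also grows linearly — can be sketched with a toy cost model (the function, chunk size, and unit cost below are illustrative assumptions, not QEMU code):

```python
# Toy cost model for the reported slowdown: every block-status query
# costs time proportional to the image size, and the backup loop issues
# a number of queries proportional to the image size, so the total cost
# grows quadratically with the image size.

def total_cost(image_size_gb, chunk_gb=1, per_gb_cost=1):
    """Sum the per-call costs over all chunks of the image."""
    calls = image_size_gb // chunk_gb          # linearly more calls
    per_call = per_gb_cost * image_size_gb     # each call scans the whole image
    return calls * per_call

# Doubling the image size quadruples the modeled total cost,
# matching the roughly 92s -> 526s jump seen between 200G and 500G.
assert total_cost(400) == 4 * total_cost(200)
```

This is consistent with the measured timings above: the reverted build scales roughly linearly (57s to 142s), while master scales roughly quadratically (92s to 526s).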
+Additional information: +Full commands/output: +``` +root@pve701 ~ # ./rbdbackup.sh ./qemu-upstream/10c2a0c5e7d48e590d945c017b5b8af5b4c89a3c/qemu-system-x86_64 200G +{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 7}, "package": "v7.0.0-981-g10c2a0c5e7"}, "capabilities": ["oob"]}} +VNC server running on 127.0.0.1:5900 +{"return": {}} +{"timestamp": {"seconds": 1652695629, "microseconds": 651397}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "driveA"}} +{"timestamp": {"seconds": 1652695629, "microseconds": 651447}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"timestamp": {"seconds": 1652695629, "microseconds": 651464}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "driveA"}} +{"timestamp": {"seconds": 1652695629, "microseconds": 651490}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"return": {}} +{"timestamp": {"seconds": 1652695721, "microseconds": 415892}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "driveA"}} +{"timestamp": {"seconds": 1652695721, "microseconds": 416066}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "driveA"}} +{"timestamp": {"seconds": 1652695721, "microseconds": 416197}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "driveA", "len": 214748364800, "offset": 214748364800, "speed": 0, "type": "backup"}} +{"timestamp": {"seconds": 1652695721, "microseconds": 416239}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "driveA"}} +{"timestamp": {"seconds": 1652695721, "microseconds": 416265}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "driveA"}} +^Cqemu-system-x86_64: terminating on signal 2 +{"timestamp": {"seconds": 1652695727, "microseconds": 145031}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}} +Removing image: 100% complete...done. +Removing image: 100% complete...done. 
+./rbdbackup.sh 200G 81.15s user 6.31s system 89% cpu 1:38.21 total +root@pve701 ~ # ./rbdbackup.sh ./qemu-upstream/10c2a0c5e7d48e590d945c017b5b8af5b4c89a3c-with-rbd-reverts/qemu-system-x86_64 200G +{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 7}, "package": "v7.0.0-984-g20a19f8eae"}, "capabilities": ["oob"]}} +VNC server running on 127.0.0.1:5900 +{"return": {}} +{"timestamp": {"seconds": 1652695737, "microseconds": 444734}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "driveA"}} +{"timestamp": {"seconds": 1652695737, "microseconds": 444818}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"timestamp": {"seconds": 1652695737, "microseconds": 444860}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "driveA"}} +{"timestamp": {"seconds": 1652695737, "microseconds": 444885}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"return": {}} +{"timestamp": {"seconds": 1652695794, "microseconds": 437168}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "driveA"}} +{"timestamp": {"seconds": 1652695794, "microseconds": 437248}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "driveA"}} +{"timestamp": {"seconds": 1652695794, "microseconds": 437341}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "driveA", "len": 214748364800, "offset": 214748364800, "speed": 0, "type": "backup"}} +{"timestamp": {"seconds": 1652695794, "microseconds": 437368}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "driveA"}} +{"timestamp": {"seconds": 1652695794, "microseconds": 437381}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "driveA"}} +^Cqemu-system-x86_64: terminating on signal 2 +{"timestamp": {"seconds": 1652695803, "microseconds": 242148}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}} +Removing image: 100% complete...done. +Removing image: 100% complete...done. 
+./rbdbackup.sh 200G 40.68s user 111.12s system 228% cpu 1:06.47 total +root@pve701 ~ # ./rbdbackup.sh ./qemu-upstream/10c2a0c5e7d48e590d945c017b5b8af5b4c89a3c/qemu-system-x86_64 500G +{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 7}, "package": "v7.0.0-981-g10c2a0c5e7"}, "capabilities": ["oob"]}} +VNC server running on 127.0.0.1:5900 +{"return": {}} +{"timestamp": {"seconds": 1652695970, "microseconds": 663752}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "driveA"}} +{"timestamp": {"seconds": 1652695970, "microseconds": 663892}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"timestamp": {"seconds": 1652695970, "microseconds": 663920}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "driveA"}} +{"timestamp": {"seconds": 1652695970, "microseconds": 663980}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"return": {}} +{"timestamp": {"seconds": 1652696496, "microseconds": 556219}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "driveA"}} +{"timestamp": {"seconds": 1652696496, "microseconds": 556386}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "driveA"}} +{"timestamp": {"seconds": 1652696496, "microseconds": 556497}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "driveA", "len": 536870912000, "offset": 536870912000, "speed": 0, "type": "backup"}} +{"timestamp": {"seconds": 1652696496, "microseconds": 556536}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "driveA"}} +{"timestamp": {"seconds": 1652696496, "microseconds": 556555}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "driveA"}} +^Cqemu-system-x86_64: terminating on signal 2 +{"timestamp": {"seconds": 1652696786, "microseconds": 408273}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}} +Removing image: 100% complete...done. +Removing image: 100% complete...done. 
+./rbdbackup.sh 500G 453.34s user 28.30s system 58% cpu 13:36.48 total +root@pve701 ~ # ./rbdbackup.sh ./qemu-upstream/10c2a0c5e7d48e590d945c017b5b8af5b4c89a3c-with-rbd-reverts/qemu-system-x86_64 500G +{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 7}, "package": "v7.0.0-984-g20a19f8eae"}, "capabilities": ["oob"]}} +VNC server running on 127.0.0.1:5900 +{"return": {}} +{"timestamp": {"seconds": 1652695810, "microseconds": 648931}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "driveA"}} +{"timestamp": {"seconds": 1652695810, "microseconds": 649012}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"timestamp": {"seconds": 1652695810, "microseconds": 649057}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "driveA"}} +{"timestamp": {"seconds": 1652695810, "microseconds": 649080}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "driveA"}} +{"return": {}} +{"timestamp": {"seconds": 1652695952, "microseconds": 13070}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "driveA"}} +{"timestamp": {"seconds": 1652695952, "microseconds": 13144}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "driveA"}} +{"timestamp": {"seconds": 1652695952, "microseconds": 13210}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "driveA", "len": 536870912000, "offset": 536870912000, "speed": 0, "type": "backup"}} +{"timestamp": {"seconds": 1652695952, "microseconds": 13233}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "driveA"}} +{"timestamp": {"seconds": 1652695952, "microseconds": 13249}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "driveA"}} +^Cqemu-system-x86_64: terminating on signal 2 +{"timestamp": {"seconds": 1652695955, "microseconds": 692599}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-signal"}} +Removing image: 100% complete...done. +Removing image: 100% complete...done. 
+./rbdbackup.sh 500G 99.49s user 277.78s system 258% cpu 2:25.78 total +``` diff --git a/results/classifier/gemma3:12b/performance/1032 b/results/classifier/gemma3:12b/performance/1032 new file mode 100644 index 00000000..89bb0280 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1032 @@ -0,0 +1,17 @@ + +Slow random performance of virtio-blk +Steps to reproduce: +1. Download Virtualbox Windows 11 image from https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/ +2. Download virtio-win-iso: `wget https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.215-2/virtio-win-0.1.215.iso` +3. Extract WinDev*.zip `unzip WinDev2204Eval.VirtualBox.zip`and import the extracted Ova in VirtualBox (import WinDev with the option "conversion to vdi" clicked) +4. `qemu-img convert -f vdi -O raw <YourVirtualBoxVMFolder>/WinDev2204Eval-disk001.vdi<YourQemuImgFolder>/WinDev2204Eval-disk001.img` +5. Start Windows 11 in Qemu: +``` +qemu-system-x86_64 -enable-kvm -cpu host -device virtio-blk-pci,scsi=off,drive=WinDevDrive,id=virtio-disk0,bootindex=0 -drive file=<YourQemuImgFolder>/WinDev2204Eval-disk001.img,if=none,id=WinDevDrive,format=raw -net nic -net user,hostname=windowsvm -m 8G -monitor stdio -name "Windows" -usbdevice tablet -device virtio-serial -chardev spicevmc,id=vdagent,name=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -cdrom <YourDownloadFolder>/virtio-win-0.1.215.iso +``` +6. Win 11 won't boot and will go into recovery mode (even the safeboot trick doesn't work here), please follow that [answer](https://superuser.com/questions/1057959/windows-10-in-kvm-change-boot-disk-to-virtio#answer-1200899) to load the viostor driver over recovery cmd +7. Reboot the VM and it should start +2. Install CrystalDiskMark +3. 
Execute the CrystalDiskMark benchmark +Additional information: +# diff --git a/results/classifier/gemma3:12b/performance/1036987 b/results/classifier/gemma3:12b/performance/1036987 new file mode 100644 index 00000000..5ecf63ce --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1036987 @@ -0,0 +1,26 @@ + +compilation error due to bug in savevm.c + +Since + +302dfbeb21fc5154c24ca50d296e865a3778c7da + +Add xbzrle_encode_buffer and xbzrle_decode_buffer functions + + For performance we are encoding long word at a time. + For nzrun we use long-word-at-a-time NULL-detection tricks from strcmp(): + using ((lword - 0x0101010101010101) & (~lword) & 0x8080808080808080) test + to find out if any byte in the long word is zero. + + Signed-off-by: Benoit Hudzia <email address hidden> + Signed-off-by: Petter Svard <email address hidden> + Signed-off-by: Aidan Shribman <email address hidden> + Signed-off-by: Orit Wasserman <email address hidden> + Signed-off-by: Eric Blake <email address hidden> + + Reviewed-by: Luiz Capitulino <email address hidden> + Reviewed-by: Eric Blake <email address hidden> + + commit arrived in the master branch, I can't compile qemu at all: + +savevm.c:2476:13: error: overflow in implicit constant conversion [-Werror=overflow] \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1056 b/results/classifier/gemma3:12b/performance/1056 new file mode 100644 index 00000000..01237034 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1056 @@ -0,0 +1,2 @@ + +Bad Performance of Windows 11 ARM64 VM on Windows 11 Qemu 7.0 Host System diff --git a/results/classifier/gemma3:12b/performance/1089 b/results/classifier/gemma3:12b/performance/1089 new file mode 100644 index 00000000..d2784da0 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1089 @@ -0,0 +1,25 @@ + +when I use the memory balloon, the qemu process memory usage is displayed incorrectly +Description of problem: +My VM memory is 4GB, and it uses the balloon 
driver, and the balloon value is also 4GB. +I run a program to consume memory in the VM, and I can see the memory usage rate at 15% on the host. When I stop the program in the VM, the free-memory figures on the host and in the VM +return to normal, but if I use "top -d 3 -Hp $qemu_pid" on the host, the memory usage rate is still 15%. Only if I change the balloon value to a smaller value does the memory usage rate go down. Why? + +Steps to reproduce: +1. run a program to consume memory in the VM and query top info; the qemu process memory usage is 15% + + +2. query free info on the host and in the VM (reduced) + + +3. stop the program in the VM + + +4. query free info on the host and in the VM (recovered) + + +5. query top info again (still 15%) + + + +6. change the balloon value to a smaller value (the memory usage rate then goes down) diff --git a/results/classifier/gemma3:12b/performance/1094 b/results/classifier/gemma3:12b/performance/1094 new file mode 100644 index 00000000..90084f7d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1094 @@ -0,0 +1,9 @@ + +Ubuntu 22.04 Qemu high RAM usage (maybe a memory leak) +Description of problem: +After starting/using my VM for a while, RAM fills up to the 32GB maximum, and Firefox starts closing tabs, etc. This didn't happen in Ubuntu 21.10 or earlier Ubuntu releases. I've been using virt-manager + qemu for years and only had this after upgrading to Ubuntu 22.04. +Steps to reproduce: +1. Launch a virt-manager Ubuntu VM with a 12GB RAM maximum (as an example) +2. The entire 32GB of RAM gets filled, but nothing in gnome-system-monitor shows what is using all that RAM +3. Firefox starts closing tabs because RAM is full. Remember that only a 12GB RAM VM and Firefox with a few tabs are running, and it fills all 32GB of RAM. RAM starts filling slowly, and in 1 hour it fills the entire 32GB. For some reason htop shows a smaller usage, but I'm pretty sure all 32GB are being used, as the computer starts freezing and almost crashing (I think swap is being used, so it slows down but does not crash) +4. 
I have to restart the computer for RAM usage to return to normal diff --git a/results/classifier/gemma3:12b/performance/1126369 b/results/classifier/gemma3:12b/performance/1126369 new file mode 100644 index 00000000..fa3c107d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1126369 @@ -0,0 +1,15 @@ + +qemu-img snapshot -c is unreasonably slow + +Something fishy is going on with qcow2 internal snapshot creation times. I don't know if this is a regression because I haven't used internal snapshots in the past. + +QEMU 1.4-rc2: +$ qemu-img create -f qcow2 test.qcow2 -o size=50G,preallocation=metadata +$ time qemu-img snapshot -c new test.qcow2 +real 3m39.147s +user 0m10.748s +sys 0m26.165s + +(This is on an SSD) + +I expect snapshot creation to take under 1 second. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1129957 b/results/classifier/gemma3:12b/performance/1129957 new file mode 100644 index 00000000..d2d34a89 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1129957 @@ -0,0 +1,24 @@ + +Performance issue running guest image on qemu compiled for Win32 platform + +I'm seeing performance issues when booting a guest image on qemu 1.4.0 compiled for the Win32 platform. +The same image boots a lot faster on the same computer running qemu/linux on Fedora via VmWare, and even running the Win32 executable via Wine performs better than running qemu natively on Win32. + +Although I'm not the author of the image, it is located here: +http://people.freebsd.org/~wpaul/qemu/vxworks.img + +All testing has been done on QEMU 1.4.0. + +I'm also attaching a couple of gprof logs. For these I have disabled ssp in qemu by removing "-fstack-protector-all" and "-D_FORTIFY_SOURCE=2" from the qemu configure script. 
+ +qemu-perf-linux.txt +================ +Machine - Windows XP - VmWare - Fedora - QEMU + +qemu-perf-win32.txt +================= +Machine - Windows XP - QEMU + +qemu-perf-wine.txt +================ +Machine - Windows XP - VmWare - Fedora - Wine - QEMU \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1139 b/results/classifier/gemma3:12b/performance/1139 new file mode 100644 index 00000000..4a0c9a6f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1139 @@ -0,0 +1,79 @@ + +block/nbd.c and drive backup to a remote nbd server +Description of problem: +Good afternoon! + +I am trying to copy an attached drive's content to a remote NBD server via the drive-backup QMP method. I've tested two very similar ways, but with very different performance. The first backs up to an NBD export on another server. The second backs up to the same server, but by connecting through a local /dev/nbd* device. + +Exporting qcow2 via nbd: +``` +(nbd) ~ # qemu-nbd -p 12345 -x backup --cache=none --aio=native --persistent -f qcow2 backup.qcow2 + +(qemu) ~ # qemu-img info nbd://10.0.0.1:12345/backup +image: nbd://10.0.0.1:12345/backup +file format: raw +virtual size: 10 GiB (10737418240 bytes) +disk size: unavailable +``` + +Starting the drive backup via QMP: + +``` +{ + "execute": "drive-backup", + "arguments": { "device": "disk", + "sync": "full", + "target": "nbd://10.0.0.1:12345/backup", + "mode": "existing" + } +} +``` + +When the process starts, qemu prints a warning: + +> warning: The target block device doesn't provide information about the block size and it doesn't have a backing file. The default block size of 65536 bytes is used. 
If the actual block size of the target exceeds this default, the backup may be unusable + +The backup process is limited to a speed of around 30MBps, as observed with iotop + + +The second way of creating a backup + +Exporting qcow2 via nbd: +``` +(nbd) ~ # qemu-nbd -p 12345 -x backup --cache=none --aio=native --persistent -f qcow2 backup.qcow2 +``` + +``` +(qemu) ~ # qemu-img info nbd://10.0.0.1:12345/backup +image: nbd://10.0.0.1:12345/backup +file format: raw +virtual size: 10 GiB (10737418240 bytes) +disk size: unavailable +(qemu) ~ # qemu-nbd -c /dev/nbd0 nbd://10.0.0.1:12345/backup +(qemu) ~ # qemu-img info /dev/nbd0 +image: /dev/nbd0 +file format: raw +virtual size: 10 GiB (10737418240 bytes) +disk size: 0 B +``` + +Starting the drive backup via QMP to the local nbd device: + +``` +{ + "execute": "drive-backup", + "arguments": { "device": "disk", + "sync": "full", + "target": "/dev/nbd0", + "mode": "existing" + } +} +``` + +The backup process starts without the previous warning, and the speed is around 100MBps (the network limit) + +So my question is: how can I get the same performance without connecting the network device to a local NBD block device on the qemu host? + +Kind regards diff --git a/results/classifier/gemma3:12b/performance/1140 b/results/classifier/gemma3:12b/performance/1140 new file mode 100644 index 00000000..a37dda96 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1140 @@ -0,0 +1,2 @@ + +High CPU usage on AMD after migrating guests diff --git a/results/classifier/gemma3:12b/performance/1174654 b/results/classifier/gemma3:12b/performance/1174654 new file mode 100644 index 00000000..a3867224 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1174654 @@ -0,0 +1,6 @@ + +qemu-system-x86_64 takes 100% CPU after host machine resumed from suspend to ram + +I have Windows XP SP3 inside a qemu VM. All works fine in 12.10. 
But after upgrading to 13.04 I have to restart the VM each time I resume my host machine, because the qemu process starts eating CPU cycles and the OS inside the VM becomes very slow and sluggish. However, it's still controllable and can be shut down from within the guest. + +According to taskmgr, whatever process is active takes 99% CPU. It's not stuck on any single process. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1186 b/results/classifier/gemma3:12b/performance/1186 new file mode 100644 index 00000000..12aca656 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1186 @@ -0,0 +1,18 @@ + +qos-test fails when built with LTO and gcc-12 +Description of problem: +The issue is already discussed here [1]. I'm simply building the latest QEMU release and running the test suite. I thought the issue was fixed in 7.0, but it has resurfaced. Do QEMU devs not build with LTO? I'm not able to debug this, but I can test any proposed fixes etc. Thanks. + +[1] https://lore.kernel.org/all/1d3bbff9e92e7c8a24db9e140dcf3f428c2df103.camel@suse.com/ +Steps to reproduce: +1. Build QEMU with gcc-12 and LTO enabled +2. Run make check +3. 
Observe test suite failures in qos-test +Additional information: +``` +Summary of Failures: + + 2/265 qemu:qtest+qtest-aarch64 / qtest-aarch64/qos-test ERROR 0.59s killed by signal 6 SIGABRT + 3/265 qemu:qtest+qtest-i386 / qtest-i386/qos-test ERROR 0.22s killed by signal 6 SIGABRT + 7/265 qemu:qtest+qtest-x86_64 / qtest-x86_64/qos-test ERROR 0.40s killed by signal 6 SIGABRT +``` diff --git a/results/classifier/gemma3:12b/performance/1192065 b/results/classifier/gemma3:12b/performance/1192065 new file mode 100644 index 00000000..95be77b5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1192065 @@ -0,0 +1,7 @@ + +qemu release memory to system + +Qemu pre-allocates the maximum balloon amount, which is inconvenient if all of the memory is used up and some other system needs the memory + +eg:- I have 4GB RAM with 4 virtual systems to run. +I want each of them to start with 1GB RAM with a maximum of 2GB possible. This is not achievable since qemu pre-allocates the maximum balloon amount, which is 2GB x 4 systems. So to start all 4 systems the host needs 8GB RAM rather than 4GB RAM, although I have set the initial balloon amount to 1GB. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1200 b/results/classifier/gemma3:12b/performance/1200 new file mode 100644 index 00000000..82f2bceb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1200 @@ -0,0 +1,26 @@ + +always zero when query-dirty-rate +Description of problem: +The creation of the VM works well (via virt-install), and I can access it via 'virsh console' or ssh. + +Now I am trying to use qemu's calc-dirty-rate feature. + +But I always get '"dirty-rate":0' from 'query-dirty-rate', occasionally '"dirty-rate":2'. + +At the same time, I run 'mbw' (mbw -t0 -n 1000000 1024 -q), a memcpy-intensive benchmark, in the VM. + + +I'm not sure if some configurations of QEMU/KVM are not enabled. + +Looking forward to your reply! +Steps to reproduce: +``` +1. 
virsh qemu-monitor-command centos-huazhang '{"execute":"calc-dirty-rate", "arguments": {"calc-time": 1}}' + + {"return":{},"id":"libvirt-16"} + +2. virsh qemu-monitor-command centos-huazhang1 '{"execute":"query-dirty-rate"}' + + {"return":{"status":"measured","sample-pages":512,"dirty-rate":0,"mode":"page-sampling","start-time":607266,"calc-time":1},"id":"libvirt-17"} + +``` diff --git a/results/classifier/gemma3:12b/performance/1202289 b/results/classifier/gemma3:12b/performance/1202289 new file mode 100644 index 00000000..0481944d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1202289 @@ -0,0 +1,77 @@ + +Windows 2008/7 Guest to Guest Very slow 10-20Mbit/s + +I'm not sure if I'm submitting this to the proper place or not; if not, please direct me accordingly. + +At this point I'm starting to get desperate, so I'll take any options or suggestions that spring to mind. + +Anyway, the problem exists on multiple hosts of various quality, from 4-core 8GB machines to 12-core 64GB machines with LVM and RAID-10. + +Using iperf as the testing utility (the Windows guest can be either Windows 7 or 2008R2): +-Windows Guest -> Windows Guest averages 20Mbit/s (The problem) +-Windows Guest -> Host averages 800Mbit/s +-Host -> Windows Guest averages 1.1Gbit/s +-Linux Guest -> Host averages 12GBit/s +-Linux Guest -> Linux Guest averages 10.2Gbit/s + +For Windows guests, switching between e1000 and virtio drivers doesn't make much of a difference. + +I use openvswitch to handle the bridging (it makes bonding NICs much easier). + +Disabling TSO and GRO on all the host NICs and virtual NICs, as well as modifying the registry using netsh int tcp set global (various params here), can slightly improve Windows -> Windows throughput, up to maybe 100Mbit/s, but even that is spotty at best. + +The particulars of the fastest host, which benchmarks about the same as the slowest host: 
+ +Ubuntu 12.04 64bit (updated to latest as of July 15th) +Linux cckvm03 3.5.0-36-generic #57~precise1-Ubuntu SMP Thu Jun 20 18:21:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux + +libvirt: +Source: libvirt +Version: 0.9.8-2ubuntu17.10 + +qemu-kvm +Package: qemu-kvm +Version: 1.0+noroms-0ubuntu14.8 +Replaces: kvm (<< 1:84+dfsg-0ubuntu16+0.11.0), kvm-data, qemu + +openvswitch +Source: openvswitch +Version: 1.4.0-1ubuntu1.5 + +/proc/cpuinfo + +processor : 0 +vendor_id : GenuineIntel +cpu family : 6 +model : 45 +model name : Intel(R) Xeon(R) CPU E5-2440 0 @ 2.40GHz +stepping : 7 +microcode : 0x70d +cpu MHz : 2400.226 +cache size : 15360 KB +physical id : 0 +siblings : 12 +core id : 0 +cpu cores : 6 +apicid : 0 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid +bogomips : 4800.45 +clflush size : 64 +cache_alignment : 64 +address sizes : 46 bits physical, 48 bits virtual +power management: + + +-Sample KVM line +/usr/bin/kvm -S -M pc-1.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -name gvexch01 -uuid d28ffb4b-d809-3b40-ae3d-2925e6995394 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/gvexch01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot order=dc,menu=on -drive file=/dev/vgroup/gvexch01,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive 
file=/dev/vgroup/gvexch01-d,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 -drive if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=18,id=hostnet0,vhost=on,vhostfd=21 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:bf:4e:1c,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:2 -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1214 b/results/classifier/gemma3:12b/performance/1214 new file mode 100644 index 00000000..adb394d4 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1214 @@ -0,0 +1,2 @@ + +qemu-riscv64 mmap will exhaust all physical memory diff --git a/results/classifier/gemma3:12b/performance/1252010 b/results/classifier/gemma3:12b/performance/1252010 new file mode 100644 index 00000000..6c9ec3d2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1252010 @@ -0,0 +1,8 @@ + +can't assign enough RAM to the VM + +QEMU version: 1.6.90.0 from 2013 11 16 +Host OS: Windows XP SP3 x86 +Host machine: 3.2 GHz AMD Athlon 64 dual core processor, 4 GB DDR II (3.2 seen by the OS) memory +Guest OS: Grub4Dos boot manager menu +Problem: you can't assign more than 880 MB memory to the VM, although with 0.15.1.0 version you can assign up to 1179 MB. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1253563 b/results/classifier/gemma3:12b/performance/1253563 new file mode 100644 index 00000000..23c7ac45 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1253563 @@ -0,0 +1,35 @@ + +bad performance with rng-egd backend + + +1. create listen socket +# cat /dev/random | nc -l localhost 1024 + +2. 
start vm with rng-egd backend + +./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -mon chardev=qmp,mode=control,pretty=on -chardev socket,id=qmp,host=localhost,port=1234,server,nowait -m 2000 -device virtio-net-pci,netdev=h1,id=vnet0 -netdev tap,id=h1 -vnc :0 -drive file=/images/RHEL-64-virtio.qcow2 \ +-chardev socket,host=localhost,port=1024,id=chr0 \ +-object rng-egd,chardev=chr0,id=rng0 \ +-device virtio-rng-pci,rng=rng0,max-bytes=1024000,period=1000 + +(guest) # dd if=/dev/hwrng of=/dev/null + +note: cancelling the dd process with Ctrl+C makes it report the read speed. + +Problem: the speed is around 1k/s + +=================== + +If I use the rng-random backend (filename=/dev/random), the speed is about 350k/s. + +It seems that when the request entry is added to the list, we don't read the data from the queue list immediately. +The chr_read() is delayed, the virtio_notify() is delayed, and the next request will also be delayed. This affects the speed. + +I tried changing rng_egd_chr_can_read() to always return 1, and the speed improved to about 400k/s + +Problem: we can't poll the content in time currently + + +Any thoughts? + +Thanks, Amos \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1253777 b/results/classifier/gemma3:12b/performance/1253777 new file mode 100644 index 00000000..ca61dbd4 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1253777 @@ -0,0 +1,20 @@ + +OpenBSD VM running on OpenBSD host has sleep calls taking twice as long as they should + +Running a script like + +while [ 1 ] +do + date + sleep 1 +done + +on the VM will result in the (correct) date being displayed, but it is displayed only every two (!) seconds. We have also noticed that if we connect to the VM's console using VNC, and move the mouse pointer constantly in the VNC window, the script runs normally with updates every second! 
Note that the script doesn't have to be running on the VM's console - it's also possible to (say) ssh to the VM from a separate machine and run the script and it will display the '2 second' issue, but as soon as you move the mouse pointer constantly in the VNC console window the script starts behaving normally with updates every second. + +I have only seen this bug when running an OpenBSD VM on an OpenBSD host. Running an OpenBSD VM on a Linux host does not exhibit the problem for me. I also believe (am told) that a Linux VM running on an OpenBSD host does not exhibit the problem. + +I have been using the OpenBSD 5.4 64 bit distro which comes with qemu 1.5.1 in a package, however I tried compiling qemu 1.6.1 and that has the same bug. In fact older OpenBSD distros have the same issue - even OpenBSD distros from two years ago have the problem. This is not a 'new' bug recently introduced. + +Initially I wondered if it could be traced to an incorrectly set command line option, but I've since gone through many of the options in the man page simply trying different values (e.g. different CPU types (-cpu), different emulated PCs (-M)) but so far the problem remains. + +I'm quite happy to run tests in order to track this bug down better. We use qemu running on OpenBSD extensively and find it very useful! \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1263747 b/results/classifier/gemma3:12b/performance/1263747 new file mode 100644 index 00000000..b7667a9b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1263747 @@ -0,0 +1,30 @@ + +Arm64 fails to run a binary which runs OK on real hardware + +This binary: + +http://oirase.annexia.org/tmp/test.gz + +runs OK on real aarch64 hardware. It is a statically linked Linux binary which (if successful) will print "hello, world" and exit cleanly. + +On the qemu-arm64 userspace emulator it doesn't print anything and loops forever using 100% CPU.
+ +---- +The following section is only if you wish to compile this binary from source, otherwise you can ignore it. + +First compile OCaml from: + +https://github.com/ocaml/ocaml + +(note you have to compile it on aarch64 or in qemu, it's not possible to cross-compile). You will have to apply the one-line patch from: + +https://sympa.inria.fr/sympa/arc/caml-list/2013-12/msg00179.html + + ./configure + make -j1 world.opt + +Then do: + + echo 'print_endline "hello, world"' > test.ml + ./boot/ocamlrun ./ocamlopt -I stdlib stdlib.cmxa test.ml -o test + ./test \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1272252 b/results/classifier/gemma3:12b/performance/1272252 new file mode 100644 index 00000000..ed580396 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1272252 @@ -0,0 +1,25 @@ + +qemu-img ftp/http convert + +Converting images with ftp or http as source could be done a lot faster. The way it works now (qemu 1.7.50) is significantly slower than the optimal way. + +FTP - how it works now +1. Connect and login to ftp-server. Ask for size of file. +2. Get a chunk of data using rest+retr +3. Goto step 1 again in a loop until all data is retrieved + +FTP - better solution +1. Connect and login to ftp-server. Dont ask for size of file. +2. Retrieve all remaining data +3. Goto step 1 again if disconnected/io error (max NN errors etc) + + +Http - how it works now +1. Connect to webserver and ask for size of file / http HEAD. +2. Get a chunk of data using http Range. +3. Goto step 1 again in a loop until all data is retrieved. + +Http - better solution +1. Connect to webserver. +2. Retrieve all remaining data. +3. Goto step 1 again if disconnected/io error (max NN errors). 
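The two access patterns can be contrasted with a small stand-in for the remote server. This is purely an illustrative sketch (StubServer and the fetch helpers are hypothetical, not qemu-img internals), but it shows why the current chunked loop pays one round-trip per chunk while the proposed streaming strategy pays one per connection:

```python
CHUNK = 64 * 1024

class StubServer:
    """Stands in for the FTP/HTTP server; only counts round-trips."""
    def __init__(self, payload):
        self.payload = payload
        self.round_trips = 0

    def get_range(self, offset, length):
        # Models one rest+retr (FTP) or one Range GET (HTTP) per chunk.
        self.round_trips += 1
        return self.payload[offset:offset + length]

    def get_stream(self):
        # Models a single request that streams all remaining data.
        self.round_trips += 1
        for off in range(0, len(self.payload), CHUNK):
            yield self.payload[off:off + CHUNK]

def fetch_chunked(server, size):
    # Current behaviour: a loop of ranged requests.
    return b"".join(server.get_range(off, CHUNK)
                    for off in range(0, size, CHUNK))

def fetch_streaming(server):
    # Proposed behaviour: one request, read until EOF,
    # reconnect only on disconnect/io error.
    return b"".join(server.get_stream())
```

Both strategies retrieve identical data; the difference is only in the number of round-trips, which is what dominates for remote sources.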
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1275 b/results/classifier/gemma3:12b/performance/1275 new file mode 100644 index 00000000..b3ed1606 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1275 @@ -0,0 +1,10 @@ + +javac command stuck forever in qemu vm which does not use hardware virtualization +Description of problem: + +Steps to reproduce: +1. +2. +3. +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/1291 b/results/classifier/gemma3:12b/performance/1291 new file mode 100644 index 00000000..8072872f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1291 @@ -0,0 +1,2 @@ + +--enable-jemalloc configure option is not covered in CI diff --git a/results/classifier/gemma3:12b/performance/1292 b/results/classifier/gemma3:12b/performance/1292 new file mode 100644 index 00000000..ce20bee0 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1292 @@ -0,0 +1,4 @@ + +Default jemalloc config doesn't work on Asahi Linux +Description of problem: +M1 Macs use 16KB pages, jemalloc builds with 4KB page by default. diff --git a/results/classifier/gemma3:12b/performance/1307 b/results/classifier/gemma3:12b/performance/1307 new file mode 100644 index 00000000..6094097f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1307 @@ -0,0 +1,73 @@ + +query-named-block-nodes, without flat=true, is massively slow as number of block nodes increases +Description of problem: +The query-named-block-nodes command is insanely slow with deep backing chains when the flat=true arg is NOT given. 
+ +``` +qemu-img create demo0.qcow2 1g +j=0 +for i in `seq 1 199` +do + qemu-img create -f qcow2 -o backing_file=demo$j.qcow2 -o backing_fmt=qcow2 demo$i.qcow2 + j=$i +done +``` + +Now configure libvirt with + +``` + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2' discard='unmap'/> + <source file='/var/lib/libvirt/images/demo199.qcow2'/> + <target dev='vdb' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/> + </disk> +``` + +This results in `-blockdev` args + +``` +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo0.qcow2","node-name":"libvirt-201-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-201-format","read-only":true,"discard":"unmap","driver":"qcow2","file":"libvirt-201-storage","backing":null}' \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo1.qcow2","node-name":"libvirt-200-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-200-format","read-only":true,"discard":"unmap","driver":"qcow2","file":"libvirt-200-storage","backing":"libvirt-201-format"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo2.qcow2","node-name":"libvirt-199-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-199-format","read-only":true,"discard":"unmap","driver":"qcow2","file":"libvirt-199-storage","backing":"libvirt-200-format"}' \ +...snip... 
+-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo197.qcow2","node-name":"libvirt-4-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-4-format","read-only":true,"discard":"unmap","driver":"qcow2","file":"libvirt-4-storage","backing":"libvirt-5-format"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo198.qcow2","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-3-format","read-only":true,"discard":"unmap","driver":"qcow2","file":"libvirt-3-storage","backing":"libvirt-4-format"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/demo199.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-3-format"}' \ +-device '{"driver":"virtio-blk-pci","bus":"pci.7","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk1"}' \ +``` + +Now stop libvirt + +``` +systemctl stop libvirtd +``` + +And speak directly to QMP + +``` +$ time socat UNIX:/var/lib/libvirt/qemu/domain-158-fedora38/monitor.sock - > /dev/null +{ "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } } +{ "execute": "query-named-block-nodes"} +{ "execute": "quit" } + +real 2m19.276s +user 0m0.006s +sys 0m0.014s +``` + +If we save the 'query-named-block-nodes' output instead of sending it to /dev/null, we get a 86 MB file for the QMP response. This will break all known client apps since they limit QMP reply size. + +It appears to have a combinatorial expansion of block nodes in the output. + +Blocking the main event loop for 2 minutes is obviously not good either. 
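For reference, the flat variant of the same query is a one-argument change on the wire:

```json
{ "execute": "query-named-block-nodes", "arguments": { "flat": true } }
```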
+ +If we use '"flat": true' parameter to query-named-block-nodes, the command completes in just 15 seconds, and produces a large, but more manageable 2.7 MB + +Since the non-flat query-named-block-nodes output is so incredibly non-scalable, I think we should deprecate non-flat mode, and eventually make flat the mandatory option. diff --git a/results/classifier/gemma3:12b/performance/1321 b/results/classifier/gemma3:12b/performance/1321 new file mode 100644 index 00000000..43cfc48f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1321 @@ -0,0 +1,9 @@ + +qemu-system-i386 runs slow after upgrading legacy project from qemu 2.9.0 to 7.1.0 +Description of problem: +Using several custom serial and irq devices including timers. +The same code (after some customisation in order to compile with new 7.1.0 API and meson build system runs about 50% slower. +We had to remove "-icount 4" switch which worked fine with 2.9.0 just to get to this point. +Even running with multi-threaded tcg did not help. +We don't use the new ptimer API but rather the old QEMUTimer. +Any suggestions to why we encounter this vast performance degradation? diff --git a/results/classifier/gemma3:12b/performance/1321464 b/results/classifier/gemma3:12b/performance/1321464 new file mode 100644 index 00000000..d808cde1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1321464 @@ -0,0 +1,26 @@ + +qemu/block/qcow2.c:1942: possible performance problem ? + +I just ran static analyser cppcheck over today (20140520) qemu source code. + +It said many things, including + +[qemu/block/qcow2.c:1942] -> [qemu/block/qcow2.c:1943]: (performance) Buffer 'pad_buf' is being writ +ten before its old content has been used. + +Source code is + + memset(pad_buf, 0, s->cluster_size); + memcpy(pad_buf, buf, nb_sectors * BDRV_SECTOR_SIZE); + +Worth tuning ? 
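The tuning cppcheck is hinting at - copy the payload first, then zero only the tail the copy did not cover - can be sketched as follows (Python used purely for illustration; the names mirror the qcow2 snippet above):

```python
def pad_naive(buf, cluster_size):
    # What the flagged code does: zero the whole buffer, then immediately
    # overwrite the first len(buf) bytes -- that part of the zeroing is wasted.
    pad = bytearray(cluster_size)
    pad[:len(buf)] = buf
    return bytes(pad)

def pad_tuned(buf, cluster_size):
    # Copy first, then zero only the remaining tail.
    pad = bytearray(buf)
    pad += bytes(cluster_size - len(buf))
    return bytes(pad)
```

In the qcow2 case the savings are bounded by the copied prefix length per call, so whether the difference is measurable is exactly the "worth tuning?" question.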
+ +Similar problem here + +[qemu/block/qcow.c:815] -> [qemu/block/qcow.c:816]: (performance) Buffer 'pad_buf' is being written +before its old content has been used. + +and + +[qemu/hw/i386/acpi-build.c:1265] -> [qemu/hw/i386/acpi-build.c:1267]: (performance) Buffer 'dsdt' is + being written before its old content has been used. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1321684 b/results/classifier/gemma3:12b/performance/1321684 new file mode 100644 index 00000000..2d8ecb84 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1321684 @@ -0,0 +1,47 @@ + +block_stream command stalls + +block_stream command stalls near finishing. +I tried 1.7.1, 2.0.0 and the master versions. All of them stalled. +But the 1.1.2 could finish the job. + +Here is how to reproduce it: + +$ sudo $QEMU \ +-enable-kvm -cpu qemu64 -m 1024 \ +-monitor stdio \ +-drive file=./i1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 \ + +QEMU 2.0.50 monitor - type 'help' for more information +(qemu) VNC server running on `127.0.0.1:5900' +(qemu) snapshot_blkdev drive_0 s1 +Formatting 's1', fmt=qcow2 size=26843545600 backing_file='./i1' backing_fmt='qcow2' encryption=off cluster_size=65536 lazy_refcounts=off +(qemu) block_stream drive_0 +(qemu) info block-jobs +Streaming device drive_0: Completed 400818176 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) info block-jobs +Streaming device drive_0: Completed 904396800 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) info block-jobs +Streaming device drive_0: Completed 23401070592 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) info block-jobs +Streaming device drive_0: Completed 26513768448 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) main-loop: WARNING: I/O thread spun for 1000 iterations +info block-jobs +Streaming device drive_0: Completed 26841513984 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) info block-jobs +Streaming device 
drive_0: Completed 26841513984 of 26843545600 bytes, speed limit 0 bytes/s +(qemu) info block-jobs +Streaming device drive_0: Completed 26841513984 of 26843545600 bytes, speed limit 0 bytes/s + +#### here, the progress stopped at 26841513984 #### + + +$ qemu-img info i1 +image: i1 +file format: qcow2 +virtual size: 25G (26843545600 bytes) +disk size: 1.0G +cluster_size: 2097152 +Format specific information: + compat: 1.1 + lazy refcounts: false \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1338957 b/results/classifier/gemma3:12b/performance/1338957 new file mode 100644 index 00000000..e7eff4ef --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1338957 @@ -0,0 +1,12 @@ + +RFE: add an event to report block devices watermark + +Add an event to report if a block device usage exceeds a threshold. The threshold should be configurable with a monitor command. The event should report the affected block device. Additional useful information could be the offset of the highest sector, like in the query-blockstats output. + +Rationale for the RFE +Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. +In order to let the guest run flawlessly and not be unnecessarily paused, oVirt sets a watermark and automatically resizes the image once the watermark is reached or exceeded. + +In order to detect the mark crossing, the managing application has no choice but to aggressively poll the QEMU monitor +using the query-blockstats command. This leads to unnecessary system load, and is made even worse under scale: scenarios +with hundreds of VMs are becoming not unusual.
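The polling such an event would eliminate looks roughly like this per disk (a hedged sketch: wr_highest_offset mirrors the query-blockstats field mentioned above, the remaining names are hypothetical):

```python
import time

def crossed_watermark(blockstats, threshold):
    # query-blockstats exposes the highest written offset of the image.
    return blockstats["wr_highest_offset"] >= threshold

def poll_until_crossed(query_blockstats, threshold, interval=2.0):
    # One such loop per thin-provisioned disk, waking up forever --
    # exactly the load an asynchronous event would remove.
    while not crossed_watermark(query_blockstats(), threshold):
        time.sleep(interval)
```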
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/134 b/results/classifier/gemma3:12b/performance/134 new file mode 100644 index 00000000..375ccdeb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/134 @@ -0,0 +1,2 @@ + +Performance improvement when using "QEMU_FLATTEN" with softfloat type conversions diff --git a/results/classifier/gemma3:12b/performance/1362 b/results/classifier/gemma3:12b/performance/1362 new file mode 100644 index 00000000..cf210e9b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1362 @@ -0,0 +1,76 @@ + +BLKZEROOUT ioctl/write requests getting split at a weird boundary (and for no apparent reason?) +Description of problem: +i was investigating some performance weirdness with passthrough/directly-mapped SAS vs. SATA disk, which seems to relate to the detect_zeroes feature (see https://forum.proxmox.com/threads/disk-passthrough-performance-weirdness.118943/#post-516599 ). + +apparently, writing zeroes to a passthrough/direct-mapped sas disk ( ST4000NM0034 ) in a virtual machine is MUCH slower than the sata disk ( HGST HDN728080AL ). + +with detect_zeroes=on (default in proxmox), qemu converts writes of zeroes into BLKZEROOUT ioctls issued to the target disk, and my sas disk is much much slower with this (<80MB/s in comparison to the sata disk with 200MB/s). + +i found that the sas disk needs 0.01s on average for this ioctl to finish, whereas the sata disk needs 0.004s. + +writing zeroes to the device directly is at about 200MB/s for both of them, so having detect_zeroes=on as a default does not seem to be an advantage in all circumstances.
+ +anyway, i have made a weird observation during analysis: + +inside the virtual machine, i'm writing to the virtual disk like this: + +``` +dd if=/dev/zero of=/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1 bs=1024k count=1024 oflag=direct + +(scsi-0QEMU_QEMU_HARDDISK_drive-scsi1 mapped to scsi-35000c500836b1c73 / SAS on the host, scsi-0QEMU_QEMU_HARDDISK_drive-scsi2 mapped to ata-HGST_HDN728080ALE604_VJGDNX5X ) + +``` + +on the HOST i'm attaching to the kvm process with strace , every time i issue the above dd inside VM, kvm/qemu process issues BLKZEROOUT to the device in a different way, i.e. either + +- within a single ioctl at originating 1048576 byte size (=1024k) +or +- split into 2 ioctl with 1040384+8192(=1048576) +or +- split into 2 ioctl with 1044480+4096(=1048576) + + +why does kvm/qemu sometimes split the write request and sometimes not ? and why at such a weird boundary just below 1Mb? + + +i don't know if this is a bug, but at least it looks weird to me, that's why i'm reporting + +``` + +root@pve:~/util-linux/sys-utils# strace -T -f -p 18897 -e trace=all 2>&1 |grep BLK|head +[pid 65413] ioctl(19, BLKZEROOUT, [0, 1048576] <unfinished ...> +[pid 65412] ioctl(19, BLKZEROOUT, [1048576, 1048576] <unfinished ...> +[pid 65366] ioctl(19, BLKZEROOUT, [2097152, 1048576] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [3145728, 1048576] <unfinished ...> +[pid 65412] ioctl(19, BLKZEROOUT, [4194304, 1048576]) = 0 <0.011287> +[pid 65366] ioctl(19, BLKZEROOUT, [5242880, 1048576]) = 0 <0.012025> +[pid 65413] ioctl(19, BLKZEROOUT, [6291456, 1048576]) = 0 <0.011377> +[pid 65412] ioctl(19, BLKZEROOUT, [7340032, 1048576] <unfinished ...> +[pid 65366] ioctl(19, BLKZEROOUT, [8388608, 1048576] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [9437184, 1048576]) = 0 <0.011705> + +# strace -T -f -p 18897 -e trace=all 2>&1 |grep BLK|head +[pid 65878] ioctl(19, BLKZEROOUT, [0, 1040384] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [1040384, 8192] 
<unfinished ...> +[pid 65366] ioctl(19, BLKZEROOUT, [1048576, 1040384] <unfinished ...> +[pid 65878] ioctl(19, BLKZEROOUT, [2088960, 8192] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [2097152, 1040384] <unfinished ...> +[pid 65366] ioctl(19, BLKZEROOUT, [3137536, 8192] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [3145728, 1040384] <unfinished ...> +[pid 65878] ioctl(19, BLKZEROOUT, [4186112, 8192] <unfinished ...> +[pid 65366] ioctl(19, BLKZEROOUT, [4194304, 1040384] <unfinished ...> +[pid 65413] ioctl(19, BLKZEROOUT, [5234688, 8192] <unfinished ...> + +root@pve:~/util-linux/sys-utils# strace -T -f -p 18897 -e trace=all 2>&1 |grep BLK|head +[pid 66591] ioctl(19, BLKZEROOUT, [0, 1044480] <unfinished ...> +[pid 66592] ioctl(19, BLKZEROOUT, [1044480, 4096] <unfinished ...> +[pid 66593] ioctl(19, BLKZEROOUT, [1048576, 1044480] <unfinished ...> +[pid 66584] ioctl(19, BLKZEROOUT, [2093056, 4096] <unfinished ...> +[pid 66585] ioctl(19, BLKZEROOUT, [2097152, 1044480] <unfinished ...> +[pid 66565] ioctl(19, BLKZEROOUT, [3141632, 4096] <unfinished ...> +[pid 66591] ioctl(19, BLKZEROOUT, [3145728, 1044480] <unfinished ...> +[pid 66592] ioctl(19, BLKZEROOUT, [4190208, 4096] <unfinished ...> +[pid 66584] ioctl(19, BLKZEROOUT, [4194304, 1044480] <unfinished ...> +[pid 66593] ioctl(19, BLKZEROOUT, [5238784, 4096] <unfinished ... +``` diff --git a/results/classifier/gemma3:12b/performance/1397157 b/results/classifier/gemma3:12b/performance/1397157 new file mode 100644 index 00000000..2cd355a1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1397157 @@ -0,0 +1,12 @@ + +cpu high with ps2 keyboard on multi-core win7 guest os + + +qemu ver: 1.5.3-Latest + +guest os: Windows 7 64bit with 2 cpus and a ps2 keyboard. + +problem: Using virt-viewer as the client to connect, when typing quickly the guest and host cpu usage goes high and the typed characters are displayed late.
but when only 1 cpu on the vm, the problem will not display or When qemu ver is 0.12.1, the problem will not display. + +qemu cmd: +/usr/libexec/qemu-kvm -name xx_win7 -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu qemu64,+sse4.2,+sse4.1,+ssse3,-svm,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 4096 -realtime mlock=off -smp 2,sockets=1,cores=2,threads=1 -uuid 0860a434-6560-591b-f92f-c13c5caaf52d -rtc base=localtime -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/lfs/xx_win7/xx_win7.vda,if=none,id=drive-virtio-disk0,format=qcow2,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -drive if=none,id=drive-ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=::,disable-ticketing,plaintext-channel=main,plaintext-channel=display,plaintext-channel=inputs,plaintext-channel=cursor,plaintext-channel=playback,plaintext-channel=record,plaintext-channel=usbredir,image-compression=auto_glz,jpeg-wan-compression=auto,zlib-glz-wan-compression=auto,playback-compression=on,disable-copy-paste,seamless-migration=on -vga qxl -global qxl-vga.ram_size=268435456 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev 
spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1399939 b/results/classifier/gemma3:12b/performance/1399939 new file mode 100644 index 00000000..14207912 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1399939 @@ -0,0 +1,5 @@ + +Qemu build with -faltivec and -maltivec support + +If possible, please add build support so that qemu can be built with -faltivec -maltivec in CPPFLAGS, to make emulation faster on PPC-equipped machines. +Thank you \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1413 b/results/classifier/gemma3:12b/performance/1413 new file mode 100644 index 00000000..e910af28 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1413 @@ -0,0 +1,23 @@ + +I tried to use qemu-nbd in a shell script, but it seems that qemu-nbd has some delay. +Description of problem: + +Steps to reproduce: +1. +``` +cat ~/test.sh +#!/bin/bash +qemu-nbd -c /dev/nbd0 $1 +mount -t ntfs3 -o uid=1000,gid=1000 /dev/disk/by-label/OS /mnt/OS +``` +2. +``` +sudo ~/test.sh ~/VM/win7_i386.qcow2 +mount: /mnt/OS: special device /dev/disk/by-label/OS does not exist. + dmesg(1) may have more information after failed mount system call. + +``` +Additional information: +But when I added a one-second delay between the qemu-nbd and mount commands, the problem was solved. + +The qemu-img convert command also has a similar problem. It seems that these commands have a certain delay. Is this in line with expectations?
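A fixed one-second sleep works but is fragile: the delay is most likely the kernel and udev asynchronously creating the partition nodes and /dev/disk/by-label symlinks after qemu-nbd attaches the device. A more robust pattern is to poll for the node (or run `udevadm settle`) before mounting - a sketch, with arbitrary timeout values:

```python
import os
import time

def wait_for_device(path, timeout=5.0, interval=0.1):
    """Poll until the device node exists, instead of sleeping a fixed second."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return False
```

Calling `wait_for_device('/dev/disk/by-label/OS')` before the mount replaces the unconditional `sleep 1` and fails fast with a clear result if the node never appears.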
diff --git a/results/classifier/gemma3:12b/performance/1442 b/results/classifier/gemma3:12b/performance/1442 new file mode 100644 index 00000000..ebffc707 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1442 @@ -0,0 +1,2 @@ + +RISC-V qemu, get cpu tick diff --git a/results/classifier/gemma3:12b/performance/1462944 b/results/classifier/gemma3:12b/performance/1462944 new file mode 100644 index 00000000..2cecad77 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1462944 @@ -0,0 +1,12 @@ + +vpc file causes qemu-img to consume lots of time and memory + +The attached vpc file causes 'qemu-img info' to consume 3 or 4 seconds of CPU time and 1.3 GB of heap, causing a minor denial of service. + +$ /usr/bin/time ~/d/qemu/qemu-img info afl12.img +block-vpc: The header checksum of 'afl12.img' is incorrect. +qemu-img: Could not open 'afl12.img': block-vpc: free_data_block_offset points after the end of file. The image has been truncated. +1.19user 3.15system 0:04.35elapsed 99%CPU (0avgtext+0avgdata 1324504maxresident)k +0inputs+0outputs (0major+327314minor)pagefaults 0swaps + +The file was found using american-fuzzy-lop. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1462949 b/results/classifier/gemma3:12b/performance/1462949 new file mode 100644 index 00000000..b0c84388 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1462949 @@ -0,0 +1,28 @@ + +vmdk files cause qemu-img to consume lots of time and memory + +The two attached files cause 'qemu-img info' to consume lots of time and memory. Around 10-12 seconds of CPU time, and around 3-4 GB of heap. 
+ $ /usr/bin/time ~/d/qemu/qemu-img info afl10.img +qemu-img: Can't get size of device 'image': File too large +0.40user 11.57system 0:12.03elapsed 99%CPU (0avgtext+0avgdata 4197804maxresident)k +56inputs+0outputs (0major+1045672minor)pagefaults 0swaps + +$ /usr/bin/time ~/d/qemu/qemu-img info afl11.img +image: afl11.img +file format: vmdk +virtual size: 12802T (14075741666803712 bytes) +disk size: 4.0K +cluster_size: 65536 +Format specific information: + cid: 4294967295 + parent cid: 4294967295 + create type: monolithicSparse + extents: + [0]: + virtual size: 14075741666803712 + filename: afl11.img + cluster size: 65536 + format: +0.29user 9.10system 0:09.43elapsed 99%CPU (0avgtext+0avgdata 3297360maxresident)k +8inputs+0outputs (0major+820507minor)pagefaults 0swaps \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1481272 b/results/classifier/gemma3:12b/performance/1481272 new file mode 100644 index 00000000..bcff6dc9 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1481272 @@ -0,0 +1,12 @@ + +main-loop: WARNING: I/O thread spun for 1000 iterations + +I compiled the latest qemu to launch a VM, but the monitor outputs "main-loop: WARNING: I/O thread spun for 1000 iterations".
+ # /usr/local/bin/qemu-system-x86_64 -name rhel6 -S -no-kvm -m 1024M -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c9dd2a5c-40f2-fd3d-3c54-9cd84f8b9174 -rtc base=utc -drive file=/home/samba-share/ubuntu.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=disk,serial=425618d4-871f-4021-bc5d-bcd7f1b5ca9c,bootindex=0 -vnc :1 -boot menu=on -monitor stdio +QEMU 2.3.93 monitor - type 'help' for more information +(qemu) c +(qemu) main-loop: WARNING: I/O thread spun for 1000 iterations <----------------------- + +qemu]# git branch -v +* master e95edef Merge remote-tracking branch 'remotes/sstabellini/tags/xen-migration-2.4-tag' into staging \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1493033 b/results/classifier/gemma3:12b/performance/1493033 new file mode 100644 index 00000000..2b2d0005 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1493033 @@ -0,0 +1,50 @@ + +memory leak/high memory usage with spice webdav feature + +This bug is being opened due to the comment: +https://bugs.freedesktop.org/show_bug.cgi?id=91350#c9 + +Description of problem: +When copying big files from client to guest, the memory usage in the host grows by about the size of the file. This is partially a spice problem, due to the memory pool being able to grow as much as necessary without a limit, which should be handled by the patches sent in the mailing list [0] + +[0] http://lists.freedesktop.org/archives/spice-devel/2015-August/021644.html + +At the same time, massif shows high memory usage by qemu as well [1] (output attached) + +[1] (peak) +->49.64% (267,580,319B) 0x308B89: malloc_and_trace (vl.c:2724) +| ->49.38% (266,167,561B) 0x67CE678: g_malloc (gmem.c:97) +| | ->49.03% (264,241,152B) 0x511D8E: qemu_coroutine_new (coroutine-ucontext.c:106) +| | | ->49.03% (264,241,152B) 0x510E24: qemu_coroutine_create (qemu-coroutine.c:74) +(...)
+ The file being shared was a 320M ogv video. + +Version-Release number of selected component (if applicable): +QEMU emulator version 2.3.93 +SPICE and SPICE-GTK: from git master + +How reproducible: +100% + +Steps to Reproduce: +1-) build spice-gtk with --enable-webdav=yes +2-) enable webdav in your VM by following: +https://elmarco.fedorapeople.org/manual.html#_folder_sharing +3-) using remote-viewer with webdav patches, connect to a fedora guest +4-) Open nautilus, go to 'Browse Network' +5-) On remote-viewer, enable shared folder by File > Preferences > [X] Share folder +6-) The spice client folder should appear: Double-click to mount it. +7-) Check the memory of your qemu process +8-) Copy a big file (let's say, 300 MB) from the shared folder to the local VM +9-) See that the memory consumption of qemu grows by a lot; + +Actual results: +Memory usage grows during copy and is not freed + +Expected results: +Memory should have an upper limit to grow and should be freed after copy + +Additional info: +Also reported in Fedora/rawhide: https://bugzilla.redhat.com/show_bug.cgi?id=1256376 +SPICE upstream bug: https://bugs.freedesktop.org/show_bug.cgi?id=91350 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1497 b/results/classifier/gemma3:12b/performance/1497 new file mode 100644 index 00000000..d4eadcf5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1497 @@ -0,0 +1,4 @@ + +no documentation on plugins with mem_cb in their name +Additional information: +I'm especially interested in how vector ops under mask report their memory traffic diff --git a/results/classifier/gemma3:12b/performance/1502613 b/results/classifier/gemma3:12b/performance/1502613 new file mode 100644 index 00000000..a6295628 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1502613 @@ -0,0 +1,8 @@ + +[Feature Request] Battery Status / Virtual Battery + +When virtualization is used heavily on notebooks, virtual machines do not realize that they're running on a notebook device, causing high power consumption because they're not switching into an optimized "laptop mode". This leads to the circumstance that they try to do things like defragmentation / virus scan / etc. while the host is still running on batteries. + +So it would be great if QEMU / KVM had support for emulating "Virtual Batteries" to guests, causing them to enable power-saving options like disabling specific services / devices / file operations automatically by OS. + +Optionally a great feature would be to set the virtual battery's status manually. For example: Current charge rate / charging / discharging / ... \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1505041 b/results/classifier/gemma3:12b/performance/1505041 new file mode 100644 index 00000000..ebff5c84 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1505041 @@ -0,0 +1,58 @@ + +Live snapshot revert times increase linearly with snapshot age + +The WineTestBot (https://testbot.winehq.org/) uses QEmu live snapshots to ensure the Wine tests are always run in a pristine Windows environment. However the revert times keep increasing linearly with the age of the snapshot, going from tens of seconds to thousands. While the revert takes place the qemu process takes 100% of a core and there is no disk activity. Obviously waiting over 20 minutes before being able to run a 10 second test is not viable. + +Only some VMs are impacted.
Based on libvirt's XML files the common point appears to be the presence of the following <timer> tags: + + <clock offset='localtime'> + <timer name='rtc' tickpolicy='delay'/> + <timer name='pit' tickpolicy='delay'/> + <timer name='hpet' present='no'/> + </clock> + +Where the unaffected VMs have the following clock definition instead: + + <clock offset='localtime'/> + +Yet shutting down the affected VMs, changing the clock definition, creating a live snapshot and trying to revert to it 6 months later results in slow revert times (>400 seconds). + +Changing the tickpolicy to catchup for rtc and/or pit has no effect on the revert time (and unsurprisingly causes the clock to run fast in the guest). + + +To reproduce this problem do the following: +* Create a Windows VM (either 32 or 64 bits). This is known to happen with at least Windows 2000, XP, 2003, 2008 and 10. +* That VM will have the <timer> tags shown above, with the possible addition of a hypervclock timer. +* Shut down the VM. +* date -s "2014/04/01" +* Start the VM. +* Take a live snapshot. +* Shut down the VM. +* date -s "<your current date>" +* Revert to the live snapshot. + +If the revert takes more than 2 minutes then there is a problem. + + +A workaround is to set track='guest' on the rtc timer. This makes the revert fast and may even be the correct solution. But why is it not the default or better documented? + * If setting track='wall' or omitting track, the revert is slow and the clock in the guest is not updated. + * If setting track='guest', the revert is fast and the clock in the guest is not updated.
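Concretely, the workaround is the quoted clock definition with a track attribute added to the rtc timer:

```xml
<clock offset='localtime'>
  <timer name='rtc' tickpolicy='delay' track='guest'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>
```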
+ + +I found three past mentions of this issue but as far as I can tell none of them got anywhere: + +* [Qemu-discuss] massive slowdown for reverts after given amount of time on any newer versions + https://lists.gnu.org/archive/html/qemu-discuss/2013-02/msg00000.html + +* The above post references another one from 2011 wrt qemu 0.14: + https://lists.gnu.org/archive/html/qemu-devel/2011-03/msg02645.html + +* Comment #9 of Launchpad bug 1174654 matches this slow revert issue. However + the bug was really about another issue so this was not followed on. + https://bugs.launchpad.net/qemu/+bug/1174654/comments/9 + + +I'm currently running into this issue with QEmu 2.1 but it looks like this bug has been there all along. +1:2.1+dfsg-12+deb8u2 qemu-kvm +1:2.1+dfsg-12+deb8u2 qemu-system-common +1:2.1+dfsg-12+deb8u2 qemu-system-x86 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1520 b/results/classifier/gemma3:12b/performance/1520 new file mode 100644 index 00000000..11b2d806 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1520 @@ -0,0 +1,50 @@ + +x86 TCG acceleration running on s390x with -smp > host cpus slowed down by x10 +Description of problem: +This boots a trivial guest using OVMF; when the conditions below are met, it runs ~10x slower. + +I have found this breaking our tests of qemu 7.2 [(which due to Debian adding the offending change as backport is affected)](https://salsa.debian.org/qemu-team/qemu/-/blob/master/debian/patches/master/acpi-cpuhp-fix-guest-visible-maximum-access-size-to-.patch) by running an order of magnitude slower. 
+ + +I was tracing it down (insert a long strange trip here) and found that it occurs: +- only with patch dab30fb "acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block" applied + - latest master is still affected +- only with s390x running emulation of x86 + - emulating x86 on ppc64 didn't show the same behavior +- only with -smp > host cpus + - smp 2 with 1 host cpu => slow + - smp 4 with 2 host cpu => slow + - any case where host cpu >= smp => fast + +On average, the good cases on a 2964 s390x machine take ~5-6 seconds. +The bad case is close to 60s, which is the timeout of the automated tests. + +We all know -smp shouldn't be >host-cpus, and I totally admit that this is the definition of an edge case. +But I do not know what else might be affected and this just happened to be what the test does by default - and a slowdown by x10 seems too much even for edge cases to be just ignored. +And while we could just bump up the timeout (and probably will as an interim workaround) I wanted to file it here for your awareness. +Steps to reproduce: +You can recreate the same by using the commandline above and timing things on your own. + +Or you can use the [autopkgtest of edk2 in Ubuntu](https://git.launchpad.net/ubuntu/+source/edk2/tree/debian/tests/shell.py#n214) which [showed this](https://autopkgtest.ubuntu.com/results/autopkgtest-lunar/lunar/s390x/e/edk2/20230224_094012_c95f4@/log.gz) first. +Additional information: +Only signed OVMF cases are affected, while aavmf and other OVMF cases run at more or less the same speed. 
+ +``` +1 CPU / 1GB Memory +7.0 7.2 +6.54s 58.32s test_ovmf_ms +6.72s 56.96s test_ovmf_4m_ms +7.54s 55.47s test_ovmf_4m_secboot +7.56s 49.88s test_ovmf_secboot +7.01s 39.79s test_ovmf32_4m_secboot +7.38s 7.43s test_aavmf32 +7.27s 7.30s test_aavmf +7.26s 7.26s test_aavmf_snakeoil +5.83s 5.95s test_ovmf_4m +5.61s 5.81s test_ovmf_q35 +5.51s 5.64s test_ovmf_pc +5.26s 5.42s test_ovmf_snakeoil +``` + +Highlighting @cborntra since it is somewhat s390x related and @mjt0k as the patch is applied as backport in Debian. +I didn't find the handle of Laszlo (Author) to highlight him as well. diff --git a/results/classifier/gemma3:12b/performance/1529173 b/results/classifier/gemma3:12b/performance/1529173 new file mode 100644 index 00000000..29a0e1d7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1529173 @@ -0,0 +1,32 @@ + +Absolutely slow Windows XP SP3 installation + +Host: Linux 4.3.3 vanilla x86-64/Qemu 2.5 i686 (mixed env) +Guest: Windows XP Professional SP3 (i686) + +This is my launch string: + +$ qemu-system-i386 \ +-name "Windows XP Professional SP3" \ +-vga std \ +-net nic,model=pcnet \ +-cpu core2duo \ +-smp cores=2 \ +-cdrom /tmp/en_winxp_pro_with_sp3_vl.iso \ +-hda Windows_XP.qcow \ +-boot d \ +-net nic \ +-net user \ +-m 1536 \ +-localtime + +Console output: + +warning: TCG doesn't support requested feature: CPUID.01H:EDX.vme [bit 1] +warning: TCG doesn't support requested feature: CPUID.80000001H:EDX.syscall [bit 11] +warning: TCG doesn't support requested feature: CPUID.80000001H:EDX.lm|i64 [bit 29] +warning: TCG doesn't support requested feature: CPUID.01H:EDX.vme [bit 1] +warning: TCG doesn't support requested feature: CPUID.80000001H:EDX.syscall [bit 11] +warning: TCG doesn't support requested feature: CPUID.80000001H:EDX.lm|i64 [bit 29] + +After hitting 35% installation more or less stalls (it actually doesn't but it progresses 1% a minute which is totally unacceptable). 
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1556044 b/results/classifier/gemma3:12b/performance/1556044 new file mode 100644 index 00000000..1f734e89 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1556044 @@ -0,0 +1,9 @@ + +Redox GUI hangs with 100% CPU on ARM + +Booting into the Redox OS CLI on ARM with qemu-system-i386 works fine. However, starting the Redox GUI (orbital) brings up the graphical interface and then starts using 100% CPU. I'd guess it's related to mouse detection and handling. + +The OS image is fully usable on x86. + + +https://www.dropbox.com/s/u6v2k9wzcuiycfo/redox-disk.img.xz?dl=0 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1569491 b/results/classifier/gemma3:12b/performance/1569491 new file mode 100644 index 00000000..8ea59c09 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1569491 @@ -0,0 +1,6 @@ + +qemu-system-i386 poor performance on e5500 core + +I have tested both a generic core build and one built with -mtune=e5500, but I get the same result: performance +is extremely low compared with other classes of PowerPC CPU. +The strange thing is that in every emulator I have tested, the 2 GHz 5020 is comparable in speed and benchmarks with a 2.7 GHz 970MP, but I'm facing half the performance in i386-softmmu compared with a 2.5 GHz 970MP. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1581334 b/results/classifier/gemma3:12b/performance/1581334 new file mode 100644 index 00000000..4950a606 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1581334 @@ -0,0 +1,101 @@ + +qemu + librbd takes high %sy cpu under high random io workload + +I've got an IO problem. When running Qemu + ceph (using librbd) and doing a random IO benchmark or some high-load random IO test, it exhausts all my host CPU in %sy. 
+It doesn't happen all the time, but when it appears it reproduces every time I start a random IO benchmark (tested with fio). +And the only way to fix the problem is to shut down my VM and start it again, but the problem will happen again under high random IO load. + +Some information: + Vendor : HP + Product : HP ProLiant BL460c Gen9 + Kernel : 3.16.0-4-amd64 + Distro : Debian + Version : 8.4 + Arch : amd64 + Qemu : 2.1 ~ 2.6 (Yes, I already tested the latest qemu 2.6 version, but still got the problem) + Ceph : Hammer 0.94.5 + Librbd : 0.94.5 ~ 10.2 (I rebuilt librbd from the ceph 10.2 source code, but the problem is still here) + Qemu config : cache=none + Qemu cpu&mem: 4core, 8GB + +How can I reproduce the problem? + +while :; do bash randwrite.sh ; sleep 3600; done >test.log 2>&1 & +(Sleep 3600 is the key to reproducing my problem. I don't know exactly how long the sleep needs to be, but one hour is enough; the problem reproduces easily after a long sleep. If I keep the benchmark running without sleep, I can't reproduce it.) + +My randwrite.sh script +---------------------------------------------- +#!/bin/sh +sync +echo 3 > /proc/sys/vm/drop_caches + +FILENAME=/dev/vdc +RUNTIME=100 +BLOCKSIZE=4K +IOENGINE=libaio +RESULTFILE=fio-randwrite.log +IODEPTH=32 +RAMP_TIME=5 +SIZE=100G + +fio --numjobs 10 --norandommap --randrepeat=0 --readwrite=randwrite --ramp_time=$RAMP_TIME --bs=$BLOCKSIZE --runtime=$RUNTIME --iodepth=$IODEPTH --filename=$FILENAME --ioengine=$IOENGINE --direct=1 --name=iops_randwrite --group_reporting | tee $RESULTFILE +---------------------------------------------- + +What happens after the problem appears? +My VM gets a huge IOPS drop: in my case it drops from 15000 IOPS to 3500 IOPS. The other thing is that my host CPU is exhausted in %sy. The top output looks like this. 
+ +Qemu Fio benchmark +---------------------------------------------------- +Tasks: 284 total, 2 running, 282 sleeping, 0 stopped, 0 zombie +%Cpu0 : 11.8 us, 66.7 sy, 0.0 ni, 21.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu1 : 12.7 us, 64.9 sy, 0.0 ni, 22.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu2 : 13.7 us, 64.5 sy, 0.0 ni, 21.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu3 : 13.2 us, 64.1 sy, 0.0 ni, 22.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu4 : 11.7 us, 65.4 sy, 0.0 ni, 22.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu5 : 13.2 us, 64.4 sy, 0.0 ni, 22.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu6 : 12.4 us, 65.1 sy, 0.0 ni, 22.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu7 : 13.6 us, 63.8 sy, 0.0 ni, 22.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu8 : 9.8 us, 73.0 sy, 0.0 ni, 17.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu9 : 7.8 us, 74.5 sy, 0.0 ni, 17.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu10 : 6.0 us, 81.4 sy, 0.0 ni, 6.6 id, 0.0 wa, 0.0 hi, 6.0 si, 0.0 st +%Cpu11 : 8.4 us, 79.5 sy, 0.0 ni, 8.8 id, 0.0 wa, 0.0 hi, 3.4 si, 0.0 st +%Cpu12 : 7.6 us, 80.7 sy, 0.0 ni, 7.0 id, 0.0 wa, 0.0 hi, 4.7 si, 0.0 st +%Cpu13 : 7.4 us, 79.9 sy, 0.0 ni, 7.7 id, 0.0 wa, 0.0 hi, 5.0 si, 0.0 st +%Cpu14 : 9.8 us, 75.4 sy, 0.0 ni, 11.4 id, 0.0 wa, 0.0 hi, 3.4 si, 0.0 st +%Cpu15 : 6.7 us, 80.1 sy, 0.0 ni, 10.1 id, 0.0 wa, 0.0 hi, 3.0 si, 0.0 st +%Cpu16 : 9.2 us, 69.2 sy, 0.0 ni, 17.5 id, 0.0 wa, 0.0 hi, 4.1 si, 0.0 st +%Cpu17 : 9.9 us, 66.6 sy, 0.0 ni, 20.1 id, 0.0 wa, 0.0 hi, 3.4 si, 0.0 st +%Cpu18 : 16.6 us, 49.0 sy, 0.0 ni, 34.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu19 : 16.7 us, 46.4 sy, 0.0 ni, 36.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu20 : 13.0 us, 50.8 sy, 0.0 ni, 36.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu21 : 18.9 us, 46.2 sy, 0.0 ni, 34.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu22 : 12.1 us, 52.9 sy, 0.0 ni, 35.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu23 : 15.9 us, 47.6 sy, 0.0 ni, 36.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu24 : 6.7 us, 62.0 sy, 0.0 ni, 31.3 id, 0.0 wa, 0.0 hi, 
0.0 si, 0.0 st +%Cpu25 : 7.6 us, 63.7 sy, 0.0 ni, 28.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu26 : 8.1 us, 75.8 sy, 0.0 ni, 16.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu27 : 6.7 us, 73.6 sy, 0.0 ni, 19.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu28 : 9.2 us, 74.3 sy, 0.0 ni, 16.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu29 : 8.2 us, 73.3 sy, 0.0 ni, 18.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu30 : 4.4 us, 73.1 sy, 0.0 ni, 22.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +%Cpu31 : 7.5 us, 69.6 sy, 0.0 ni, 22.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +KiB Mem: 13217662+total, 3721572 used, 12845504+free, 283228 buffers +KiB Swap: 4194300 total, 0 used, 4194300 free. 2242976 cached Mem + + PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND +30349 root 20 0 25.381g 499892 20640 R 2495 0.4 119:11.98 qemu-system-x86 + +What did I do? +I used perf top to profile the problem. It shows me something that looks like a thread lock contention problem. When I test QEMU with kernel RBD instead, it works fine. +Here is the perf top output on the host. +--------------------------------------------------------------- + PerfTop: 12393 irqs/sec kernel:87.3% exact: 0.0% [4000Hz cycles], (all, 32 CPUs) +------------------------------------------------------------------------------- + + 75.25% [kernel] [k] _raw_spin_lock + 1.17% [kernel] [k] futex_wait_setup + 0.86% libc-2.19.so [.] malloc + 0.58% [kernel] [k] futex_wake + 0.55% libc-2.19.so [.] 0x00000000000ea96f + 0.41% [kernel] [k] native_write_msr_safe +--------------------------------------------------------------- \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1588591 b/results/classifier/gemma3:12b/performance/1588591 new file mode 100644 index 00000000..a1876741 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1588591 @@ -0,0 +1,7 @@ + +Qemu 2.6 Solaris 8 Sparc telnet terminates itself + +With Qemu 2.6, Solaris 8 can be installed and run. 
However, it sometimes terminates itself with "I/O thread spun for 1000 iterations". + +qemu-system-sparc -nographic -monitor null -serial mon:telnet:0.0.0.0:3000,server -hda ./Sparc8.disk -m 256 -boot c -net nic,macaddr=52:54:0:12:34:56 -net tap,ifname=tap0,script=no,downscript=no +QEMU waiting for connection on: disconnected:telnet:0.0.0.0:3000,server +main-loop: WARNING: I/O thread spun for 1000 iterations \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1594069 b/results/classifier/gemma3:12b/performance/1594069 new file mode 100644 index 00000000..f1dd6a54 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1594069 @@ -0,0 +1,9 @@ + +SIMD instructions translated to scalar host instructions + +SIMD instructions inside the guest (NEON, MMX, SSE, SSE2, AVX) are translated to scalar instructions on the host instead of SIMD instructions. It appears that there have been a few efforts to rectify this [1], and even a submitted patch series, but all discussion has effectively died out [2]. + +I would like to see better SIMD performance on qemu, especially as non-x86 architectures are becoming widely used (e.g. ARM). + +[1] http://dl.acm.org/citation.cfm?id=2757098&dl=ACM&coll=DL&CFID=633095244&CFTOKEN=12352103 +[2] https://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg01720.html \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1603734 b/results/classifier/gemma3:12b/performance/1603734 new file mode 100644 index 00000000..3110cea8 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1603734 @@ -0,0 +1,8 @@ + +Hang in fsqrt + +At least qemu-i386 and qemu-x86_64 hang in floatx80_sqrt in versions 2.6.0 and git (2.6.50) for some input values, likely due to an infinite loop at fpu/softfloat.c:6569. 
+ +Steps to reproduce: +1) Compile attached code: gcc -o test test.c -lm +2) `qemu-i386 test` and `qemu-x86_64 test` will hang at 100% cpu \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1609968 b/results/classifier/gemma3:12b/performance/1609968 new file mode 100644 index 00000000..bac735da --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1609968 @@ -0,0 +1,32 @@ + +"cannot set up guest memory" b/c no automatic clearing of Linux' cache + +Version: qemu-2.6.0-1 +Kernel: 4.4.13-1-MANJARO +Full script (shouldn't matter though): https://pastebin.com/Hp24PWNE + +Problem: +When the host has been up and in use for a while, the cache has filled up so much that the guest can't be started without dropping caches. + +Expected behavior: +Qemu should be able to request as much memory as it needs and cause Linux to drop cache pages if needed. A user shouldn't have to come to this conclusion and drop caches manually to start Qemu with the required amount of memory. 
+ +My fix: +Following command (as root) required before qemu start: +# sync && echo 3 > /proc/sys/vm/drop_caches + +Example: +$ sudo qemu.sh -m 10240 && echo success || echo failed +qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory +failed +$ free + total used free shared buff/cache available +Mem: 16379476 9126884 3462688 148480 3789904 5123572 +Swap: 0 0 0 +$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches' +$ free + total used free shared buff/cache available +Mem: 16379476 1694528 14106552 149772 578396 14256428 +Swap: 0 0 0 +$ sudo qemu.sh -m 10240 && echo success || echo failed +success \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1635 b/results/classifier/gemma3:12b/performance/1635 new file mode 100644 index 00000000..0a762119 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1635 @@ -0,0 +1,38 @@ + +Slow graphics output under aarch64 hvf (no dirty bitmap tracking) +Description of problem: +When using a display adapter such as `bochs-display` (which, yes, I realize is not the ideal choice for an aarch64 guest, but it works fine under TCG and KVM, so bear with me) under `hvf` acceleration on an M1 Mac, display output is slow enough to be measured in seconds-per-frame. + +The issue seems to stem from each write to the framebuffer memory resulting in a data abort, while the expected behavior is that only one such write results in a data abort exception, which is handled by marking the region dirty and then subsequent writes do not yield exceptions until the display management in QEMU resets the dirty flag. Instead, every pixel drawn causes the VM to trap, and performance is degraded. +Steps to reproduce: +1. Start an aarch64 HVF guest with the `bochs-display` display adapter. +2. Observe performance characteristics. +3. +Additional information: +I reported this issue on IRC around a year ago, and was provided with a patch by @agraf which I have confirmed works. 
That patch was shared on the `qemu-devel` mailing list in February, 2022, with a response from @pm215: https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg00609.html + +As a quick summary, the patch takes this snippet from the i386 HVF target: + +https://gitlab.com/qemu-project/qemu/-/blob/master/target/i386/hvf/hvf.c#L132-138 + +And applies a variation of it to the ARM target when handling a data abort exception, before this assert: + +https://gitlab.com/qemu-project/qemu/-/blob/master/target/arm/hvf/hvf.c#L1381 + +Something to the effect of: + +```c + if (iswrite) { + uint64_t gpa = hvf_exit->exception.physical_address; + hvf_slot *slot = hvf_find_overlap_slot(gpa, 1); + + if (slot && slot->flags & HVF_SLOT_LOG) { + memory_region_set_dirty(slot->region, 0, slot->size); + hv_vm_protect(slot->start, slot->size, HV_MEMORY_READ | + HV_MEMORY_WRITE | HV_MEMORY_EXEC); + break; + } + } +``` + +I am reporting this issue now as I updated my git checkout with the release of QEMU 8.0.0 and was surprised to find that the patch had never made it upstream and the issue persists. diff --git a/results/classifier/gemma3:12b/performance/1652373 b/results/classifier/gemma3:12b/performance/1652373 new file mode 100644 index 00000000..a4bb1ec7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1652373 @@ -0,0 +1,8 @@ + +User-mode QEMU is not deterministic + +QEMU in user-mode (linux-user or bsd-user) has no way to get the equivalent of the "-icount" argument found in softmmu mode. + +It is true that some system calls in user-mode can prevent deterministic execution, but it would be very simple to patch time-related syscalls to return a number based on icount when in deterministic mode. + +Putting both changes together (icount + time-related syscalls) would cover the needs for deterministic execution of most benchmarks (i.e., those not interacting with the network or other programs in the system). 
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1657010 b/results/classifier/gemma3:12b/performance/1657010 new file mode 100644 index 00000000..85517888 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1657010 @@ -0,0 +1,12 @@ + +RFE: Please implement -cpu best or a CPU fallback option + +QEMU should implement a -cpu best option or some other way to make this work: + +qemu -M pc,accel=kvm:tcg -cpu best + +qemu -M pc,accel=kvm:tcg -cpu host:qemu64 + +See also: + +https://bugzilla.redhat.com/show_bug.cgi?id=1277744#c6 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1659 b/results/classifier/gemma3:12b/performance/1659 new file mode 100644 index 00000000..0dbd80df --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1659 @@ -0,0 +1,28 @@ + +x86 vm fails to stop on Darwin aarch64 when qemu compiled with -O1/-O2 +Description of problem: +When compiled with `-O2` or `-O1` qemu process hangs on full VM stopping on macOS aarch64 host if `shutdown -P now` initiated from guest system. +Steps to reproduce: +1. Compile latest qemu version with -O2 (default value) or -O1 passed +2. Run qemu-system-x86_64 with ubuntu image, e.g. https://cloud-images.ubuntu.com/focal/20230215/focal-server-cloudimg-amd64.img and custom cloud-init (for user/password authentication) +3. Wait until image is loaded, connect via vnc or provide login/password in stdio +4. Initiate shutdown with `sudo shutdown -P now` +5. See that VM indefinitely shutdowns +6. Kill VM from host system with kill -9 <qemu-system-x86_64-process-pid> +7. Recompile qemu with -O0 +8. Repeat steps 2-4 +9. 
See that the VM stopped successfully and the qemu process exited with code 0 +Additional information: +I've created a thread dump from Activity Monitor of the threads qemu is hanging on, attached below +[sample-qemu-system-x86_64.txt](/uploads/119b89b7f55f4374acb9ae1f9dc2e517/sample-qemu-system-x86_64.txt) + +Probably there is some compiler optimisation which prevents qemu threads from receiving the shutdown signal or the appropriate notification from other threads. + +The compiler version with which qemu is built: +```bash +% cc --version +Apple clang version 14.0.3 (clang-1403.0.22.14.1) +Target: arm64-apple-darwin22.4.0 +Thread model: posix +InstalledDir: /Library/Developer/CommandLineTools/usr/bin +``` diff --git a/results/classifier/gemma3:12b/performance/1672383 b/results/classifier/gemma3:12b/performance/1672383 new file mode 100644 index 00000000..0a281d80 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1672383 @@ -0,0 +1,21 @@ + +Slow Windows XP load after commit a9353fe897ca2687e5b3385ed39e3db3927a90e0 + +I've recently discovered that in QEMU 2.8+ my Windows XP loading time has worsened significantly. In 2.7 it took 30-40 seconds to boot, but in 2.8 it became 2-2.5 minutes. + +I've used git bisect and found out that the change happened after commit a9353fe897ca2687e5b3385ed39e3db3927a90e0, which, as far as I can tell from the commit message, handled a race condition when invalidating breakpoints. 
+ + +I've set a breakpoint in static void breakpoint_invalidate(CPUState *cpu, target_ulong pc), and here's a backtrace: +#0 cpu_breakpoint_insert (cpu=cpu@entry=0x555556a73be0, pc=144, + flags=flags@entry=32, breakpoint=breakpoint@entry=0x555556a7c670) + at /media/sdd2/qemu-work/exec.c:830 +#1 0x00005555558746ac in hw_breakpoint_insert (env=env@entry=0x555556a7be60, + index=index@entry=0) at /media/sdd2/qemu-work/target-i386/bpt_helper.c:64 +#2 0x00005555558748ed in cpu_x86_update_dr7 (env=0x555556a7be60, + new_dr7=<optimised out>) + at /media/sdd2/qemu-work/target-i386/bpt_helper.c:160 +#3 0x00007fffa17421f6 in code_gen_buffer () +#4 0x000055555577fcb4 in cpu_tb_exec (itb=<optimised out>, + itb=<optimised out>, cpu=0x7fff8b7763b0) + at /media/sdd2/qemu-work/cpu-exec.c:164 +It seems that XP sets some breakpoints during its load, and this leads to frequent TB flushes and slow execution. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1684 b/results/classifier/gemma3:12b/performance/1684 new file mode 100644 index 00000000..c0bb883b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1684 @@ -0,0 +1,46 @@ + +QEMU doesn't use multi-threaded TCG on aarch64 host with x86-64 guest +Description of problem: +Even when configured to emulate more than one vCPU, on the host it only uses one CPU at 100%. The same test was made using the same architecture (aarch64 on aarch64), and that one manages to use all physical cores. The first VM uses TCG, the second one uses KVM. Screenshots attached. +Steps to reproduce: +1. Use the official Debian distro for the Rock Pi 5B +2. Install XFCE4 and VirtManager, qemu aarch64 and qemu x86_64 +3. Download the debian x64 netinstall iso +4. Install the system with basic features, then install stress-ng +5. Stop, configure -smp to 1 socket, 4 cores, 2 threads; it will result in 8 vCPUs +6. Log in as root and run stress-ng with 8 CPU workers +7. Ctrl+Right to another TTY, install and run htop, you will see 8 CPUs at 100% usage +8. 
At the host, open a terminal, install and run htop; you will see just one core at 100% +Additional information: +Both VMs tested: aarch64 as KVM works fine, x86_64 as TCG uses only one CPU. + + +VirtManager VM #1 config for x86_64 on aarch64 + + +VirtManager VM #2 config for aarch64 on aarch64 + + +VirtManager VM #2 hypervisor used as KVM + + +VirtManager VM #1 hypervisor used as TCG + + +100% on host of all cores being used with stress-ng at aarch64 guest + + +All cores at 100% on aarch64 guest + + +100% on host of just one core being used with stress-ng at x86_64 guest + + +Cool down after both VMs ended stress-ng process + + +virsh version + + +"dmesg | head -n50" at host machine + diff --git a/results/classifier/gemma3:12b/performance/1686980 b/results/classifier/gemma3:12b/performance/1686980 new file mode 100644 index 00000000..eb69fc5e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1686980 @@ -0,0 +1,30 @@ + +qemu is very slow when adding 16,384 virtio-scsi drives + +qemu runs very slowly when adding many virtio-scsi drives. I have attached a small reproducer shell script which demonstrates this. + +Using perf shows the following stack trace taking all the time: + + 72.42% 71.15% qemu-system-x86 qemu-system-x86_64 [.] drive_get + | + --72.32%--drive_get + | + --1.24%--__irqentry_text_start + | + --1.22%--smp_apic_timer_interrupt + | + --1.00%--local_apic_timer_interrupt + | + --1.00%--hrtimer_interrupt + | + --0.83%--__hrtimer_run_queues + | + --0.64%--tick_sched_timer + + 21.70% 21.34% qemu-system-x86 qemu-system-x86_64 [.] blk_legacy_dinfo + | + ---blk_legacy_dinfo + + 3.65% 3.59% qemu-system-x86 qemu-system-x86_64 [.] 
blk_next + | + ---blk_next \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1687 b/results/classifier/gemma3:12b/performance/1687 new file mode 100644 index 00000000..ec5c5a3a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1687 @@ -0,0 +1,54 @@ + +Memory leak for x86 guest on macOS ARM host +Description of problem: +QEMU is used by docker to run `x86` binaries on Apple silicon. Then using `mmap` followed by `munmap` results in a memory leak manifested by continuously growing RSS memory usage when running `mmap` and `munmap` in a loop, e.g., when running the following binary: + +``` +#include <stdio.h> +#include <unistd.h> +#include <sys/mman.h> + +const int page = 4096; + +int work(int N) { + int *ptr = mmap(NULL, N * sizeof(int), PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); + + if (ptr == MAP_FAILED) { + printf("Mapping Failed\n"); + return 1; + } + + for(int i = 0; i < N; i++) { + ptr[i] = i * 10; + } + + int err = munmap(ptr, N * sizeof(int)); + if (err != 0) { + printf("UnMapping Failed\n"); + return 1; + } + + return 0; +} + +int main() { + int N = page * 1024; + + while (1) { + int res = work(N); + if (res) { + return res; + } + printf(".\n"); + } + + return 0; +} +``` +Steps to reproduce: +``` +$ LEAK=$(docker run --platform linux/amd64 -d -it martin2718/mmap-leak ./a.out) +$ docker exec -it $LEAK top # you should observe that RES for a.out keeps growing +$ docker exec -it $LEAK pmap -x 1 # you should see a single memory mapping whose RSS memory usage keeps growing +$ docker kill $LEAK # abort the experiment +``` diff --git a/results/classifier/gemma3:12b/performance/1689499 b/results/classifier/gemma3:12b/performance/1689499 new file mode 100644 index 00000000..b1244eed --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1689499 @@ -0,0 +1,30 @@ + +copy-storage-all/inc does not easily converge with load going on + +Hi, +for now this is more a report to discuss than a "bug", but I wanted to be 
sure if there are things I might overlook. + +I'm regularly testing the qemu versions we have in Ubuntu, which currently are 2.0, 2.5, 2.6.1, 2.8 plus a bunch of patches. And upstream for all sorts of verification every now and then. + +I recently realized that the migration options around --copy-storage-[all/inc] seem to have got worse at converging on migration. Although there is no single hard commit to be found, it just seems more likely to occur the newer the qemu version is. I assume that is partially due to guest performance optimizations that keep it busy. +To a user it appears as a hanging, locked-up migration. + +But let me outline what actually happens: +- Setup without shared storage +- Migration using --copy-storage-all/--copy-storage-inc +- Working fine with idle guests +- If the guest is busy the migration takes forever (1 vCPU kept busy by one CPU-, one memory- and one disk-hogging process) +- statistically it seems to trigger more often on newer qemu versions (might be a red herring) + +The background workloads are most trivial burners: +- cpu: md5sum /dev/urandom +- memory: stress-ng -m 1 --vm-keep --vm-bytes 256M +- disk: while /bin/true; do dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=100; done + +We are talking about ~1-2 minutes on qemu 2.5 (4 tries x 3 architectures) and 2-10+ hours on >=qemu 2.6.1. + +I say it is likely not a bug, but more a discussion, as I can easily avoid hanging via either: +- timeouts (--timeout, ...) to abort the migration or suspend the guest +- --auto-converge (I had only one try, but it seemed to help by slowing down the load generators) + +So you might say "that is all as it should be, and users can use the further options to mitigate" and I'm all fine with that. In that case the bug still serves as a "searchable" document of some kind for others triggering the same case. But if anything comes to your mind that needs better handling around this case, let's start to discuss it more deeply. 
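For the record, the mitigations mentioned can be combined in a single libvirt invocation; a sketch (guest name and destination URI are placeholders):

    virsh migrate --live --copy-storage-all --auto-converge \
          --timeout 300 guestname qemu+ssh://desthost/system

With --timeout, libvirt suspends the guest once live migration exceeds the given number of seconds, forcing the copy to converge instead of appearing to hang forever.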
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1698574 b/results/classifier/gemma3:12b/performance/1698574 new file mode 100644 index 00000000..ab47936d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1698574 @@ -0,0 +1,57 @@ + +slow boot windows 7 + +Hello, +I have a nice working qemu with gpu passthrough setup. +I pass through my nvidia gtx 880m. +It boots in 4mins 18secs. + +If I remove the "-vga none" switch and allow qemu to create a vga adapter I can boot in 1min. + +Why does a normal boot with the nvidia card hang for 3mins (yes, the hd light just flickers for that long)? + +Nothing major but I'd like to know, especially if it can be fixed. + +I cannot leave -vga none turned on as the vga adapter grabs up resources and the nvidia card complains it cannot start due to lack of resources. I'd love to just add resources if possible and keep both cards running to get the 1min boot time. + +Here is my script: + +qemu-system-x86_64 -machine type=q35,accel=kvm -cpu host,kvm=off \ +-smp 8,sockets=1,cores=4,threads=2 \ +-bios /usr/share/seabios/bios.bin \ +-serial none \ +-parallel none \ +-vga none \ +-m 7G \ +-mem-prealloc \ +-balloon none \ +-rtc clock=host,base=localtime \ +-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \ +-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \ +-device virtio-scsi-pci,id=scsi \ +-drive id=disk0,if=virtio,cache=none,format=raw,file=/home/bob/qemu/windows7.img \ +-drive file=/home/bob/qemu/qemu2/virtio-win-0.1.126.iso,id=isocd,format=raw,if=none -device scsi-cd,drive=isocd \ +-netdev type=tap,id=net0,ifname=tap0 \ +-device virtio-net-pci,netdev=net0,mac=00:16:3e:00:01:01 \ +-usbdevice host:413c:a503 \ +-usbdevice host:13fe:3100 \ +-usbdevice host:0bc2:ab21 \ +-boot menu=on \ +-boot order=c + + + +Here are my specs: + +System: Host: MSI-GT70-2PE Kernel: 4.8.0-51-generic x86_64 (64 bit gcc: 5.4.0) + Desktop: Cinnamon 3.2.7 (Gtk 
3.18.9) Distro: Linux Mint 18.1 Serena +Machine: Mobo: Micro-Star model: MS-1763 v: REV:0.C Bios: American Megatrends v: E1763IMS.51B date: 01/29/2015 +CPU: Quad core Intel Core i7-4810MQ (-HT-MCP-) cache: 6144 KB + flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 22348 + clock speeds: max: 2801 MHz 1: 2801 MHz 2: 800 MHz 3: 900 MHz 4: 900 MHz 5: 900 MHz 6: 1700 MHz + 7: 800 MHz 8: 900 MHz +Graphics: Card-1: Intel 4th Gen Core Processor Integrated Graphics Controller bus-ID: 00:02.0 + Card-2: NVIDIA GK104M [GeForce GTX 880M] bus-ID: 01:00.0 + Display Server: X.Org 1.18.4 driver: nvidia Resolution: 1920x1080@60.00hz + GLX Renderer: GeForce GTX 880M/PCIe/SSE2 GLX Version: 4.5.0 NVIDIA 375.66 +Direct Rendering: Yes \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1701449 b/results/classifier/gemma3:12b/performance/1701449 new file mode 100644 index 00000000..5d9b326b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1701449 @@ -0,0 +1,63 @@ + +high memory usage when using rbd with client caching + +Hi, +we are experiencing quite high memory usage of a single qemu (used with KVM) process when using RBD with client caching as a disk backend. We are testing with 3GB memory qemu virtual machines and a 128MB RBD client cache. When running 'fio' in the virtual machine you can see that after some time the machine uses a lot more memory (RSS) on the hypervisor than it should. We have seen values (in real production machines, not artificial fio tests) of 250% memory overhead. I reproduced this with qemu version 2.9 as well.
+ +Here the contents of our ceph.conf on the hypervisor: +""" +[client] +rbd cache writethrough until flush = False +rbd cache max dirty = 100663296 +rbd cache size = 134217728 +rbd cache target dirty = 50331648 +""" + +How to reproduce: +* create a virtual machine with a RBD backed disk (100GB or so) +* install a linux distribution on it (we are using Ubuntu) +* install fio (apt-get install fio) +* run fio multiple times with (e.g.) the following test file: +""" +# This job file tries to mimic the Intel IOMeter File Server Access Pattern +[global] +description=Emulation of Intel IOmeter File Server Access Pattern +randrepeat=0 +filename=/root/test.dat +# IOMeter defines the server loads as the following: +# iodepth=1 Linear +# iodepth=4 Very Light +# iodepth=8 Light +# iodepth=64 Moderate +# iodepth=256 Heavy +iodepth=8 +size=80g +direct=0 +ioengine=libaio + +[iometer] +stonewall +bs=4M +rw=randrw + +[iometer_just_write] +stonewall +bs=4M +rw=write + +[iometer_just_read] +stonewall +bs=4M +rw=read +""" + +You can measure the virtual machine RSS usage on the hypervisor with: + virsh dommemstat <machine name> | grep rss +or if you are not using libvirt: + grep RSS /proc/<PID of qemu process>/status + +When switching off the RBD client cache, all is ok again, as the process does not use so much memory anymore. + +There is already a ticket on the ceph bug tracker for this ([1]). However I can reproduce that memory behaviour only when using qemu (maybe it is using librbd in a special way?). Running directly 'fio' with the rbd engine does not result in that high memory usage. 
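The direct-fio comparison mentioned above — exercising librbd without qemu in the middle — can be reproduced with fio's built-in rbd ioengine. A minimal job file sketch follows; the pool, image, and client names are placeholders, not values from the report:

```
# Hypothetical fio job driving an RBD image directly via librbd, bypassing
# qemu, to compare librbd memory usage. Pool/image/client names are examples.
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=testimg
bs=4M
iodepth=8
direct=0

[rbd-randrw]
rw=randrw
size=80g
```

Watching the fio process RSS during this run versus the qemu process RSS during the in-guest run should show whether the overhead is specific to how qemu uses librbd.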
+ +[1] http://tracker.ceph.com/issues/20054 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1703506 b/results/classifier/gemma3:12b/performance/1703506 new file mode 100644 index 00000000..c35aeae0 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1703506 @@ -0,0 +1,10 @@ + +SMT not supported by QEMU on AMD Ryzen CPU + +HyperThreading/SMT is supported by AMD Ryzen CPUs but results in this message when setting the topology to threads=2: + +qemu-system-x86_64: AMD CPU doesn't support hyperthreading. Please configure -smp options properly. + +Checking in a Windows 10 guest reveals that SMT is not enabled, and from what I understand, QEMU converts the topology from threads to cores internally on AMD CPUs. This appears to cause performance problems in the guest, perhaps because programs assume that these threads are actual cores. + +Software: Linux 4.12, qemu 2.9.0 host with KVM enabled, Windows 10 pro guest \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1718 b/results/classifier/gemma3:12b/performance/1718 new file mode 100644 index 00000000..e010299a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1718 @@ -0,0 +1,48 @@ + +Strange throttle-group test results +Description of problem: +I have a question about throttle-group test results. + +I did a test to limit IO by applying a throttle-group, and the result is not what I expected. + +The setup: a throttle-group with x-iops-total=500 and x-bps-total=524288000 throttling vdb, benchmarked with the following fio commands: + +``` +# mount -t xfs /dev/vdb1 /mnt/disk + +# fio --direct=1 --bs=1M --iodepth=128 --rw=read --size=1G --numjobs=1 --runtime=600 --time_based --name=/mnt/disk/fio-file --ioengine=libaio --output=/mnt/disk/read-1M +``` + +When I test with a --bs value of 1M, I get 500 MiB/s of throughput. + +When I test with a --bs value of 2M, I don't get 500 MiB/s but only 332 MiB/s.
+``` +fio --direct=1 --bs=2M --iodepth=128 --rw=read --size=1G --numjobs=1 --runtime=600 --time_based --name=/mnt/disk/fio-file --ioengine=libaio --output=/mnt/disk/read-2M +``` + +If I set the qemu x-iops-total value to 1500 and test again with a fio --bs value of 2M, I get 500 MiB/s of throughput. + +To summarize, here are the test results. + +| fio bs | qemu x-iops-total | qemu x-bps-total | Result iops | Result throughput | +| ------ | ------ |------ |------ |------ | +| 2M | 1500 | 524288000 | 250 | 500 | +| **2M** |**500** | **524288000** | **166** | **332** | +| 1M | 1500 | 524288000 | 500 | 500 | +| 1M | 500 | 524288000 | 500 | 500 | + +When the --bs value is 2M and the x-iops-total value is 500, the throughput should be 500, but it is not, so I don't know what the problem is. + +If there is anything I missed, please let me know. +Steps to reproduce: +1. Apply throttle-group to vdb and start the VM +2. mount vdb1 +3. test fio diff --git a/results/classifier/gemma3:12b/performance/1720969 b/results/classifier/gemma3:12b/performance/1720969 new file mode 100644 index 00000000..83ff667b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1720969 @@ -0,0 +1,10 @@ + +qemu/memory.c:206: pointless copies of large structs ? + +[qemu/memory.c:206]: (performance) Function parameter 'a' should be passed by reference. +[qemu/memory.c:207]: (performance) Function parameter 'b' should be passed by reference. + +Source code is: + +static bool memory_region_ioeventfd_equal(MemoryRegionIoeventfd a, + MemoryRegionIoeventfd b) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1727737 b/results/classifier/gemma3:12b/performance/1727737 new file mode 100644 index 00000000..cccb03cf --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1727737 @@ -0,0 +1,26 @@ + +qemu-arm stalls on a GCC sanitizer test since qemu-2.7 + +Hi, + +I have noticed that several GCC/sanitizer tests fail with timeout when executed under QEMU.
+ +After a bit of investigation, I have noticed that this worked with qemu-2.7, and started failing with qemu-2.8, and still fails with qemu-2.10.1 + +I'm attaching a tarball containing: +alloca_instruments_all_paddings.exe : the testcase, and the needed libs: +lib/librt.so.1 +lib/libdl.so.2 +lib/ld-linux-armhf.so.3 +lib/libasan.so.5 +lib/libc.so.6 +lib/libgcc_s.so.1 +lib/libpthread.so.0 +lib/libm.so.6 + +To reproduce the problem: +$ qemu-arm -cpu any -R 0 -L $PWD $PWD/alloca_instruments_all_paddings.exe +returns in less than a second with qemu-2.7, and never with qemu-2.8 + +Using -d in_asm suggests that the program "almost" completes and qemu seems to stall on: +0x40b6eb44: e08f4004 add r4, pc, r4 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1734810 b/results/classifier/gemma3:12b/performance/1734810 new file mode 100644 index 00000000..117e1778 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1734810 @@ -0,0 +1,22 @@ + +Windows guest virtual PC running abnormally slow + +Guest systems running Windows 10 in a virtualized environment run unacceptably slow, with no option in Boxes to offer the virtual machine more (or less) cores from my physical CPU. 
+ +ProblemType: Bug +DistroRelease: Ubuntu 17.10 +Package: gnome-boxes 3.26.1-1 +ProcVersionSignature: Ubuntu 4.13.0-17.20-lowlatency 4.13.8 +Uname: Linux 4.13.0-17-lowlatency x86_64 +ApportVersion: 2.20.7-0ubuntu3.5 +Architecture: amd64 +CurrentDesktop: ubuntu:GNOME +Date: Tue Nov 28 00:37:11 2017 +ProcEnviron: + TERM=xterm-256color + PATH=(custom, no user) + XDG_RUNTIME_DIR=<set> + LANG=en_US.UTF-8 + SHELL=/bin/bash +SourcePackage: gnome-boxes +UpgradeStatus: No upgrade log present (probably fresh install) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1740219 b/results/classifier/gemma3:12b/performance/1740219 new file mode 100644 index 00000000..84c708fe --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1740219 @@ -0,0 +1,60 @@ + +static linux-user ARM emulation has several-second startup time + +static linux-user emulation has several-second startup time + +My problem: I'm a Parabola packager, and I'm updating our +qemu-user-static package from 2.8 to 2.11. With my new +statically-linked 2.11, running `qemu-arm /my/arm-chroot/bin/true` +went from taking 0.006s to 3s! This does not happen with the normal +dynamically linked 2.11, or the old static 2.8. + +What happens is it gets stuck in +`linux-user/elfload.c:init_guest_space()`. What `init_guest_space` +does is map 2 parts of the address space: `[base, base+guest_size]` +and `[base+0xffff0000, base+0xffff0000+page_size]`; where it must find +an acceptable `base`. Its strategy is to `mmap(NULL, guest_size, +...)` to decide where the first range is, and then check if that ++0xffff0000 is also available. If it isn't, then it starts trying +`mmap(base, ...)` for the entire address space from low-address to +high-address. + +"Normally," it finds an acceptable `base` within the first 2 tries. +With a static 2.11, it's taking thousands of tries.
+ +---- + +Now, from my understanding, there are 2 factors working together to +cause that in static 2.11 but not the other builds: + + - 2.11 increased the default `guest_size` from 0xf7000000 to 0xffff0000 + - PIE (and thus ASLR) is disabled for static builds + +For some reason that I don't understand, with the smaller +`guest_size` the initial `mmap(NULL, guest_size, ...)` usually +returns an acceptable address range; but larger `guest_size` makes it +consistently return a block of memory that butts right up against +another already mapped chunk of memory. This isn't just true on the +older builds, it's true with the 2.11 builds if I use the `-R` flag to +shrink the `guest_size` back down to 0xf7000000. That is with +linux-hardened 4.13.13 on x86-64. + +So then, it falls back to crawling the entire address space; so it +tries base=0x00001000. With ASLR, that probably succeeds. But with +ASLR being disabled on static builds, the text segment is at +0x60000000; which does not leave room for the needed +0xffff1000-size block before it. So then it tries base=0x00002000. +And so on, more than 6000 times until it finally gets to and passes +the text segment; calling mmap more than 12000 times. + +---- + +I'm not sure what the fix is. Perhaps try to mmap a contiguous chunk +of size 0xffff1000, then munmap it and then mmap the 2 chunks that we +actually need. The disadvantage to that is that it does not support +the sparse address space that the current algorithm supports for +`guest_size < 0xffff0000`. If `guest_size < 0xffff0000` *and* the big +mmap fails, then it could fall back to a sparse search; though I'm not +sure the current algorithm is a good choice for it, as we see in this +bug. Perhaps it should inspect /proc/self/maps to try to find a +suitable range before ever calling mmap?
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1743 b/results/classifier/gemma3:12b/performance/1743 new file mode 100644 index 00000000..4675a396 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1743 @@ -0,0 +1,17 @@ + +QEm+Android emulator crashes on x86 host (but not mac M1) +Description of problem: +Using QEmu+Android emulator crashes when using tflite on x86 hosts (but not M1 macs). +Steps to reproduce: +1. Install android toolchain, including emulator (sdkmanager, adb, avdmanager etc) +2. Start android emulator on an x86 host +3. Follow instructions to download and run tflite benchmarking tool [here](https://www.tensorflow.org/lite/performance/measurement) +4. Crashes with the following error + +``` +06-27 17:38:28.093 8355 8355 F ndk_translation: vendor/unbundled_google/libs/ndk_translation/intrinsics/intrinsics_impl_x86_64.cc:86: CHECK failed: 524288 == 0 +``` + +We have tried with many different models and the result is always the same. The same models run fine when the emulator runs on a mac M1 host. +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/1751264 b/results/classifier/gemma3:12b/performance/1751264 new file mode 100644 index 00000000..67e97718 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1751264 @@ -0,0 +1,26 @@ + +qemu-img convert issue in a tmpfs partition + +qemu-img convert command is slow when the file to convert is located in a tmpfs formatted partition. 
+ +v2.1.0 on debian/jessie x64, ext4: 10m14s +v2.1.0 on debian/jessie x64, tmpfs: 10m15s + +v2.1.0 on debian/stretch x64, ext4: 11m9s +v2.1.0 on debian/stretch x64, tmpfs: 10m21.362s + +v2.8.0 on debian/jessie x64, ext4: 10m21s +v2.8.0 on debian/jessie x64, tmpfs: Too long + +v2.8.0 on debian/stretch x64, ext4: 10m42s +v2.8.0 on debian/stretch x64, tmpfs: Too long + +It seems that the issue is caused by this commit : https://github.com/qemu/qemu/commit/690c7301600162421b928c7f26fd488fd8fa464e + +In order to reproduce this bug : + +1/ mount a tmpfs partition : mount -t tmpfs tmpfs /tmp +2/ get a vmdk file (we used a 15GB image) and put it on /tmp +3/ run the 'qemu-img convert -O qcow2 /tmp/file.vmdk /path/to/destination' command + +When we trace the process, we can see that there's a lseek loop which is very slow (compare to outside a tmpfs partition). \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1756538 b/results/classifier/gemma3:12b/performance/1756538 new file mode 100644 index 00000000..c4b2d1d2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1756538 @@ -0,0 +1,14 @@ + +Minimal Ubuntu vs. Debian differences + +I'm using Qemu on Ubuntu (minimal) and Debian (minimal) on Android (Arch64) via Linux Deploy to run Windows guests. Here's a few issues I encountered: + +1) Qemu on (minimal) Debian 9 and Ubuntu cannot run Windows 7-10 guests (only Windows XP and below) because there's a black screen after the boot menu. Qemu on Debian 10, however, can run Windows 7. Incidentally, these distros run on the host in bios compatibility mode instead of UEFI. Ubuntu Desktop (full distro) on other hosts does not display the black screen when running Qemu. + +2) Qemu on Debian 9-10 (minimal) does not display fullscreen - but Ubuntu minimal does display full-screen. 
+ +3) Qemu on Limbo PC Emulator and on Debian 9-10 only run windows guests at 1 GHz using the default Qemu CPU, but Ubuntu runs windows guests at 2 GHz using the default Qemu CPU. + +4) Enable KVM doesn't work, and virtualization isn't detected through Limbo PC Emulator and minimal Linux distros running on Android - perhaps is a problem with running Linux distros via Linux Deploy using Chroot on Android (not so much a Qemu-KVM issue) and failing to detect ARMv8-A CPUs that are indeed capable of virtualization. + +Can anyone explain these differences? I believe they are all using the latest versions of Qemu. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1756807 b/results/classifier/gemma3:12b/performance/1756807 new file mode 100644 index 00000000..fc79fd7e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1756807 @@ -0,0 +1,68 @@ + +performance regression in qemu-user + proot + +To reproduce: + +1. Install qemu-user-static and proot +2. Enter some arm chroot using them: + + proot -0 -q qemu-arm-static -w / -r chroot/ /bin/bash + +3. Run a command which normally takes a short but measurable amount of time: + + cd /usr/share/doc && time grep -R hello + +Result: + +This is over 100 times slower on 18.04 than it was on 16.04. I am not sure if proot or qemu is causing the problem, but proot version has not changed. Also note that on 16.04 I am using the cloud repo version of qemu, which is newer than 16.04 stock, but still older than 18.04. + +Although system 2 is lower spec than system 1, it should not be this much slower. No other software seems to be affected. + +If required I can supply a chroot tarball. It is essentially just a Debian bootstrap though. 
+ +Logs: + + + +System 1: i7-6700 3.4GHz, 32GB RAM, 512GB Crucial MX100 SSD, Ubuntu 16.04 +qemu-arm version 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.4~cloud0) +proot 5.1.0 + +al@al-desktop:~/rpi-ramdisk/raspbian$ proot -0 -q qemu-arm-static -w / -r root/ /bin/bash +root@al-desktop:/# cd /usr/share/doc +root@al-desktop:/usr/share/doc# time grep -R hello +dash/copyright:Debian GNU/Linux hello source package as the file COPYING. If not, + +real 0m0.066s +user 0m0.040s +sys 0m0.008s + + + + + +System 2: i5-5300U 2.30GHz, 8GB RAM, 256GB Crucial MX300 SSD, Ubuntu 18.04 +qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu4) +proot 5.1.0 + +al@al-thinkpad:~/rpi-ramdisk/raspbian$ proot -0 -q qemu-arm-static -w / -r root/ /bin/bash +root@al-thinkpad:/# cd /usr/share/doc +root@al-thinkpad:/usr/share/doc# time grep -R hello +dash/copyright:Debian GNU/Linux hello source package as the file COPYING. If not, + +real 0m24.176s +user 0m0.366s +sys 0m11.352s + +ProblemType: Bug +DistroRelease: Ubuntu 18.04 +Package: qemu (not installed) +ProcVersionSignature: Ubuntu 4.15.0-12.13-generic 4.15.7 +Uname: Linux 4.15.0-12-generic x86_64 +ApportVersion: 2.20.8-0ubuntu10 +Architecture: amd64 +Date: Mon Mar 19 07:13:25 2018 +InstallationDate: Installed on 2018-03-18 (0 days ago) +InstallationMedia: Xubuntu 18.04 LTS "Bionic Beaver" - Alpha amd64 (20180318) +SourcePackage: qemu +UpgradeStatus: No upgrade log present (probably fresh install) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/176 b/results/classifier/gemma3:12b/performance/176 new file mode 100644 index 00000000..0994c005 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/176 @@ -0,0 +1,2 @@ + +virtual machine cpu soft lockup when qemu attach disk diff --git a/results/classifier/gemma3:12b/performance/1763536 b/results/classifier/gemma3:12b/performance/1763536 new file mode 100644 index 00000000..a317d5b5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1763536 @@ -0,0 +1,84 @@ 
+ +go build fails under qemu-ppc64le-static (qemu-user) + +I am using qemu-user (built static) in a docker container environment. When running multi-threaded go commands in the container (go build for example) the process may hang, report segfaults or other errors. I built qemu-ppc64le from the upstream git (master). + +I see the problem running on a multi core system with Intel i7 processors. +# cat /proc/cpuinfo | grep "model name" +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz +model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz + +Steps to reproduce: +1) Build qemu-ppc64le as static and copy into docker build directory named it qemu-ppc64le-static. + +2) Add hello.go to docker build dir. + +package main +import "fmt" +func main() { + fmt.Println("hello world") +} + +3) Create the Dockerfile from below: + +FROM ppc64le/golang:1.10.1-alpine3. 
+COPY qemu-ppc64le-static /usr/bin/ +COPY hello.go /go + +4) Build container +$ docker build -t qemutest -f Dockerfile ./go + +5) Run test +$ docker run -it qemutest + +/go # /usr/bin/qemu-ppc64le-static --version +qemu-ppc64le version 2.11.93 (v2.12.0-rc3-dirty) +Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers + +/go # go version +go version go1.10.1 linux/ppc64le + +/go # go build hello.go +fatal error: fatal error: stopm holding locksunexpected signal during runtime execution + +panic during panic +[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1003528c] + +runtime stack: +runtime: unexpected return pc for syscall.Syscall6 called from 0xc42007f500 +stack: frame={sp:0xc4203be840, fp:0xc4203be860} stack=[0x4000b7ecf0,0x4000b928f0) + +syscall.Syscall6(0x100744e8, 0x3d, 0xc42050c140, 0x20, 0x18, 0x10422b80, 0xc4203be968[signal , 0x10012d88SIGSEGV: segmentation violation, 0xc420594000 code=, 0x00x1 addr=0x0 pc=0x1003528c) +] + +runtime stack: + /usr/local/go/src/syscall/asm_linux_ppc64x.s:61runtime.throw(0x10472d19, 0x13) + + /usr/local/go/src/runtime/panic.go:0x6c616 +0x68 + + +runtime.stopm() + /usr/local/go/src/runtime/proc.go:1939goroutine +10x158 + [runtime.exitsyscall0semacquire(0xc42007f500) + /usr/local/go/src/runtime/proc.go:3129 +]: +0x130 +runtime.mcall(0xc42007f500) + /usr/local/go/src/runtime/asm_ppc64x.s:183 +0x58sync.runtime_Semacquire +(0xc4201fab1c) + /usr/local/go/src/runtime/sema.go:56 +0x38 + +---- +Note the results may differ between attempts, hangs and other faults sometimes happen. +---- +If I run "go: single threaded I don't see the problem, for example: + +/go # GOMAXPROCS=1 go build -p 1 hello.go +/go # ./hello +hello world + +I see the same issue with arm64. I don't think this is a go issue, but don't have a real evidence to prove that. This problem looks similar to other problem I have seen reported against qemu running multi-threaded applications. 
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1770859 b/results/classifier/gemma3:12b/performance/1770859 new file mode 100644 index 00000000..952fc662 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1770859 @@ -0,0 +1,4 @@ + +qemu-img compare -m option is missing + +Comparing images using multiple streams (like qemu-img convert) may be effectively sped up when one of the images (or both) is RBD. qemu-img convert does its job perfectly while converting. Please implement the same for image comparison. Since operations are read-only, -W is useless, but may be introduced as well for debugging/performance purposes. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1774677 b/results/classifier/gemma3:12b/performance/1774677 new file mode 100644 index 00000000..5be3bb3d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1774677 @@ -0,0 +1,21 @@ + +-icount increases boot time by >10x + +When I specify the -icount option, some guest operations such as booting a Linux kernel take more than 10 times longer than otherwise. For example, the following will boot Aboriginal Linux to the login prompt in about 6 seconds on my system (using TCG, not KVM): + +wget http://landley.net/aboriginal/downloads/old/binaries/1.4.5/system-image-i686.tar.gz +gunzip <system-image-i686.tar.gz | tar xfv - +cd system-image-i686 +sh run-emulator.sh + +If I replace the last line with + +QEMU_EXTRA="-icount shift=auto" sh run-emulator.sh + +booting to the login prompt takes about 1 minute 20 seconds. + +I have tried different values for "shift" other than the "auto" used above, but have not been able to find one that gives reasonable performance. Specifying "sleep=off" also did not help. + +During the slow boots, qemu appears to spend most of its time sleeping, not using the host CPU.
+ +I see this with multiple versions of qemu, including current git sources (c181ddaa176856b3cd2dfd12bbcf25fa9c884a97), and on multiple host OSes, including Debian 9 on x86_64. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1775702 b/results/classifier/gemma3:12b/performance/1775702 new file mode 100644 index 00000000..8d01f081 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1775702 @@ -0,0 +1,12 @@ + +High host CPU load and slower guest after upgrade guest OS Windows 10 to ver 1803 + +After upgrading a Windows 10 guest to version 1803, guest VMs run slower and there is high host CPU load even when the guest is almost idle. This did not happen with Windows 10 up to version 1709. + +See my 1st report here: +https://askubuntu.com/questions/1033985/kvm-high-host-cpu-load-after-upgrading-vm-to-windows-10-1803 + +Another user report is here: +https://lime-technology.com/forums/topic/71479-windows-10-vm-cpu-usage/ + +Tested on: Ubuntu 16.04 with qemu 2.5.0 and i3-3217U, Arch with qemu 2.12 and i5-7200U, Ubuntu 18.04 with qemu 2.11.1 and AMD FX-4300. All three platforms show the same slowdown and higher host CPU load with a Windows 10 1803 VM compared to a Windows 10 1709 VM. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1777786 b/results/classifier/gemma3:12b/performance/1777786 new file mode 100644 index 00000000..873bfd57 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1777786 @@ -0,0 +1,47 @@ + +virtio-gpu-3d.c: change virtio_gpu_fence_poll timer scale + +We use virtio-gpu to accelerate the Unigine Heaven benchmark in a VM, but we get only 5 FPS when we use an AMD RX460 in our host. +We found that the guest OS spent a lot of time waiting for glMapBufferRange/glUnmapBuffer commands to return. We suspected the host GPU was waiting for a fence, so we changed the fence_poll timer. After changing the timer scale from ms to us, the benchmark result rises to 22 FPS.
+ +From a4003af5c4fe92d55353f42767d0c45de95bb78f Mon Sep 17 00:00:00 2001 +From: chen wei <email address hidden> +Date: Fri, 8 Jun 2018 17:34:45 +0800 +Subject: [PATCH] virtio-gpu:improve 3d performance greatly + + opengl function need fence support.when CPU execute opengl function, it need wait fence for synchronize GPU. +so qemu must deal with fence timely as possible. but now the expire time of the timer to deal with fence is 10 ms. +I think it is too long for opengl. So i will change it to 20 ns. + Before change, when i play Unigine_Heaven 3d game with virglrenderer, the fps is 3. atfer change the fps up to 23. + +Signed-off-by: chen wei <email address hidden> +Signed-off-by: wang qiang <email address hidden> +--- + hw/display/virtio-gpu-3d.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/hw/display/virtio-gpu-3d.c b/hw/display/virtio-gpu-3d.c +index 3558f38..c0a5d21 100644 +--- a/hw/display/virtio-gpu-3d.c ++++ b/hw/display/virtio-gpu-3d.c +@@ -582,7 +582,7 @@ static void virtio_gpu_fence_poll(void *opaque) + virgl_renderer_poll(); + virtio_gpu_process_cmdq(g); + if (!QTAILQ_EMPTY(&g->cmdq) || !QTAILQ_EMPTY(&g->fenceq)) { +- timer_mod(g->fence_poll, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 10); ++ timer_mod(g->fence_poll, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 20); + } + } + +@@ -629,7 +629,7 @@ int virtio_gpu_virgl_init(VirtIOGPU *g) + return ret; + } + +- g->fence_poll = timer_new_ms(QEMU_CLOCK_VIRTUAL, ++ g->fence_poll = timer_new_us(QEMU_CLOCK_VIRTUAL, + virtio_gpu_fence_poll, g); + + if (virtio_gpu_stats_enabled(g->conf)) { +-- +2.7.4 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1783422 b/results/classifier/gemma3:12b/performance/1783422 new file mode 100644 index 00000000..8d2d75d5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1783422 @@ -0,0 +1,30 @@ + +qemu_clock_get_ns does not take into account icount_time_shift + +Hello, + +If you check the qemu/util/qemu-timer.c you 
will find the following function: + +597: int64_t qemu_clock_get_ns(QEMUClockType type) +598: { +.... +602: switch (type) { +.... +606: case QEMU_CLOCK_VIRTUAL: +607: if (use_icount) { +608: return cpu_get_icount(); + + +Now on line 606, in case we requested QEMU_CLOCK_VIRTUAL, and we are using icount, the value of cpu_get_icount(); will be returned. + +However if I understand correctly, in order to convert icount to ns, you must use take into account the icount shift -- as defined in the documentation: "The virtual cpu will execute one instruction every 2^N ns of virtual time.". + +Therefor, the correct value to return would be cpu_icount_to_ns(cpu_get_icount()), where cpu_icount_to_ns is defined in cpus.c: + +296: int64_t cpu_icount_to_ns(int64_t icount) +297: { +298: return icount << icount_time_shift; +299: } + +Best Regards, +Humberto "SilverOne" Carvalho \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1790460 b/results/classifier/gemma3:12b/performance/1790460 new file mode 100644 index 00000000..4d109d4f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1790460 @@ -0,0 +1,26 @@ + +-icount,sleep=off mode is broken (target slows down or hangs) + +QEMU running with options "-icount,sleep=off -rtc clock=vm" doesn't execute emulation at maximum possible speed. +Target virtual clock may run faster or slower than realtime clock by N times, where N value depends on various unrelated conditions (i.e. random from the user point of view). The worst case is when target hangs (hopefully, early in booting stage). +Example scenarios I've described here: http://lists.nongnu.org/archive/html/qemu-discuss/2018-08/msg00102.html + +QEMU process just sleeps most of the time (polling, waiting some condition, etc.). I've tried to debug issue and came to 99% conclusion that there are racing somewhere in qemu internals. + +The feature is broken since v2.6.0 release. 
+Bad commit is 281b2201e4e18d5b9a26e1e8d81b62b5581a13be by Pavel Dovgalyuk, 03/10/2016 05:56 PM: + + icount: remove obsolete warp call + + qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed + anymore, because it is called in every iteration of main_loop_wait. + + Reviewed-by: Paolo Bonzini <email address hidden> + + Signed-off-by: Pavel Dovgalyuk <email address hidden> + Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP> + Signed-off-by: Paolo Bonzini <email address hidden> + +I've reverted commit to all major releases and latest git master branch. Issue was fixed for all of them. My adaptation is trivial: just restoring removed function call before "qemu_cond_wait(...)" line. + +I'm sure following bugs are just particular cases of the issue: #1774677, #1653063 . \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1794285 b/results/classifier/gemma3:12b/performance/1794285 new file mode 100644 index 00000000..c49dbfbf --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1794285 @@ -0,0 +1,101 @@ + +100% Host CPU usage while guest idling + +Hi, + +We have an appliance that runs a FreeBSD guest on a Yocto-based host via qemu-system-x86_64. +Everything functions fine however the host uses n00% of the CPU (where n = #smp) and RAM allocated to it whilst the 1 guest is sat nearing idle. 
+ +Host: +PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND +4406 root 20 0 16.7g 16g 26m S 500 53.0 17958:38 qemu-system-x86 + +Guest: +CPU 0: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle +CPU 1: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle +CPU 2: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle +CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle +CPU 4: 0.4% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.6% idle +Mem: 43M Active, 4783M Inact, 1530M Wired, 911M Buf, 9553M Free +Swap: 3072M Total, 3072M Free + +I have logged this with the appliance vendor and received the response: +"This is expected behaviour and you will see the same in any case where a Guest OS runs over a Host OS. +Host here has 5 CPUs and it has assigned all of them to Guest. +Since the Host is not being shared by any Guest OS; you will always see the 500% (or the 5 CPUs) given to qemu-system-x86. +I do see the same in lab and is very much expected" + +This feels fundamentally wrong to me. +I'm somewhat limited by what can be tested due to the nature of this being an appliance rather than a mainstream distro. + +I'm looking for feedback that I can use to push the vendor into investigating this issue. + +Versions below. 
+ +Many thanks, +Gareth + + + +Host: +Linux 204a-node 3.10.100-ovp-rt110-WR6.0.0.31_preempt-rt #1 SMP Fri +Aug 3 01:59:01 PDT 2018 x86_64 x86_64 x86_64 GNU/Linux + +Qemu: +QEMU emulator version 1.7.2, Copyright (c) 2003-2008 Fabrice Bellard + +Command: +(Vendor identifying information has been removed) + +/usr/bin/qemu-system-x86_64 \ +-name REMOVED \ +-S \ +-machine pc-i440fx-1.7,accel=kvm,usb=off \ +-m 16384 \ +-realtime mlock=on \ +-smp 5,sockets=5,cores=1,threads=1 \ +-uuid 76277b29-3bd4-4dd4-a705-ed34d6449d6d \ +-nographic \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/REMOVED.monitor,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ +-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x17 \ +-netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 \ +-device virtio-net-pci,netdev=hostnet0,id=net0,mac=REMOVED,bus=pci.0,addr=0x11 \ +-netdev tap,ifname=tap1,script=/etc/vehostd/XXX-em3-ifup,id=hostnet1,vhost=on,vhostfd=24 \ +-device virtio-net-pci,netdev=hostnet1,id=net1,mac=REMOVED,bus=pci.0,addr=0x12 \ +-netdev tap,ifname=tap2,script=/etc/vehostd/REMOVED-em4-ifup-SUMMIT,id=hostnet2,vhost=on,vhostfd=25 \ +-device virtio-net-pci,netdev=hostnet2,id=net2,mac=REMOVED,bus=pci.0,addr=0x1c \ +-netdev tap,ifname=tap3,script=/etc/vehostd/REMOVED-em4-re-re-ifup,id=hostnet3,vhost=on,vhostfd=26 \ +-device virtio-net-pci,netdev=hostnet3,id=net3,mac=REMOVED,bus=pci.0,addr=0x1d \ +-chardev pty,id=charserial0 \ +-device isa-serial,chardev=charserial0,id=serial0 \ +-chardev tty,id=charserial1,path=/dev/ttyS1 \ +-device isa-serial,chardev=charserial1,id=serial1 \ +-chardev tty,id=charserial2,path=/dev/ttyS2 \ +-device isa-serial,chardev=charserial2,id=serial2 \ +-chardev tty,id=charserial3,path=/dev/ttyS3 \ +-device isa-serial,chardev=charserial3,id=serial3 \ +-device i6300esb,id=watchdog0,bus=pci.0,addr=0x10 \ 
+-watchdog-action reset \ +-object rng-random,id=rng0,filename=/dev/random \ +-device virtio-rng-pci,rng=rng0,max-bytes=1024,period=2000,bus=pci.0,addr=0x1e \ +-smbios type=0,vendor="INSYDE Corp.",version=REMOVED,date=11/03/2017,release=1.00 \ +-smbios type=1,manufacturer=REMOVED,product=REMOVED,version=REMOVED,serial=VF-NET \ +-device REMOVED-pci,host=0000:1c:00.0 \ +-device kvm-pci-assign,host=0000:00:14.0 \ +-device pci-hgcommdev,vmindex=0,bus=pci.0,addr=0x16 \ +-drive file=/REMOVED/REMOVED-current.img,if=none,id=drive-virtio-disk0,format=raw,cache=directsync,aio=native \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x13,drive=drive-virtio-disk0,id=virtio-disk0,config-wce=off,x-data-plane=on,bootindex=1 \ +-drive file=/REMOVED/REMOVED-var-config.img,if=none,id=drive-virtio-disk1,format=raw,cache=directsync,aio=native \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x15,drive=drive-virtio-disk1,id=virtio-disk1,config-wce=off,x-data-plane=on,bootindex=-1 \ +-drive file=/REMOVED/REMOVED-aux-disk.img,if=none,id=drive-ide0-0-1,format=raw,cache=directsync,discard=unmap \ +-device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=-1 \ +-drive file=/REMOVED/images/0/REMOVED-platform.img,if=none,id=drive-ide1-0-1,format=raw,cache=directsync,discard=unmap \ +-device ide-hd,bus=ide.1,unit=1,drive=drive-ide1-0-1,id=ide1-0-1,bootindex=-1 \ +-msg timestamp=on \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1805256 b/results/classifier/gemma3:12b/performance/1805256 new file mode 100644 index 00000000..149bd551 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1805256 @@ -0,0 +1,27 @@ + +qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images + +On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img frequently hangs (~50% of the time) with this command: + +qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2 + +Where "cloudimg" is a standard qcow2 Ubuntu cloud 
image. This qcow2->qcow2 conversion happens to be something uvtool does every time it fetches images. + +Once hung, attaching gdb gives the following backtrace: + +(gdb) bt +#0 0x0000ffffae4f8154 in __GI_ppoll (fds=0xaaaae8a67dc0, nfds=187650274213760, + timeout=<optimized out>, timeout@entry=0x0, sigmask=0xffffc123b950) + at ../sysdeps/unix/sysv/linux/ppoll.c:39 +#1 0x0000aaaabbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, + __fds=<optimized out>) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77 +#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, + timeout=timeout@entry=-1) at util/qemu-timer.c:322 +#3 0x0000aaaabbefbf80 in os_host_main_loop_wait (timeout=-1) + at util/main-loop.c:233 +#4 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497 +#5 0x0000aaaabbe2aa30 in convert_do_copy (s=0xffffc123bb58) at qemu-img.c:1980 +#6 img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2456 +#7 0x0000aaaabbe2333c in main (argc=7, argv=<optimized out>) at qemu-img.c:4975 + +Reproduced w/ latest QEMU git (@ 53744e0a182) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1811653 b/results/classifier/gemma3:12b/performance/1811653 new file mode 100644 index 00000000..5f93fbc9 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1811653 @@ -0,0 +1,42 @@ + +usbredir is slow when many bulk packets are sent per second + +QEMU Ver: all versions +Client: virt-viewer via SPICE +Guest VM: win7 +Bug description: + Using QEMU 2.1 or later with usbredir: when I redirect a bulk USB device from the virt-viewer client, +the device driver or application in the guest VM sends 50 bulk URBs at a time. 
+ In the VM, monitoring the USB packets with USBlyzer shows these 50 bulk URBs being sent within 1 ms, +but the QEMU VM log shows the following: +========================= +2019-01-14T08:27:26.096809Z qemu-kvm: usb-redir: bulk-out ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.105680Z qemu-kvm: usb-redir: bulk-in status 0 ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.108219Z qemu-kvm: usb-redir: bulk-out ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.116742Z qemu-kvm: usb-redir: bulk-in status 0 ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.119242Z qemu-kvm: usb-redir: bulk-out ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.129851Z qemu-kvm: usb-redir: bulk-in status 0 ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.132349Z qemu-kvm: usb-redir: bulk-out ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.141248Z qemu-kvm: usb-redir: bulk-in status 0 ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.144932Z qemu-kvm: usb-redir: bulk-out ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +2019-01-14T08:27:26.154035Z qemu-kvm: usb-redir: bulk-in status 0 ep 86 stream 0 len 49152 id 2114122112 0x7f0ffa300b40 +========================= + + This shows that the bulk packets are sent and received serially on a single thread; each bulk packet takes 10-20 ms, so all 50 bulk packets take 500~1000 ms, and the bulk URBs in the VM always time out! + + Could the bulk packets be sent and received concurrently, to speed up the bulk URB send and receive? For example: +------------ + bulk-out ep 86 stream 0 len 49152 id xxxx1 + bulk-out ep 86 stream 0 len 49152 id xxxx2 + bulk-out ep 86 stream 0 len 49152 id xxxx3 + bulk-out ep 86 stream 0 len 49152 id xxxx4 + bulk-out ... + bulk-out ep 86 stream 0 len 49152 id xxxx50 +... 
+ bulk-in status 0 ep 86 stream 0 len 49152 id xxxx1 + bulk-in status 0 ep 86 stream 0 len 49152 id xxxx2 + bulk-in status 0 ep 86 stream 0 len 49152 id xxxx3 + bulk-in status 0 ep 86 stream 0 len 49152 id xxxx4 + bulk-in ... + bulk-in status 0 ep 86 stream 0 len 49152 id xxxx50 +------------ \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1812694 b/results/classifier/gemma3:12b/performance/1812694 new file mode 100644 index 00000000..cc90ecbb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1812694 @@ -0,0 +1,6 @@ + +qemu-system-x86_64 version 3.0+ is 20 times slower than version 2.12 + +version 3.0+ is 20 times slower than version 2.12 +I wrote a small 64-bit operating system (SIGMAOS) in which I use the lzma decoder. Unpacking the file takes 20 times longer than in version 2.12. +You can download it from https://drive.google.com/drive/folders/0B_wEiYjzVkC0ZGtkbENENzF1Nms \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1818075 b/results/classifier/gemma3:12b/performance/1818075 new file mode 100644 index 00000000..6378ae9e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1818075 @@ -0,0 +1,54 @@ + +qemu x86 TCG doesn't support AVX insns + +I'm trying to execute code that has been built with -march=skylake -mtune=generic -mavx2 under qemu-user x86-64 with -cpu Skylake-Client. However this code just hangs at 100% CPU. 
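Before tracing, a quick sanity check is to compare against the SIMD feature flags a cpuinfo dump advertises. A minimal parsing sketch — the sample text here is fabricated; on a real host one would read /proc/cpuinfo:

```python
# Sketch: collect the CPU feature flags from /proc/cpuinfo-style text
# and check for the vector extensions the binary was built with.

def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

# Fabricated sample; real use: open("/proc/cpuinfo").read()
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
have = cpu_flags(sample)
missing = {"avx", "avx2", "fma"} - have   # empty set => all present
```

Run against the emulated CPU's view (e.g. inside the guest), this shows immediately whether avx/avx2 are actually exposed, which the TCG warnings further down indicate they are not.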
+ +Adding input tracing shows that it is likely hanging when dealing with an AVX instruction: + +warning: TCG doesn't support requested feature: CPUID.01H:ECX.fma [bit 12] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.pcid [bit 17] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.avx [bit 28] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.f16c [bit 29] +warning: TCG doesn't support requested feature: CPUID.01H:ECX.rdrand [bit 30] +warning: TCG doesn't support requested feature: CPUID.07H:EBX.hle [bit 4] +warning: TCG doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5] +warning: TCG doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10] +warning: TCG doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11] +warning: TCG doesn't support requested feature: CPUID.07H:EBX.rdseed [bit 18] +warning: TCG doesn't support requested feature: CPUID.80000001H:ECX.3dnowprefetch [bit 8] +warning: TCG doesn't support requested feature: CPUID.0DH:EAX.xsavec [bit 1] + +IN: +0x4000b4ef3b: c5 fb 5c ca vsubsd %xmm2, %xmm0, %xmm1 +0x4000b4ef3f: c4 e1 fb 2c d1 vcvttsd2si %xmm1, %rdx +0x4000b4ef44: 4c 31 e2 xorq %r12, %rdx +0x4000b4ef47: 48 85 d2 testq %rdx, %rdx +0x4000b4ef4a: 79 9e jns 0x4000b4eeea + +[ hangs ] + +Attaching a gdb produces this stacktrace: + +(gdb) bt +#0 canonicalize (status=0x55a20ff67a88, parm=0x55a20bb807e0 <float64_params>, part=...) 
+ at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:350 +#1 float64_unpack_canonical (s=0x55a20ff67a88, f=0) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:547 +#2 float64_sub (a=0, b=4890909195324358656, status=0x55a20ff67a88) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/fpu/softfloat.c:776 +#3 0x000055a20baa1949 in helper_subsd (env=<optimized out>, d=0x55a20ff67ad8, s=<optimized out>) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/target/i386/ops_sse.h:623 +#4 0x000055a20cfcfea8 in static_code_gen_buffer () +#5 0x000055a20ba3f764 in cpu_tb_exec (itb=<optimized out>, cpu=0x55a20cea2180 <static_code_gen_buffer+15684720>) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:171 +#6 cpu_loop_exec_tb (tb_exit=<synthetic pointer>, last_tb=<synthetic pointer>, tb=<optimized out>, + cpu=0x55a20cea2180 <static_code_gen_buffer+15684720>) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:615 +#7 cpu_exec (cpu=cpu@entry=0x55a20ff5f4d0) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/accel/tcg/cpu-exec.c:725 +#8 0x000055a20ba6d728 in cpu_loop (env=0x55a20ff67780) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/linux-user/x86_64/../i386/cpu_loop.c:93 +#9 0x000055a20ba049ff in main (argc=<optimized out>, argv=0x7ffc58572868, envp=<optimized out>) + at /data/poky-tmp/master/work/x86_64-linux/qemu-native/3.1.0-r0/qemu-3.1.0/linux-user/main.c:819 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1820 b/results/classifier/gemma3:12b/performance/1820 new file mode 100644 index 00000000..54126d9d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1820 @@ -0,0 +1,11 @@ + +whpx is slower than tcg +Description of problem: +I find whpx much slower 
than tcg, which is rather odd. +Steps to reproduce: +1. Enable Hyper-V +2. Run QEMU with **-accel whpx,kernel-irqchip=off** +Additional information: +my cpu: intel i7 6500u +memory: 8 GB +my gpu: intel graphics 520 hd diff --git a/results/classifier/gemma3:12b/performance/1821 b/results/classifier/gemma3:12b/performance/1821 new file mode 100644 index 00000000..2be09da5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1821 @@ -0,0 +1,54 @@ + +snapshot-save very slow in 8.1-rc2 +Description of problem: +Before commit 813cd61669 ("migration: Use migration_transferred_bytes() to calculate rate_limit"), the above script takes about 1.5 seconds to execute; after the commit, 1 minute 30 seconds. More RAM makes it take longer still. +Steps to reproduce: +1. Execute the script given as the command line above. +Additional information: +Creating the issue here, so it doesn't get lost and is documented. + +The following series by @juan.quintela would've avoided the regression, but it seems it never landed: https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg07971.html + +Logs: + +Before commit 813cd61669 +``` +root@pve8a1 /home/febner/repos/qemu/build # time ~/save-snap.sh +Formatting '/tmp/test.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1073741824 lazy_refcounts=off refcount_bits=16 +{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 8}, "package": "v8.0.0-967-g3db9c05a90-dirty"}, "capabilities": ["oob"]}} +VNC server running on ::1:5900 +{"return": {}} +{"timestamp": {"seconds": 1691572701, "microseconds": 708660}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "save0"}} +{"timestamp": {"seconds": 1691572701, "microseconds": 708731}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "save0"}} +{"return": {}} +{"timestamp": {"seconds": 1691572701, "microseconds": 709239}, "event": "STOP"} +{"timestamp": {"seconds": 1691572702, "microseconds": 939059}, "event": 
"RESUME"} +{"timestamp": {"seconds": 1691572702, "microseconds": 939565}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "save0"}} +{"timestamp": {"seconds": 1691572702, "microseconds": 939605}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "save0"}} +{"timestamp": {"seconds": 1691572702, "microseconds": 939638}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "save0"}} +{"return": {}} +{"timestamp": {"seconds": 1691572702, "microseconds": 939730}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}} +{"timestamp": {"seconds": 1691572702, "microseconds": 941746}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "save0"}} +~/save-snap.sh 1.18s user 0.09s system 85% cpu 1.476 total +``` + +After commit 813cd61669 +``` +root@pve8a1 /home/febner/repos/qemu/build # time ~/save-snap.sh +Formatting '/tmp/test.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1073741824 lazy_refcounts=off refcount_bits=16 +{"QMP": {"version": {"qemu": {"micro": 92, "minor": 0, "major": 8}, "package": "v8.1.0-rc2-102-ga8fc5165aa"}, "capabilities": ["oob"]}} +VNC server running on ::1:5900 +{"return": {}} +{"timestamp": {"seconds": 1691572864, "microseconds": 944026}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "save0"}} +{"timestamp": {"seconds": 1691572864, "microseconds": 944115}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "save0"}} +{"return": {}} +{"timestamp": {"seconds": 1691572864, "microseconds": 944631}, "event": "STOP"} +{"timestamp": {"seconds": 1691572954, "microseconds": 697523}, "event": "RESUME"} +{"timestamp": {"seconds": 1691572954, "microseconds": 697962}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "save0"}} +{"timestamp": {"seconds": 1691572954, "microseconds": 697996}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "save0"}} +{"timestamp": {"seconds": 1691572954, 
"microseconds": 698020}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "save0"}} +{"return": {}} +{"timestamp": {"seconds": 1691572954, "microseconds": 698089}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}} +{"timestamp": {"seconds": 1691572954, "microseconds": 701263}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "save0"}} +~/save-snap.sh 31.81s user 41.69s system 81% cpu 1:30.03 total +``` diff --git a/results/classifier/gemma3:12b/performance/1824053 b/results/classifier/gemma3:12b/performance/1824053 new file mode 100644 index 00000000..842f5bfa --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1824053 @@ -0,0 +1,34 @@ + +Qemu-img convert appears to be stuck on aarch64 host with low probability + +Hi, I found a problem that qemu-img convert appears to be stuck on aarch64 host with low probability. + +The convert command line is "qemu-img convert -f qcow2 -O raw disk.qcow2 disk.raw ". + +The bt is below: + +Thread 2 (Thread 0x40000b776e50 (LWP 27215)): +#0 0x000040000a3f2994 in sigtimedwait () from /lib64/libc.so.6 +#1 0x000040000a39c60c in sigwait () from /lib64/libpthread.so.0 +#2 0x0000aaaaaae82610 in sigwait_compat (opaque=0xaaaac5163b00) at util/compatfd.c:37 +#3 0x0000aaaaaae85038 in qemu_thread_start (args=args@entry=0xaaaac5163b90) at util/qemu_thread_posix.c:496 +#4 0x000040000a3918bc in start_thread () from /lib64/libpthread.so.0 +#5 0x000040000a492b2c in thread_start () from /lib64/libc.so.6 + +Thread 1 (Thread 0x40000b573370 (LWP 27214)): +#0 0x000040000a489020 in ppoll () from /lib64/libc.so.6 +#1 0x0000aaaaaadaefc0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 +#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at qemu_timer.c:391 +#3 0x0000aaaaaadae014 in os_host_main_loop_wait (timeout=<optimized out>) at main_loop.c:272 +#4 0x0000aaaaaadae190 in main_loop_wait 
(nonblocking=<optimized out>) at main_loop.c:534 +#5 0x0000aaaaaad97be0 in convert_do_copy (s=0xffffdc32eb48) at qemu-img.c:1923 +#6 0x0000aaaaaada2d70 in img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2414 +#7 0x0000aaaaaad99ac4 in main (argc=7, argv=<optimized out>) at qemu-img.c:5305 + + +The problem seems very similar to the phenomenon addressed by this patch (https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-ev/0025-aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch), + +which forces a main-loop wakeup with SIGIO. However, that patch was later reverted by this one (http://ovirt.repo.nfrance.com/src/qemu-kvm-ev/kvm-Revert-aio_notify-force-main-loop-wakeup-with-SIGIO-.patch). + +The problem still seems to exist on aarch64 hosts. The QEMU version I used is 2.8.1. The host kernel version is 4.19.28-1.2.108.aarch64. + Do you have any solution to fix it? Thanks for your reply! \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1826393 b/results/classifier/gemma3:12b/performance/1826393 new file mode 100644 index 00000000..d0bdaf81 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1826393 @@ -0,0 +1,81 @@ + +QEMU 3.1.0 stuck waiting for 800ms (5 times slower) in pre-bios phase + +Yesterday I upgraded my laptop from Ubuntu 18.10 to 19.04, which brought QEMU 3.1.0 in place of the earlier QEMU 2.12.0. I have noticed that every time I start QEMU to run OSv, it seems to hang noticeably longer (~1 second) before showing SeaBIOS output. I have tried all kinds of combinations to get rid of that pause and nothing helped. 
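To put a number on such a pre-BIOS pause without eyeballing the console, one can time how long a child process takes to emit its first byte of output. The sketch below substitutes a trivial placeholder command for the real qemu invocation:

```python
# Sketch: measure time-to-first-output of a child process -- a rough
# proxy for "time until the SeaBIOS banner appears".  A placeholder
# command stands in for the actual qemu-system-x86_64 invocation.

import subprocess
import sys
import time

def time_to_first_output(cmd):
    """Return (first_byte, seconds_until_first_byte) for cmd."""
    start = time.monotonic()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    first = proc.stdout.read(1)      # blocks until any output appears
    elapsed = time.monotonic() - start
    proc.stdout.read()               # drain remaining output
    proc.wait()
    return first, elapsed

# Placeholder: a process that just prints a banner immediately.
first, delay = time_to_first_output(
    [sys.executable, "-c", "print('SeaBIOS')"])
```

Comparing the measured delay across QEMU 2.12/3.0.5/3.1.0 binaries would isolate the regression more precisely than `time` on the whole run.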
+ +Here is my start command: +time qemu-system-x86_64 -m 256M -smp 1 -nographic -nodefaults \ + -device virtio-blk-pci,id=blk0,bootindex=0,drive=hd0,scsi=off \ + -drive file=usr.img,if=none,id=hd0,cache=none,aio=thre\ + -enable-kvm \ + -cpu host,+x2apic -chardev stdio,mux=on,id=stdio,signal=off \ + -mon chardev=stdio,mode=readline -device isa-serial,chardev=stdio + +It looks like the qemu process starts, waits almost a second for something, and then prints the SeaBIOS splash screen and continues booting: + +--> waits here +SeaBIOS (version 1.12.0-1) +Booting from Hard Disk..OSv v0.53.0-6-gc8395118 + disk read (real mode): 27.25ms, (+27.25ms) + uncompress lzloader.elf: 46.22ms, (+18.97ms) + TLS initialization: 46.79ms, (+0.57ms) + .init functions: 47.82ms, (+1.03ms) + SMP launched: 48.08ms, (+0.26ms) + VFS initialized: 49.25ms, (+1.17ms) + Network initialized: 49.48ms, (+0.24ms) + pvpanic done: 49.57ms, (+0.08ms) + pci enumerated: 52.42ms, (+2.85ms) + drivers probe: 52.42ms, (+0.00ms) + drivers loaded: 55.33ms, (+2.90ms) + ROFS mounted: 56.37ms, (+1.04ms) + Total time: 56.37ms, (+0.00ms) +Found optarg +dev etc hello libenviron.so libvdso.so proc tmp tools usr + +real 0m0.935s +user 0m0.426s +sys 0m0.490s + +With version 2.12.0 I used to see real below 200ms. So it seems QEMU has slowed down 5 times. + +I ran strace -tt against it and noticed a pause here: +... 
+07:31:41.848579 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 0 +07:31:41.848604 futex(0x55c4a2ff6308, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 +07:31:41.848649 ioctl(10, KVM_SET_PIT2, 0x7ffdd272d1f0) = 0 +07:31:41.848674 ioctl(9, KVM_CHECK_EXTENSION, KVM_CAP_KVMCLOCK_CTRL) = 1 +07:31:41.848699 ioctl(10, KVM_SET_CLOCK, 0x7ffdd272d230) = 0 +07:31:41.848724 futex(0x55c4a49a9a9c, FUTEX_WAKE_PRIVATE, 2147483647) = 1 +07:31:41.848747 getpid() = 5162 +07:31:41.848769 tgkill(5162, 5166, SIGUSR1) = 0 +07:31:41.848791 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 0 +07:31:41.848814 futex(0x55c4a49a9a98, FUTEX_WAKE_PRIVATE, 2147483647) = 1 +07:31:41.848837 getpid() = 5162 +07:31:41.848858 tgkill(5162, 5166, SIGUSR1) = 0 +07:31:41.848889 write(8, "\1\0\0\0\0\0\0\0", 8) = 8 +07:31:41.848919 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 1 +07:31:41.848943 ppoll([{fd=0, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, +{fd=8, events=POLLIN}], 5, {tv_sec=0, tv_nsec=0}, NULL, 8) = 1 ([{fd=8, revents=POLLIN}], left {tv_sec=0, tv_nsec=0 +}) +07:31:41.849003 futex(0x55c4a2fd34c0, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) +07:31:41.849031 read(8, "\5\0\0\0\0\0\0\0", 16) = 8 +07:31:41.849064 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 0 +07:31:41.849086 ppoll([{fd=0, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, +{fd=8, events=POLLIN}], 5, {tv_sec=0, tv_nsec=984624000}, NULL, 8) = 1 ([{fd=7, revents=POLLIN}], left {tv_sec=0, t +v_nsec=190532609}) + +--> waits for almost 800ms + +07:31:42.643272 futex(0x55c4a2fd34c0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0 +07:31:42.643522 read(7, "\1\0\0\0\0\0\0\0", 512) = 8 +07:31:42.643625 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 1 +07:31:42.643646 ppoll([{fd=0, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, +{fd=8, events=POLLIN}], 5, {tv_sec=0, tv_nsec=190066000}, NULL, 8) = 2 ([{fd=4, revents=POLLIN}, 
{fd=8, revents=POL +LIN}], left {tv_sec=0, tv_nsec=189909632}) +07:31:42.643836 futex(0x55c4a2fd34c0, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) +07:31:42.643859 read(8, "\2\0\0\0\0\0\0\0", 16) = 8 +07:31:42.643880 futex(0x55c4a2fd34c0, FUTEX_WAKE_PRIVATE, 1) = 1 + +... + +When I run the same command using QEMU 3.0.5, which I built directly from source and still happen to have on the same machine, I see a total boot time of around 200 ms. It seems like a regression. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1826401 b/results/classifier/gemma3:12b/performance/1826401 new file mode 100644 index 00000000..230d3a4e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1826401 @@ -0,0 +1,39 @@ + +qemu-system-aarch64 has high CPU usage on Windows + +Running qemu-system-aarch64 leads to high CPU consumption on Windows 10. + +Tested with qemu: 4.0.0-rc4 & 3.1.0 & 2.11.0 + +Command: qemu_start_command = [ + qemu-system-aarch64, + "-pidfile", + target_path + "/qemu" + str(instance) + ".pid", + "-machine", + "virt", + "-cpu", + "cortex-a57", + "-nographic", + "-smp", + "2", + "-m", + "2048", + "-kernel", + kernel_path, + "--append", + "console=ttyAMA0 root=/dev/vda2 rw ipx=" + qemu_instance_ip + "/64 net.ifnames=0 biosdevname=0", + "-drive", + "file=" + qemu_instance_img_path + ",if=none,id=blk", + "-device", + "virtio-blk-device,drive=blk", + "-netdev", + "socket,id=mynet0,udp=127.0.0.1:2000,localaddr=127.0.0.1:" + qemu_instance_port, + "-device", + "virtio-net-device,netdev=mynet0", + "-serial", + "file:" + target_path + "/qemu" + str(instance) + ".log" + ] + +*The CPU consumption is ~70%. +*No acceleration is used. +*This CPU consumption is obtained only by running the above command. No workload on the guest OS. 
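The report above assembles the qemu argv as a Python list. A self-contained version of that construction — all paths, the instance number, and the port are hypothetical placeholders, and the list is trimmed to the essential options — makes the command easy to unit-test before launching:

```python
# Sketch: build the qemu-system-aarch64 argv as in the report above.
# Paths, instance number and port are hypothetical placeholders.

def build_qemu_cmd(kernel_path, img_path, target_path, instance, port):
    """Assemble a trimmed qemu-system-aarch64 command line."""
    return [
        "qemu-system-aarch64",
        "-pidfile", f"{target_path}/qemu{instance}.pid",
        "-machine", "virt",
        "-cpu", "cortex-a57",
        "-nographic",
        "-smp", "2",
        "-m", "2048",
        "-kernel", kernel_path,
        "-drive", f"file={img_path},if=none,id=blk",
        "-device", "virtio-blk-device,drive=blk",
        "-netdev", f"socket,id=mynet0,udp=127.0.0.1:2000,"
                   f"localaddr=127.0.0.1:{port}",
        "-device", "virtio-net-device,netdev=mynet0",
        "-serial", f"file:{target_path}/qemu{instance}.log",
    ]

cmd = build_qemu_cmd("/tmp/Image", "/tmp/rootfs.img", "/tmp", 1, "2001")
```

The resulting list can be passed directly to `subprocess.Popen(cmd)`, which avoids shell-quoting pitfalls when reproducing the report's setup.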
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1829696 b/results/classifier/gemma3:12b/performance/1829696 new file mode 100644 index 00000000..a8724fac --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1829696 @@ -0,0 +1,260 @@ + +qemu-kvm takes 100% CPU when running redhat/centos 7.6 guest VM OS + +Description +=========== +When running a RedHat or CentOS 7.6 guest OS in a VM, +the CPU usage inside the VM is very low (100% idle), but on the host, +qemu-kvm reports 100% CPU usage. + +After searching some related bug reports, +I suspect that it is due to the clock settings in the VM's domain XML. +My OpenStack cluster uses the default clock settings as follows: + <clock offset='utc'> + <timer name='rtc' tickpolicy='catchup'/> + <timer name='pit' tickpolicy='delay'/> + <timer name='hpet' present='no'/> + </clock> +And in this report, https://bugs.launchpad.net/qemu/+bug/1174654 +it is claimed that <timer name='rtc' track='guest'/> can solve the 100% CPU usage problem when using a Windows guest OS image, +but I made some tests and the solution does not work for me. + + +Steps to reproduce +================== +* Create a VM using a CentOS or RedHat 7.6 image +* Use the sar tool inside the VM and on the host to check the CPU usage, and compare them + + +Expected result +=============== +The host's CPU usage report should match the VM's CPU usage + + +Actual result +============= +The VM's CPU is 100% idle, while the host's CPU is 100% busy + + +Environment +=========== +1. Exact version of OpenStack you are running. +# rpm -qa | grep nova +openstack-nova-compute-13.1.2-1.el7.noarch +python2-novaclient-3.3.2-1.el7.noarch +python-nova-13.1.2-1.el7.noarch +openstack-nova-common-13.1.2-1.el7.noarch + +2. Which hypervisor did you use? + (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...) + What's the version of that? 
+# libvirtd -V +libvirtd (libvirt) 3.9.0 + +# /usr/libexec/qemu-kvm --version +QEMU emulator version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.6.1), Copyright (c) 2003-2008 Fabrice Bellard + + +Logs & Configs +============== +The VM xml: +<domain type='kvm' id='29'> + <name>instance-00005022</name> + <uuid>7f5a66a5-****-****-****-75dec****bbb</uuid> + <metadata> + <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0"> + <nova:package version="13.1.2-1.el7"/> + <nova:name>*******</nova:name> + <nova:creationTime>2019-05-20 03:08:46</nova:creationTime> + <nova:flavor name="2d2dab36-****-****-****-246e9****110"> + <nova:memory>2048</nova:memory> + <nova:disk>12</nova:disk> + <nova:swap>2048</nova:swap> + <nova:ephemeral>0</nova:ephemeral> + <nova:vcpus>1</nova:vcpus> + </nova:flavor> + <nova:owner> + <nova:user uuid="********************">****</nova:user> + <nova:project uuid="********************">****</nova:project> + </nova:owner> + <nova:root type="image" uuid="4496a420-****-****-****-b50f****ada3"/> + </nova:instance> + </metadata> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static'>1</vcpu> + <cputune> + <shares>1024</shares> + <vcpupin vcpu='0' cpuset='27'/> + <emulatorpin cpuset='27'/> + </cputune> + <numatune> + <memory mode='strict' nodeset='1'/> + <memnode cellid='0' mode='strict' nodeset='1'/> + </numatune> + <resource> + <partition>/machine</partition> + </resource> + <sysinfo type='smbios'> + <system> + <entry name='manufacturer'>Fedora Project</entry> + <entry name='product'>OpenStack Nova</entry> + <entry name='version'>13.1.2-1.el7</entry> + <entry name='serial'>64ab0e89-****-****-****-05312ef66983</entry> + <entry name='uuid'>7f5a66a5-****-****-****-75decaf82bbb</entry> + <entry name='family'>Virtual Machine</entry> + </system> + </sysinfo> + <os> + <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type> + <boot dev='hd'/> + <smbios mode='sysinfo'/> + </os> + <features> + 
<acpi/> + <apic/> + </features> + <cpu mode='custom' match='exact' check='full'> + <model fallback='forbid'>IvyBridge</model> + <topology sockets='1' cores='1' threads='1'/> + <feature policy='require' name='hypervisor'/> + <feature policy='require' name='arat'/> + <feature policy='require' name='xsaveopt'/> + <numa> + <cell id='0' cpus='0' memory='2097152' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'> + <timer name='pit' tickpolicy='delay'/> + <timer name='rtc' tickpolicy='catchup'/> + <timer name='hpet' present='no'/> + </clock> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/qemu-kvm</emulator> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' cache='none'/> + <source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk'/> + <backingStore/> + <target dev='vda' bus='virtio'/> + <alias name='virtio-disk0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' cache='none'/> + <source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk.swap'/> + <backingStore/> + <target dev='vdb' bus='virtio'/> + <alias name='virtio-disk1'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='qemu' type='raw' cache='none'/> + <source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk.config'/> + <backingStore/> + <target dev='hdd' bus='ide'/> + <readonly/> + <alias name='ide0-1-1'/> + <address type='drive' controller='0' bus='1' target='0' unit='1'/> + </disk> + <controller type='usb' index='0' model='piix3-uhci'> + <alias name='usb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'> + <alias name='pci.0'/> + </controller> + <controller type='ide' 
index='0'> + <alias name='ide'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <interface type='bridge'> + <mac address='fa:16:3e:a6:ea:4f'/> + <source bridge='brq52c66dc3-64'/> + <bandwidth> + <inbound average='102400'/> + <outbound average='102400'/> + </bandwidth> + <target dev='tapa29e94e5-42'/> + <model type='virtio'/> + <alias name='net0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <serial type='file'> + <source path='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/console.log'/> + <target type='isa-serial' port='0'> + <model name='isa-serial'/> + </target> + <alias name='serial0'/> + </serial> + <serial type='pty'> + <source path='/dev/pts/10'/> + <target type='isa-serial' port='1'> + <model name='isa-serial'/> + </target> + <alias name='serial1'/> + </serial> + <console type='file'> + <source path='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/console.log'/> + <target type='serial' port='0'/> + <alias name='serial0'/> + </console> + <input type='tablet' bus='usb'> + <alias name='input0'/> + <address type='usb' bus='0' port='1'/> + </input> + <input type='mouse' bus='ps2'> + <alias name='input1'/> + </input> + <input type='keyboard' bus='ps2'> + <alias name='input2'/> + </input> + <graphics type='vnc' port='5910' autoport='yes' listen='0.0.0.0' keymap='en-us'> + <listen type='address' address='0.0.0.0'/> + </graphics> + <video> + <model type='cirrus' vram='16384' heads='1' primary='yes'/> + <alias name='video0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </video> + <memballoon model='virtio'> + <stats period='10'/> + <alias name='balloon0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </memballoon> + </devices> + <seclabel type='dynamic' model='dac' relabel='yes'> + <label>+107:+107</label> + <imagelabel>+107:+107</imagelabel> + </seclabel> +</domain> + +CPU Usage Report 
inside VM: +# sar -u -P 0 1 5 +Linux 3.10.0-957.el7.x86_64 (******) 05/20/2019 _x86_64_ (1 CPU) + +11:34:40 AM CPU %user %nice %system %iowait %steal %idle +11:34:41 AM 0 0.00 0.00 0.00 0.00 0.00 100.00 +11:34:42 AM 0 0.00 0.00 0.00 0.00 0.00 100.00 +11:34:43 AM 0 0.00 0.00 0.00 0.00 0.00 100.00 +11:34:44 AM 0 0.00 0.00 0.00 0.00 0.00 100.00 +11:34:45 AM 0 0.00 0.00 0.00 0.00 0.00 100.00 +Average: 0 0.00 0.00 0.00 0.00 0.00 100.00 + +CPU Usage Report ON HOST(the vm's cpu is pinned on host's no.27 physic cpu): +# sar -u -P 27 1 5 +Linux 3.10.0-862.el7.x86_64 (******) 05/20/2019 _x86_64_ (48 CPU) + +11:34:40 AM CPU %user %nice %system %iowait %steal %idle +11:34:41 AM 27 100.00 0.00 0.00 0.00 0.00 0.00 +11:34:42 AM 27 100.00 0.00 0.00 0.00 0.00 0.00 +11:34:43 AM 27 100.00 0.00 0.00 0.00 0.00 0.00 +11:34:44 AM 27 100.00 0.00 0.00 0.00 0.00 0.00 +11:34:45 AM 27 100.00 0.00 0.00 0.00 0.00 0.00 +Average: 27 100.00 0.00 0.00 0.00 0.00 0.00 + +clocksource inside VM: +# cat /sys/devices/system/clocksource/clocksource0/current_clocksource +kvm_clock \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1830 b/results/classifier/gemma3:12b/performance/1830 new file mode 100644 index 00000000..49cb7361 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1830 @@ -0,0 +1,27 @@ + +command hangs in CentOS 7 arm64 container with Ubuntu 22 amd64 host +Description of problem: +The command hangs in the container, taking over the CPU: + +``` +$ docker run -it centos:7 +[root@42e655bf3d60 /]# LD_DEBUG=all /lib64/ld-2.17.so --list /usr/bin/true & +[1] 74 +[root@42e655bf3d60 /]# 74: file=/usr/bin/true [0]; generating link map + +[root@42e655bf3d60 /]# ps -e -o pid,ppid,etime,time,state,args + PID PPID ELAPSED TIME S COMMAND + 1 0 34:59 00:00:00 S /usr/libexec/qemu-binfmt/aarch64-binfmt-P /bin/bash /bin/bash + 74 1 03:16 00:03:13 R /usr/libexec/qemu-binfmt/aarch64-binfmt-P /lib64/ld-2.17.so /lib64/ld-2.17.so + 80 1 4-19:34:01 00:00:00 R ps -e -o 
pid,ppid,etime,time,state,args +[root@42e655bf3d60 /]# +``` +Steps to reproduce: +1. Start container +2. Run `/lib64/ld-2.17.so --list /usr/bin/true` +Additional information: +1. The problem is not observed in an Ubuntu 20.04 host system performing the same scenario. +2. My team build environment has amd64 native architecture hardware. I ran a similar scenario on an AWS arm64 native machine (QEMU is not needed) and the command works fine in the container. +3. My team builds several Linux images daily - about a dozen amd64 and eight arm64. This is the only image that's causing us this problem. +4. I built trace-cmd but when I tried to start a trace it told me `No events enabled with kvm`. +5. I built qemu-8.1.0-rc3 and saw the same behavior, but I don't think `/usr/libexec/qemu-binfmt/aarch64-binfmt-P` was replaced with a new version, so the old version may still have been used for my container. diff --git a/results/classifier/gemma3:12b/performance/1831750 b/results/classifier/gemma3:12b/performance/1831750 new file mode 100644 index 00000000..c820a074 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1831750 @@ -0,0 +1,71 @@ + +virtual machine cpu soft lockup when qemu attach disk + +Hi, I found a problem where the virtual machine CPU soft-locks up when I attach a disk to the vm in the case that + +the backend storage network has a large delay or IO pressure is too high.
+ +1) The disk xml which I attached is: + + <disk type='block' device='lun' rawio='yes'> + <driver name='qemu' type='raw' cache='none' io='native'/> + <source dev='/dev/mapper/360022a11000c1e0a0787c23a000001cb'/> + <backingStore/> + <target dev='sdb' bus='scsi'/> + <alias name='scsi0-0-1-0'/> + <address type='drive' controller='0' bus='0' target='1' unit='0'/> + </disk> + +2) The bt of qemu main thread: + +#0 0x0000ffff9d78402c in pread64 () from /lib64/libpthread.so.0 +#1 0x0000aaaace3357d8 in pread64 (__offset=0, __nbytes=4096, __buf=0xaaaad47a5200, __fd=202) at /usr/include/bits/unistd.h:99 +#2 raw_is_io_aligned (fd=fd@entry=202, buf=buf@entry=0xaaaad47a5200, len=len@entry=4096) at block/raw_posix.c:294 +#3 0x0000aaaace33597c in raw_probe_alignment (bs=bs@entry=0xaaaad32ea920, fd=202, errp=errp@entry=0xfffffef7a330) at block/raw_posix.c:349 +#4 0x0000aaaace335a48 in raw_refresh_limits (bs=0xaaaad32ea920, errp=0xfffffef7a330) at block/raw_posix.c:811 +#5 0x0000aaaace3404b0 in bdrv_refresh_limits (bs=0xaaaad32ea920, errp=0xfffffef7a330, errp@entry=0xfffffef7a360) at block/io.c:122 +#6 0x0000aaaace340504 in bdrv_refresh_limits (bs=bs@entry=0xaaaad09ce800, errp=errp@entry=0xfffffef7a3b0) at block/io.c:97 +#7 0x0000aaaace2eb9f0 in bdrv_open_common (bs=bs@entry=0xaaaad09ce800, file=file@entry=0xaaaad0e89800, options=<optimized out>, errp=errp@entry=0xfffffef7a450) +at block.c:1194 +#8 0x0000aaaace2eedec in bdrv_open_inherit (filename=<optimized out>, filename@entry=0xaaaad25f92d0 "/dev/mapper/36384c4f100630193359db7a80000011d", +reference=reference@entry=0x0, options=<optimized out>, options@entry=0xaaaad3d0f4b0, flags=<optimized out>, flags@entry=128, parent=parent@entry=0x0, +child_role=child_role@entry=0x0, errp=errp@entry=0xfffffef7a710) at block.c:1895 +#9 0x0000aaaace2ef510 in bdrv_open (filename=filename@entry=0xaaaad25f92d0 "/dev/mapper/36384c4f100630193359db7a80000011d", reference=reference@entry=0x0, +options=options@entry=0xaaaad3d0f4b0, 
flags=flags@entry=128, errp=errp@entry=0xfffffef7a710) at block.c:1979 +#10 0x0000aaaace331ef0 in blk_new_open (filename=filename@entry=0xaaaad25f92d0 "/dev/mapper/36384c4f100630193359db7a80000011d", reference=reference@entry=0x0, +options=options@entry=0xaaaad3d0f4b0, flags=128, errp=errp@entry=0xfffffef7a710) at block/block_backend.c:213 +#11 0x0000aaaace0da1f4 in blockdev_init (file=file@entry=0xaaaad25f92d0 "/dev/mapper/36384c4f100630193359db7a80000011d", bs_opts=bs_opts@entry=0xaaaad3d0f4b0, +errp=errp@entry=0xfffffef7a710) at blockdev.c:603 +#12 0x0000aaaace0dc478 in drive_new (all_opts=all_opts@entry=0xaaaad4dc31d0, block_default_type=<optimized out>) at blockdev.c:1116 +#13 0x0000aaaace0e3ee0 in add_init_drive ( +optstr=optstr@entry=0xaaaad0872ec0 "file=/dev/mapper/36384c4f100630193359db7a80000011d,format=raw,if=none,id=drive-scsi0-0-0-3,cache=none,aio=native") +at device_hotplug.c:46 +#14 0x0000aaaace0e3f78 in hmp_drive_add (mon=0xfffffef7a810, qdict=0xaaaad0c8f000) at device_hotplug.c:67 +#15 0x0000aaaacdf7d688 in handle_hmp_command (mon=0xfffffef7a810, cmdline=<optimized out>) at /usr/src/debug/qemu-kvm-2.8.1/monitor.c:3199 +#16 0x0000aaaacdf7d778 in qmp_human_monitor_command ( +command_line=0xaaaacfc8e3c0 "drive_add dummy file=/dev/mapper/36384c4f100630193359db7a80000011d,format=raw,if=none,id=drive-scsi0-0-0-3,cache=none,aio=native", +has_cpu_index=false, cpu_index=0, errp=errp@entry=0xfffffef7a968) at /usr/src/debug/qemu-kvm-2.8.1/monitor.c:660 +#17 0x0000aaaace0fdb30 in qmp_marshal_human_monitor_command (args=<optimized out>, ret=0xfffffef7a9e0, errp=0xfffffef7a9d8) at qmp-marshal.c:2223 +#18 0x0000aaaace3b6ad0 in do_qmp_dispatch (request=<optimized out>, errp=0xfffffef7aa20, errp@entry=0xfffffef7aa40) at qapi/qmp_dispatch.c:115 +#19 0x0000aaaace3b6d58 in qmp_dispatch (request=<optimized out>) at qapi/qmp_dispatch.c:142 +#20 0x0000aaaacdf79398 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at 
/usr/src/debug/qemu-kvm-2.8.1/monitor.c:4010 +#21 0x0000aaaace3bd6c0 in json_message_process_token (lexer=0xaaaacf834c80, input=<optimized out>, type=JSON_RCURLY, x=214, y=274) at qobject/json_streamer.c:105 +#22 0x0000aaaace3f3d4c in json_lexer_feed_char (lexer=lexer@entry=0xaaaacf834c80, ch=<optimized out>, flush=flush@entry=false) at qobject/json_lexer.c:319 +#23 0x0000aaaace3f3e6c in json_lexer_feed (lexer=0xaaaacf834c80, buffer=<optimized out>, size=<optimized out>) at qobject/json_lexer.c:369 +#24 0x0000aaaacdf77c64 in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-kvm-2.8.1/monitor.c:4040 +#25 0x0000aaaace0eab18 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0xaaaacf90b280) at qemu_char.c:3260 +#26 0x0000ffff9dadf200 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 +#27 0x0000aaaace3c4a00 in glib_pollfds_poll () at util/main_loop.c:230 +#28 0x0000aaaace3c4a88 in os_host_main_loop_wait (timeout=<optimized out>) at util/main_loop.c:278 +#29 0x0000aaaace3c4bf0 in main_loop_wait (nonblocking=<optimized out>) at util/main_loop.c:534 +#30 0x0000aaaace0f5d08 in main_loop () at vl.c:2120 +#31 0x0000aaaacdf3a770 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:5017 + + +From the bt we can see that, when handling a QMP command such as drive_add, the qemu main thread locks qemu_global_mutex and does a pread in raw_probe_alignment. pread is a synchronous operation. If the backend storage network has a large delay or IO pressure is too high, the pread operation will not return for a long time, so the vcpu thread cannot acquire qemu_global_mutex and cannot be scheduled for a long time. So a virtual machine cpu soft lockup happens.
+ + +I think the qemu main thread should not hold qemu_global_mutex for a long time when handling QMP commands that involve synchronous IO operations such as pread, ioctl, etc. + +Do you have any solutions or good ideas about it? Thanks for your reply! \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1836558 b/results/classifier/gemma3:12b/performance/1836558 new file mode 100644 index 00000000..143fdff8 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1836558 @@ -0,0 +1,49 @@ + +Qemu-ppc Memory leak creating threads + +When creating c++ threads (with c++ std::thread), the resulting binary has memory leaks when running with qemu-ppc. + +E.g. the following c++ program, when compiled with gcc, consumes more and more memory while running under qemu-ppc. (It does not leak memory when compiled for Intel, and there are also no leaks when running the same binary on real powerpc CPU hardware.) + +(Note I used function getCurrentRSS to show available memory, see https://stackoverflow.com/questions/669438/how-to-get-memory-usage-at-runtime-using-c; calls commented out here) + +Compiler: powerpc-linux-gnu-g++ (Debian 8.3.0-2) 8.3.0 (but same problem with older g++ compilers even 4.9) +Os: Debian 10.0 ( Buster) (but same problem seen on Debian 9/stretch) +qemu: qemu-ppc version 3.1.50 + + + +--- + +#include <iostream> +#include <thread> +#include <chrono> + + +using namespace std::chrono_literals; + +// Create/run and join 100 threads. +void Fun100() +{ +// auto b4 = getCurrentRSS(); +// std::cout << getCurrentRSS() << std::endl; + for(int n = 0; n < 100; n++) + { + std::thread t([] + { + std::this_thread::sleep_for( 10ms ); + }); +// std::cout << n << ' ' << getCurrentRSS() << std::endl; + t.join(); + } + std::this_thread::sleep_for( 500ms ); // to give OS some time to wipe memory...
+// auto after = getCurrentRSS(); +// std::cout << b4 << ' ' << after << std::endl; +} + + +int main(int, char **) +{ + Fun100(); + Fun100(); // memory used keeps increasing +} \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1840250 b/results/classifier/gemma3:12b/performance/1840250 new file mode 100644 index 00000000..f863c082 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1840250 @@ -0,0 +1,16 @@ + +'make -j1 docker-test-build' uses more than one job + +version: v4.1.0-rc5 + +Run 'make -j1 docker-test-build', wait a few minutes, and various containers get instantiated. + +$ make -j1 docker-test-build 2>&1 > /dev/null + +On another terminal: + +$ docker ps +CONTAINER ID IMAGE COMMAND CREATED STATUS +62264a2d777a qemu:debian-mips-cross "/var/tmp/qemu/run t…" 10 minutes ago Up 10 minutes +80807c47d0df qemu:debian-armel-cross "/var/tmp/qemu/run t…" 10 minutes ago Up 10 minutes +06027b5dfd4a qemu:debian-amd64 "/var/tmp/qemu/run t…" 10 minutes ago Up 10 minutes \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1842787 b/results/classifier/gemma3:12b/performance/1842787 new file mode 100644 index 00000000..1c8fe8f4 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1842787 @@ -0,0 +1,82 @@ + +Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk + +Up to date Arch Linux on host and guest. linux 5.2.11. QEMU 4.1.0. Full command line at bottom. + +Host gives QEMU two thin LVM volumes. The first is the root filesystem, and the second is for heavy I/O, on a Samsung 970 Evo 1TB. + +When maxing out the I/O on the second virtual block device using virtio-blk, I often get a "lockup" in about an hour or two. On the advice of iggy in IRC, I switched over to virtio-scsi. It ran perfectly for a few days, but then "locked up" in the same way. + +By "lockup", I mean writes to the second virtual block device permanently hang.
I can read files from it, but even "touch foo" never returns, cannot be "kill -9"'ed, and is stuck in uninterruptible sleep. + +When this happens, writes to the first virtual block device with the root filesystem are fine, so the O/S itself remains responsive. + +The second virtual block device uses BTRFS. But, I have also tried XFS and reproduced the issue. + +In guest, when this starts, it starts logging "task X blocked for more than Y seconds". Below is an example of one of these. At this point, anything that is or does in the future write to this block device gets stuck in uninterruptible sleep. + +----- + +INFO: task kcompactd:232 blocked for more than 860 seconds. + Not tainted 5.2.11-1 #1 +"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. +kcompactd0 D 0 232 2 0x80004000 +Call Trace: + ? __schedule+0x27f/0x6d0 + schedule+0x3d/0xc0 + io_schedule+0x12/0x40 + __lock_page+0x14a/0x250 + ? add_to_page_cache_lru+0xe0/0xe0 + migrate_pages+0x803/0xb70 + ? isolate_migratepages_block+0x9f0/0x9f0 + ? __reset_isolation_suitable+0x110/0x110 + compact_zone+0x6a2/0xd30 + kcompactd_do_work+0x134/0x260 + ? kvm_clock_read+0x14/0x30 + ? kvm_sched_clock_read+0x5/0x10 + kcompactd+0xd3/0x220 + ? wait_woken+0x80/0x80 + kthread+0xfd/0x130 + ? kcompactd_do_work+0x260/0x260 + ? kthread_park+0x80/0x80 + ret_from_fork+0x35/0x40 + +----- + +In guest, there are no other dmesg/journalctl entries other than "task...blocked". + +On host, there are no dmesg/journalctl entries whatsoever. Everything else in host continues to work fine, including other QEMU VM's on the same underlying SSD (but obviously different lvm volumes.) + +I understand there might not be enough to go on here, and I also understand it's possible this isn't a QEMU bug. Happy to run given commands or patches to help diagnose what's going on here. + +I'm now running a custom compiled QEMU 4.1.0, with debug symbols, so I can get a meaningful backtrace from the host point of view.
+ +----- + +/usr/bin/qemu-system-x86_64 + -name arch,process=qemu:arch + -no-user-config + -nodefaults + -nographic + -uuid 0528162b-2371-41d5-b8da-233fe61b6458 + -pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid + -machine q35,accel=kvm,vmport=off,dump-guest-core=off + -cpu SandyBridge-IBRS + -smp cpus=24,cores=12,threads=1,sockets=2 + -m 24G + -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd + -drive if=pflash,format=raw,readonly,file=/var/qemu/0528162b-2371-41d5-b8da-233fe61b6458.fd + -monitor telnet:localhost:8000,server,nowait,nodelay + -spice unix,addr=/tmp/0528162b-2371-41d5-b8da-233fe61b6458.sock,disable-ticketing + -device ioh3420,id=pcie.1,bus=pcie.0,slot=0 + -device virtio-vga,bus=pcie.1,addr=0 + -usbdevice tablet + -netdev bridge,id=network0,br=br0 + -device virtio-net-pci,netdev=network0,mac=02:37:de:79:19:09,bus=pcie.0,addr=3 + -device virtio-scsi-pci,id=scsi1 + -drive driver=raw,node-name=hd0,file=/dev/lvm/arch_root,if=none,discard=unmap + -device scsi-hd,drive=hd0,bootindex=1 + -drive driver=raw,node-name=hd1,file=/dev/lvm/arch_nvme,if=none,discard=unmap + -device scsi-hd,drive=hd1,bootindex=2 + +----- \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1844 b/results/classifier/gemma3:12b/performance/1844 new file mode 100644 index 00000000..65013e54 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1844 @@ -0,0 +1,23 @@ + +qemu process memory usage greater than windows guest memory usage +Description of problem: +The Windows Guest internal memory usage is low, but the qemu process memory usage on the host is very high. The linux guest has no such issue. Is there any way to trigger the host to reclaim virtual machine memory? +Steps to reproduce: +1. install a windows guest with 128GB of memory and start it. + +2. When the machine is stable, the VM internal memory usage is low, but the qemu process memory usage on the host is very high.
+ +3. on host, use "free -g" to query; the memory used is also very high + +4. after migration or hibernation it can recover, but I want to know: is there any way to trigger the host to reclaim virtual machine memory? + + +host: + + + + + +guest: + + diff --git a/results/classifier/gemma3:12b/performance/1847525 b/results/classifier/gemma3:12b/performance/1847525 new file mode 100644 index 00000000..d64d03d1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1847525 @@ -0,0 +1,83 @@ + +qemu-system-i386 eats a lot of cpu after just a few hours, with sdl,gl=on + +I already sent this email to <email address hidden>, but I can't see it arriving in the archives, so here is a copy. + +Hello, all! + +I use qemu-system-i386/qemu-system_x86_64 for rebuilding Slax-like live cd/dvd. +Usually guests (with various self-compiled kernels and X stack with kde3 on top of them) +boot up normally, but if I leave them to run in GUI mode for a few hours - the qemu process on host +starts to eat more and more cpu for itself - more noticeable if I set the host cpu to the lowest possible +frequency via trayfreq applet (1400Mhz in my case). + +The boot line is a bit complicated, but I really prefer to have sound and usb inside the VM. +qemu-system-i386 -cdrom /dev/shm/CDROM-4.4.194_5.iso -m 1.9G -enable-kvm -soundhw es1370 -smp 2 -display sdl,gl=on -usb -cpu host -rtc clock=vm + +rtc clock=vm was taken from https://bugs.launchpad.net/qemu/+bug/1174654 but apparently it is not helping. +After just 3 hours of uptime (copied line from 'top' on host) + +31943 guest 20 0 2412m 791m 38m R 51 6.7 66:36.51 qemu-system-i38 + +I use Xorg 1.19.7 on host, with mesa git/nouveau as GL driver. But my card has not a very big amount of VRAM - only 384Mb. +Maybe this limitation is playing some role .. but the 'end-user' result was that after 1-2 days of guest uptime I ran into a completely frozen guest +(maybe when qemu was hitting 100% one-core usage on host, some internal timer just made the guest kernel too upset/froze?
+ I was sleeping or doing other things on host for all this time, with VM just supposedly running at another virtual desktop - +in KDE3 + built-in compositor ....) + +I wonder if more mainstream desktop users (on GNOME, Xfce, etc) and/or users of other distros (I use self-re-compiled Slackware) +actually can see same problem? + +qemu-system-i386 --version +QEMU emulator version 4.1.50 (v4.1.0-1188-gc6f5012ba5-dirty) +but I saw same behavior for quite some time .. just never reported it in hope it will go away. + +cat /proc/cpuinfo +processor : 0 +vendor_id : AuthenticAMD +cpu family : 21 +model : 2 +model name : AMD FX(tm)-4300 Quad-Core Processor +stepping : 0 +microcode : 0x6000852 +cpu MHz : 1399.977 +cache size : 2048 KB +physical id : 0 +siblings : 4 +core id : 0 +cpu cores : 2 +apicid : 16 +initial apicid : 0 +fpu : yes +fpu_exception : yes +cpuid level : 13 +wp : yes +flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold +bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass +bogomips : 7600.06 +TLB size : 1536 4K pages +clflush size : 64 +cache_alignment : 64 +address sizes : 48 bits physical, 48 bits virtual +power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro + +[and 3x more of the same, for 3 remaining cores] + +Gcc is Slackware 14.2's gcc 5.5.0, but I saw this with 4.9.2 too. +This might be 32-bit host problem. 
But maybe just no-one tried to run qemu with a GUI guest for literally days? + +Host kernel is + uname -a +Linux slax 5.1.12-x64 #1 SMP PREEMPT Wed Jun 19 12:31:05 MSK 2019 x86_64 AMD FX(tm)-4300 Quad-Core Processor AuthenticAMD GNU/Linux + +I was trying newish 5.3.2 but my compilation was not as stable as this one +(I tend to change a few things, like max cpu count, preemption mode, numa support .... +for a more distribution-like, yet most stable and performant for me kernel) + +Kernel world is moving fast, so I'll try to recompile new 5.3.x too .... + + +I guess I should provide perf/profiler output, but for this I need to recompile qemu. +I'll try to come back with more details soon. + +Thanks for your attention and possible feedback! \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1847861 b/results/classifier/gemma3:12b/performance/1847861 new file mode 100644 index 00000000..99a418c2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1847861 @@ -0,0 +1,31 @@ + +Guest stuttering under high disk IO (virtio) + +Performing io intensive tasks on virtualized Windows causes the system to visually stutter. I can often reproduce the problem by running fio on windows: + +fio --randrepeat=1 --ioengine=windowsaio --direct=1 --gtod_reduce=1 --name=test --filename=\\.\PhysicalDrive0 --bs=4k --iodepth=128 --size=4G --readwrite=randread + +While the fio command is running, moving the mouse pointer will be laggy. The stuttering does not appear with iodepth <= 32. The stuttering also manifests while playing games; the music and video pause for a fraction of a second in a playable but disturbing way.
+ +Here are my system specs: + +Host OS: archlinux +Guest OS: Windows 10 Enterprise +qemu version: qemu-git 8:v4.1.0.r1378.g98b2e3c9ab-1 (from AUR, compiled with -march=native) +CPU: AMD Ryzen Threadripper 1900X 8-Core Processor +Huge Pages: vm.nr_hugepages=4128 +Disk: nvme type=raw, io=threads bus=virtio +GPU (passthrough): Radeon RX 570 + +Here are some fio test results on my windows guest: + +[size=512M,iodepth=1 -> min=30k,avg=31k,stddev=508] +[size=2G,iodepth=8 -> min=203k,avg=207k,stddev=2.3k] +[size=2G,iodepth=16 -> min=320k,avg=330k,stddev=4.3k] +[size=4G,iodepth=32 -> min=300k,avg=310k,stddev=4.8k] +[size=4G,iodepth=64 -> min=278k,avg=366k,stddev=68.6k] -> STUTTER +[size=4G,iodepth=64 -> min=358k,avg=428k,stddev=52.6k] -> STUTTER +[size=4G,iodepth=128 -> min=92k,avg=217k,stddev=185k] -> STUTTER +[size=4G,iodepth=128 -> min=241k,avg=257k,stddev=14k] -> same config as above, but no stuttering + +The min and avg values are the bandwidth values reported in KB/s by fio. You can see that, when the stuttering occurs, the standard deviation is high and the minimum bandwidth is way below the average. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1851095 b/results/classifier/gemma3:12b/performance/1851095 new file mode 100644 index 00000000..b51e4dd3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1851095 @@ -0,0 +1,4 @@ + +[feature request] awareness of instructions that are well emulated + +While qemu's scalar emulation tends to be excellent, qemu's SIMD emulation tends to be incorrect (except for arm64 from x86_64). Until these code paths are audited, which is probably a large job, it would be nice if qemu knew its emulation of this class of instructions was not very good, and thus it would give up on finding these instructions if a "careful" operation is passed.
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1853042 b/results/classifier/gemma3:12b/performance/1853042 new file mode 100644 index 00000000..b9e0ae11 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1853042 @@ -0,0 +1,82 @@ + +Ubuntu 18.04 - vm disk i/o performance issue when using file system passthrough + +== Comment: #0 - I-HSIN CHUNG <email address hidden> - 2019-11-15 12:35:05 == +---Problem Description--- +Ubuntu 18.04 - vm disk i/o performance issue when using file system passthrough + +Contact Information = <email address hidden> + +---uname output--- +Linux css-host-22 4.15.0-1039-ibm-gt #41-Ubuntu SMP Wed Oct 2 10:52:25 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux (host) Linux ubuntu 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:08:54 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux (vm) + +Machine Type = p9/ac922 + +---Debugger--- +A debugger is not configured + +---Steps to Reproduce--- + 1. Env: Ubuntu 18.04.3 LTS; Genesis kernel linux-ibm-gt - 4.15.0-1039.41; qemu 1:2.11+dfsg-1ubuntu7.18 ibmcloud0.3 or 1:2.11+dfsg-1ubuntu7.19 ibm-cloud1; fio-3.15-4-g029b + +2. execute run.sh to run fio benchmark: + +2.1) run.sh: +#!/bin/bash + +for bs in 4k 16m +do + +for rwmixread in 0 25 50 75 100 +do + +for numjobs in 1 4 16 64 +do +echo ./fio j1.txt --bs=$bs --rwmixread=$rwmixread --numjobs=$numjobs +./fio j1.txt --bs=$bs --rwmixread=$rwmixread --numjobs=$numjobs + +done +done +done + +2.2) j1.txt: + +[global] +direct=1 +rw=randrw +refill_buffers +norandommap +randrepeat=0 +ioengine=libaio +iodepth=64 +runtime=60 + +allow_mounted_write=1 + +[job2] +new_group +filename=/dev/vdb +filesize=1000g +cpus_allowed=0-63 +numa_cpu_nodes=0 +numa_mem_policy=bind:0 + +3.
performance profile: +device passthrough performance for the nvme: +http://css-host-22.watson.ibm.com/rundir/nvme_vm_perf_vm/20191011-112156/html/#/measurement/vm/ubuntu (I/O bandwidth achieved inside VM in GB/s range) + +file system passthrough +http://css-host-22.watson.ibm.com/rundir/nvme_vm_perf_vm/20191106-123613/html/#/measurement/vm/ubuntu (I/o bandwidth achieved inside the VM is very low) + +desired performance when using file system passthrough should be similar to the device passthrough + +Userspace tool common name: fio + +The userspace tool has the following bit modes: should be 64 bit + +Userspace rpm: ? + +Userspace tool obtained from project website: na + +*Additional Instructions for <email address hidden>: +-Post a private note with access information to the machine that the bug is occuring on. +-Attach ltrace and strace of userspace application. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1857 b/results/classifier/gemma3:12b/performance/1857 new file mode 100644 index 00000000..dc542eb7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1857 @@ -0,0 +1,53 @@ + +Major qemu-aarch64 performance slowdown since commit 59b6b42cd3 +Description of problem: +I have observed a major performance slowdown between qemu 8.0.0 and 8.1.0: + + +qemu 8.0.0: 0.8s + +qemu 8.1.0: 6.8s + + +After bisecting the commits between 8.0.0 and 8.1.0, the offending commit is 59b6b42cd3: + + +commit 59b6b42cd3446862567637f3a7ab31d69c9bef51 +Author: Richard Henderson <richard.henderson@linaro.org> +Date: Tue Jun 6 10:19:39 2023 +0100 + + target/arm: Enable FEAT_LSE2 for -cpu max + + Reviewed-by: Peter Maydell <peter.maydell@linaro.org> + Signed-off-by: Richard Henderson <richard.henderson@linaro.org> + Message-id: 20230530191438.411344-21-richard.henderson@linaro.org + Signed-off-by: Peter Maydell <peter.maydell@linaro.org> + + +Reverting the commit in latest master fixes the problem: + +qemu 8.0.0: 0.8s + +qemu 8.1.0: 6.8s + 
+qemu master + revert 59b6b42cd3: 0.8s + +Alternatively, specify `-cpu cortex-a35` to disable LSE2: + +`time ./qemu-aarch64 -cpu cortex-a35`: 0.8s + +`time ./qemu-aarch64`: 6.77s + +The slowdown is also observed when running qemu-aarch64 on aarch64 machine: + +`time ./qemu-aarch64 /usr/bin/node -e 1`: 2.91s + +`time ./qemu-aarch64 -cpu cortex-a35 /usr/bin/node -e 1`: 1.77s + +The slowdown on x86_64 machine is small: 362ms -> 378ms. +Steps to reproduce: +1. Run `time ./qemu-aarch64 node-aarch64 -e 1` (node-aarch64 is NodeJS v16 built for AArch64) +2. Using qemu master, the output says `6.77s` +3. Using qemu master with commit 59b6b42cd3 reverted, the output says `0.8s` +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/1860053 b/results/classifier/gemma3:12b/performance/1860053 new file mode 100644 index 00000000..350e85f2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1860053 @@ -0,0 +1,21 @@ + +Possible lack of precision when calling clock_gettime via vDSO on user mode ppc64le + +Occurs on QEMU v4.2.0 run on docker (via the qemu-user-static:v4.2.0-2 image) on an AMD64 Ubuntu 18.04.3 LTS machine provided by travis-ci.org. + +From golang's https://github.com/golang/go/issues/36592: + +It was discovered that golang's time.NewTicker() and time.Sleep() malfunction when a compiled application was run via QEMU's ppc64le emulator in user mode. + +The methods did not malfunction on actual PowerPC hardware or when the same golang application was compiled for golang's arm, arm64 or 386 targets and was run via user mode QEMU on the same system. + +Curiously, the methods also worked when the program was compiled under go 1.11, but do malfunction in go 1.12 and 1.13. + +It was identified that the change in behaviour was most likely attributable to golang switching to using vDSO for calling clock_gettime() on PowerPC 64 architectures in 1.12.
I.E: +https://github.com/golang/go/commit/dbd8af74723d2c98cbdcc70f7e2801f69b57ac5b + +We therefore suspect there may be a bug in QEMU's user-mode emulation of ppc64le as relates to vDSO calls to clock_gettime(). + +The nature of the malfunction of time.NewTicker() and time.Sleep() is such that sleeps or ticks with a granularity of less than one second do not appear to be possible (they all revert to 1 second sleeps/ticks). Could it be that the nanoseconds field of clock_gettime() is getting lost in the vDSO version but not in the syscall? Or some other issue calling these methods via vDSO? + +Thanks in advance. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1861161 b/results/classifier/gemma3:12b/performance/1861161 new file mode 100644 index 00000000..b3080d01 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1861161 @@ -0,0 +1,60 @@ + +qemu-arm-static stuck with 100% CPU when cross-compiling emacs + +Hello, + +I'm trying to build multi-arch docker images for https://hub.docker.com/r/silex/emacs. + +Here is the machine I'm building on: + +root@ubuntu-4gb-fsn1-1:~# lsb_release -a +No LSB modules are available. +Distributor ID: Ubuntu +Description: Ubuntu 18.04.3 LTS +Release: 18.04 +Codename: bionic +root@ubuntu-4gb-fsn1-1:~# uname -a +Linux ubuntu-4gb-fsn1-1 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux + +Whenever I try to build the following alpine Dockerfile https://gitlab.com/Silex777/docker-emacs/blob/master/26.3/alpine/3.9/dev/Dockerfile with this command: + +$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes +$ docker build --pull -t test --platform arm . 
+ +It builds fine until this: + +root@ubuntu-4gb-fsn1-1:~# ps -ef | grep qemu +root 26473 26465 99 14:26 pts/0 01:59:58 /usr/bin/qemu-arm-static ../src/bootstrap-emacs -batch --no-site-file --no-site-lisp --eval (setq load-prefer-newer t) -f batch-byte-compile emacs-lisp/macroexp.el + +This is supposed to take a few seconds, but in practice it takes 100% CPU and never ends. When I strace the process I see this: + +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 120 +tgkill(5875, 5878, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 120 +tgkill(5875, 5878, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 120 +tgkill(5875, 5878, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 120 +tgkill(5875, 5878, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) +getdents64(5, /* 0 entries */, 2048) = 0 +lseek(5, 0, SEEK_SET) = 0 +getdents64(5, /* 5 entries */, 2048) = 120 +tgkill(5875, 5878, SIGRT_2) = -1 EAGAIN (Resource temporarily unavailable) + +It happens with all the QEMU versions I tested: +- 2.11.1 (OS version) +- 4.1.1-1 (from multiarch/qemu-user-static:4.1.1-1) +- 4.2.0-2 (from multiarch/qemu-user-static) + +Any ideas of what I could do to debug it further? 
+ +Kind regards, +Philippe \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1862874 b/results/classifier/gemma3:12b/performance/1862874 new file mode 100644 index 00000000..1eee5c74 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1862874 @@ -0,0 +1,67 @@ + +java may get stuck for a long time in system mode with "-cpu max" + +Bug Description: +Run "java -version" in the guest VM; java may get stuck for a long time (several hours) and then recover. + +Steps to reproduce: +1. Launch the VM with the attached simple script: launch.sh +2. Execute "java -version" and then print "date" in a loop + while : + do + /home/bot/jdk/bin/java -version + date + done +3. A long time gap will be observed: possibly > 24 hours. + +Technical details: +* host: x86_64 Linux 4.15.0-70-generic +* qemu v4.2.0 +* java: tried two versions: openjdk-11-jre-headless or compiled java-13 +* command-line: (See details in launch.sh) +/home/bot/qemu/qemu-build/qemu-4.2.0/binaries/bin/qemu-system-x86_64 \ + -drive "file=${img},format=qcow2" \ + -drive "file=${user_data},format=raw" \ + -cpu max \ + -m 24G \ + -serial mon:stdio \ + -smp 8 \ + -nographic \ +; + +* Observed via a java core dump generated by "kill -SIGSEGV" while java was stuck: +Different pthreads are blocked on their own condition variables: + + Id Target Id Frame + 1 Thread 0x7f48a041a080 (LWP 22470) __GI_raise (sig=sig@entry=6) + at ../sysdeps/unix/sysv/linux/raise.c:51 + 2 Thread 0x7f487197d700 (LWP 22473) 0x00007f489f5c49f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f48980197c0) + at ../sysdeps/unix/sysv/linux/futex-internal.h:88 + 3 Thread 0x7f4861b89700 (LWP 22483) 0x00007f489f5c4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f4861b88960, expected=0, + futex_word=0x7f489801b084) + at ../sysdeps/unix/sysv/linux/futex-internal.h:142 + 4 Thread 0x7f4861e8c700 (LWP 22480) 0x00007f489f5c76d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0,
futex_word=0x7f48980107c0) + at ../sysdeps/unix/sysv/linux/futex-internal.h:205 + 5 Thread 0x7f4861c8a700 (LWP 22482) 0x00007f489f5c4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f4861c89800, expected=0, + futex_word=0x7f489801ed44) + at ../sysdeps/unix/sysv/linux/futex-internal.h:142 + 6 Thread 0x7f48a0418700 (LWP 22471) 0x00007f4880b13200 in ?? () + 7 Thread 0x7f48703ea700 (LWP 22478) 0x00007f489f5c49f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f489801dfc0) + at ../sysdeps/unix/sysv/linux/futex-internal.h:88 + 8 Thread 0x7f48702e9700 (LWP 22479) 0x00007f489f5c49f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f489838cd84) + at ../sysdeps/unix/sysv/linux/futex-internal.h:88 + 9 Thread 0x7f4870f71700 (LWP 22475) 0x00007f489f5c49f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f489801a300) + at ../sysdeps/unix/sysv/linux/futex-internal.h:88 + 10 Thread 0x7f487187b700 (LWP 22474) 0x00007f489f5c76d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f48980cf770) + at ../sysdeps/unix/sysv/linux/futex-internal.h:205 + 11 Thread 0x7f4871a7f700 (LWP 22472) 0x00007f489f5c76d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f489809ba30) + at ../sysdeps/unix/sysv/linux/futex-internal.h:205 + 12 Thread 0x7f4861d8b700 (LWP 22481) 0x00007f489f5c4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f4861d8a680, expected=0, + futex_word=0x7f489801ed44) + at ../sysdeps/unix/sysv/linux/futex-internal.h:142 + 13 Thread 0x7f48704ec700 (LWP 22477) 0x00007f489f5c4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f48704eb910, expected=0, + futex_word=0x7f489801d120) + at ../sysdeps/unix/sysv/linux/futex-internal.h:142 + 14 Thread 0x7f4870e6f700 (LWP 22476) 0x00007f489f5c4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, 
reltime=0x7f4870e6eb20, expected=0, + futex_word=0x7f489828abd0) + at ../sysdeps/unix/sysv/linux/futex-internal.h:142 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1867786 b/results/classifier/gemma3:12b/performance/1867786 new file mode 100644 index 00000000..fabaa868 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1867786 @@ -0,0 +1,28 @@ + +Qemu PPC64 freezes with multi-core CPU + +I installed Debian 10 on a Qemu PPC64 VM running with the following flags: + +qemu-system-ppc64 \ + -nographic -nodefaults -monitor pty -serial stdio \ + -M pseries -cpu POWER9 -smp cores=4,threads=1 -m 4G \ + -drive file=debian-ppc64el-qemu.qcow2,format=qcow2,if=virtio \ + -netdev user,id=network01,$ports -device rtl8139,netdev=network01 \ + + +Within a couple of minutes, on any operation (it could be a Go application or simply changing the hostname with hostnamectl), the VM freezes and prints this on the console: + +``` +root@debian:~# [ 950.428255] rcu: INFO: rcu_sched self-detected stall on CPU +[ 950.428453] rcu: 3-....: (5318 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=2544 +[ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] + +Message from syslogd@debian at Mar 17 11:35:24 ... + kernel:[ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] +[ 980.110018] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-... } 5276 jiffies s: 93 root: 0x8/. +[ 980.111177] rcu: blocking rcu_node structures: +[ 1013.442268] rcu: INFO: rcu_sched self-detected stall on CPU +[ 1013.442365] rcu: 3-....: (21071 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=9342 +``` + +If I change to 1 core on the command line, I haven't seen these freezes. 
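To compare runs with different -smp values, the captured serial console output can be scanned for the stall markers shown in the log above. A rough sketch (illustrative only; the marker strings are taken from the log, the helper name is an assumption):

```python
# Sketch: scan a captured guest console log for the stall markers seen above.
# Assumption: the serial console output was redirected to a text file per run.

STALL_MARKERS = (
    "rcu_sched self-detected stall",
    "soft lockup",
)

def count_stalls(console_text: str) -> int:
    """Count console lines containing any of the known stall markers."""
    return sum(
        1
        for line in console_text.splitlines()
        if any(marker in line for marker in STALL_MARKERS)
    )

if __name__ == "__main__":
    sample = (
        "[  950.428255] rcu: INFO: rcu_sched self-detected stall on CPU\n"
        "[  976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462]\n"
        "[  980.111177] rcu: blocking rcu_node structures:\n"
    )
    print(count_stalls(sample))
```

Running it over logs from -smp 1 and -smp cores=4 boots would make the correlation with core count concrete.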
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1868055 b/results/classifier/gemma3:12b/performance/1868055 new file mode 100644 index 00000000..9b5c0e01 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1868055 @@ -0,0 +1,76 @@ + +cannot run golang app with docker (version 17.09.1-ce) and qemu-arm (version 2.7) when disabling core 0 + +Hello! +I have found that sometimes a simple Go application does not work. +I am using docker + qemu-arm + go (for armv7l). + +The version info is below. + +root@VDBS1535:~# docker -v +Docker version 17.09.1-ce, build 19e2cf6 + +bash-3.2# qemu-arm --version +qemu-arm version 2.7.0, Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers + +$ go version +go version go1.12.6 linux/arm +$ go env +GOARCH="arm" +GOBIN="" +GOCACHE="/home/quickbuild/.cache/go-build" +GOEXE="" +GOFLAGS="" +GOHOSTARCH="arm" +GOHOSTOS="linux" +GOOS="linux" +GOPATH="/home/quickbuild/go" +GOPROXY="" +GORACE="" +GOROOT="/usr/lib/golang" +GOTMPDIR="" +GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_arm" +GCCGO="gccgo" +GOARM="7" +CC="gcc" +CXX="g++" +CGO_ENABLED="1" +GOMOD="" +CGO_CFLAGS="-g -O2" +CGO_CPPFLAGS="" +CGO_CXXFLAGS="-g -O2" +CGO_FFLAGS="-g -O2" +CGO_LDFLAGS="-g -O2" +PKG_CONFIG="pkg-config" +GOGCCFLAGS="-fPIC -marm -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build242285369=/tmp/go-build -gno-record-gcc-switches" + +This issue occurs only when I disable core 0 using the command below. +Please check the "--cpuset-cpus=1-55" option. + +sudo docker run --privileged -d -i -t --cpuset-cpus=1-55 --mount type=bind,source="/home/dw83kim/mnt",destination="/mnt" --network host --name="ubuntu_core1" ubuntu:xenial-20200212 + + +This is what I have tested in the environment above. + +package main +func main(){ + for i:=0; i<1000; i++ { + println("Hello world") + } +} + +This is one of the error logs I have faced (sometimes, not always). 
+ +bash-3.2# go run test.go +fatal error: schedule: holding locks +panic during panic +SIGILL: illegal instruction +PC=0xc9ec4c m=3 sigcode=2 + +goroutine 122 [runnable]: +qemu: uncaught target signal 11 (Segmentation fault) - core dumped +Segmentation fault (core dumped) +bash-3.2# + +Please check it. +Thanks in advance. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1870477 b/results/classifier/gemma3:12b/performance/1870477 new file mode 100644 index 00000000..b7508a65 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1870477 @@ -0,0 +1,34 @@ + +qemu-arm hangs when running a golang test + + +1. Environment: +Ubuntu 16.04.5 X86_64 +qemu-arm version 4.2.0 +go version go 1.14.1 linux/arm + +2. Summary: +Sometimes the "go run test.go" command hangs + + +3. Reproduction Method (Attempts: 500 Occurred: 1 ): Frequency: 1/500 + + +test.go +====================================== +package main +import "fmt" +func main(){ + for i:=0; i<1000; i++ { + fmt.Printf("[%d] Hello world\n", i) + } +} +====================================== + +I ran the "go run test.go" command 500 times in the qemu-arm environment. +qemu hangs at around run 200~300. + +Strace log attached. + +Please check. +Thanks \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1873032 b/results/classifier/gemma3:12b/performance/1873032 new file mode 100644 index 00000000..f33941e5 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1873032 @@ -0,0 +1,17 @@ + +After upgrading qemu to 5.0.0-0.3.rc2.fc33, the virtual machine with Windows 10 starts to work very slowly after a while + +Description of problem: + +After upgrading qemu to 5.0.0-0.3.rc2.fc33, the virtual machine with Windows 10 starts to work very slowly after a while + +I created the virtual machine with Windows 10 with the following config: +- 1 CPU +- 2GB RAM +- With network access + +I launch a web browser there with flash content. 
+And usually, the system (Windows 10) does not work there for more than an hour. +When the system starts to work very slowly, it doesn't respond to the "Reboot" and "Shut Down" commands; only "Force Reset" and "Force Off" work. But when I reboot the system with "Force Reset" it usually gets stuck at boot at the Windows splash screen. https://imgur.com/yGyacDG + +The last version of qemu which does not contain this issue is 5.0.0-0.2.rc0.fc33 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1875762 b/results/classifier/gemma3:12b/performance/1875762 new file mode 100644 index 00000000..bfd41f50 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1875762 @@ -0,0 +1,12 @@ + +Poor disk performance on sparse VMDKs + +Found in QEMU 4.1, and reproduced on master. + +QEMU appears to suffer from remarkably poor disk performance when writing to sparse-extent VMDKs. Of course it's to be expected that allocation takes time and sparse VMDKs perform worse than allocated VMDKs, but surely not on the orders of magnitude I'm observing. On my system, the fully allocated write speeds are approximately 1.5GB/s, while the fully sparse write speeds can be as low as 10MB/s. I've noticed that adding "cache unsafe" reduces the issue dramatically, bringing speeds up to around 750MB/s. I don't know if this is still slow or if this perhaps reveals a problem with the default caching method. + +To reproduce the issue I've attached two 4GiB VMDKs. Both are completely empty and both are technically sparse-extent VMDKs, but one is 100% pre-allocated and the other is 100% unallocated. If you attach these VMDKs as second and third disks to an Ubuntu VM running on QEMU (with KVM) and measure their write performance (using dd to write to /dev/sdb and /dev/sdc for example) the difference in write speeds is clear. 
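The dd-style measurement described above can be approximated with a short script. This is only a sketch, using ordinary local files as stand-ins for the guest's /dev/sdb and /dev/sdc; it is not a substitute for benchmarking the actual attached VMDKs:

```python
# Sketch of the sequential-write measurement described above, against a
# local file standing in for a guest block device (hypothetical target).
import os
import tempfile
import time

def write_speed_mb_s(path: str, total_mb: int = 16, block_kb: int = 1024) -> float:
    """Sequentially write total_mb MiB in block_kb KiB blocks, fsync at the
    end, and return the achieved throughput in MiB/s."""
    block = b"\0" * (block_kb * 1024)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    return total_mb / elapsed if elapsed > 0 else float("inf")

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        target = tmp.name
    try:
        print(f"{write_speed_mb_s(target):.1f} MiB/s")
    finally:
        os.unlink(target)
```

Run once against each attached disk and compare the two throughput figures.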
+ +For what it's worth, the flags I'm using that relate to the VMDK are as follows: + +`-drive if=none,file=sparse.vmdk,id=hd0,format=vmdk -device virtio-scsi-pci,id=scsi -device scsi-hd,drive=hd0` \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1877137 b/results/classifier/gemma3:12b/performance/1877137 new file mode 100644 index 00000000..b9ced62b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1877137 @@ -0,0 +1,17 @@ + +32-bit Arm emulation spins at 100% during emacs build + +This test case exposes a QEMU bug when crossbuilding to arm32. The bug is only +exposed on amd64 architecture or aarch64 hosts that can *only* execute +64 bit applications. + +Usage: + +./setup.sh # installs docker and tweaks sysctl +./qemu.sh # register qemu so you are able to run containers from other +architectures +./build.sh # try to build the docker image. Process should get stuck +in a 100% CPU loop after a while + +Originally reported by, and test case developed by, +Philippe Vaucher. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1877716 b/results/classifier/gemma3:12b/performance/1877716 new file mode 100644 index 00000000..7fbba82e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1877716 @@ -0,0 +1,14 @@ + +Win10 guest unusable after a few minutes + +On Arch Linux, the recent qemu package update seems to misbehave on some systems. In my case, my Windows 10 guest runs fine for around 5 minutes and then start to get really sluggish, even unresponsive. It needs to be forced off. I could reproduce this on a minimal VM with no passthrough, although my current testing setup involves an nvme pcie passthrough. 
+ +I bisected it to the following commit which rapidly starts to run sluggishly on my setup: +https://github.com/qemu/qemu/commit/73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf + +I've ran the previous commit ( https://github.com/qemu/qemu/commit/b321051cf48ccc2d3d832af111d688f2282f089b ) for the entire night without an issue so far. + +I believe this might be a duplicate of https://bugs.launchpad.net/qemu/+bug/1873032 , although I'm not sure. + +Linux cc 5.6.10-arch1-1 #1 SMP PREEMPT Sat, 02 May 2020 19:11:54 +0000 x86_64 GNU/Linux +AMD Ryzen 7 2700X Eight-Core Processor \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1880355 b/results/classifier/gemma3:12b/performance/1880355 new file mode 100644 index 00000000..87154c6a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1880355 @@ -0,0 +1,30 @@ + +Length restrictions for fw_cfg_dma_transfer? + +For me, this takes close to 3 minutes at 100% CPU: +echo "outl 0x518 0x9596ffff" | ./i386-softmmu/qemu-system-i386 -M q35 -m 32 -nographic -accel qtest -monitor none -serial none -qtest stdio + +#0 phys_page_find (d=0x606000035d80, addr=136728041144404) at /exec.c:338 +#1 address_space_lookup_region (d=0x606000035d80, addr=136728041144404, resolve_subpage=true) at /exec.c:363 +#2 address_space_translate_internal (d=0x606000035d80, addr=136728041144404, xlat=0x7fff1fc0d070, plen=0x7fff1fc0d090, resolve_subpage=true) at /exec.c:382 +#3 flatview_do_translate (fv=0x606000035d20, addr=136728041144404, xlat=0x7fff1fc0d070, plen_out=0x7fff1fc0d090, page_mask_out=0x0, is_write=true, is_mmio=true, target_as=0x7fff1fc0ce10, attrs=...) + pment/qemu/exec.c:520 +#4 flatview_translate (fv=0x606000035d20, addr=136728041144404, xlat=0x7fff1fc0d070, plen=0x7fff1fc0d090, is_write=true, attrs=...) 
at /exec.c:586 +#5 flatview_write_continue (fv=0x606000035d20, addr=136728041144404, attrs=..., ptr=0x7fff1fc0d660, len=172, addr1=136728041144400, l=172, mr=0x557fd54e77e0 <io_mem_unassigned>) + pment/qemu/exec.c:3160 +#6 flatview_write (fv=0x606000035d20, addr=136728041144064, attrs=..., buf=0x7fff1fc0d660, len=512) at /exec.c:3177 +#7 address_space_write (as=0x557fd54e7a00 <address_space_memory>, addr=136728041144064, attrs=..., buf=0x7fff1fc0d660, len=512) at /exec.c:3271 +#8 dma_memory_set (as=0x557fd54e7a00 <address_space_memory>, addr=136728041144064, c=0 '\000', len=1378422272) at /dma-helpers.c:31 +#9 fw_cfg_dma_transfer (s=0x61a000001e80) at /hw/nvram/fw_cfg.c:400 +#10 fw_cfg_dma_mem_write (opaque=0x61a000001e80, addr=4, value=4294940309, size=4) at /hw/nvram/fw_cfg.c:467 +#11 memory_region_write_accessor (mr=0x61a000002200, addr=4, value=0x7fff1fc0e3d0, size=4, shift=0, mask=4294967295, attrs=...) at /memory.c:483 +#12 access_with_adjusted_size (addr=4, value=0x7fff1fc0e3d0, size=4, access_size_min=1, access_size_max=8, access_fn=0x557fd2288c80 <memory_region_write_accessor>, mr=0x61a000002200, attrs=...) + pment/qemu/memory.c:539 +#13 memory_region_dispatch_write (mr=0x61a000002200, addr=4, data=4294940309, op=MO_32, attrs=...) at /memory.c:1476 +#14 flatview_write_continue (fv=0x606000035f00, addr=1304, attrs=..., ptr=0x7fff1fc0ec40, len=4, addr1=4, l=4, mr=0x61a000002200) at /exec.c:3137 +#15 flatview_write (fv=0x606000035f00, addr=1304, attrs=..., buf=0x7fff1fc0ec40, len=4) at /exec.c:3177 +#16 address_space_write (as=0x557fd54e7bc0 <address_space_io>, addr=1304, attrs=..., buf=0x7fff1fc0ec40, len=4) at /exec.c:3271 + + +It looks like fw_cfg_dma_transfer gets the address(136728041144064) and length(1378422272) for the read from the value provided as input 4294940309 (0xFFFF9695) which lands in pcbios. Should there be any limits on the length of guest-memory that fw_cfg should populate? 
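One shape such a limit could take is clamping the guest-supplied length to the bytes remaining in the currently selected fw_cfg item. A rough sketch of the idea (Python for illustration only; fw_cfg itself is C code inside QEMU, and these names are invented):

```python
def clamp_dma_len(requested_len: int, item_size: int, item_offset: int) -> int:
    """Clamp a guest-supplied DMA transfer length to the bytes remaining in
    the currently selected fw_cfg item (invented helper, not QEMU code)."""
    remaining = max(item_size - item_offset, 0)
    return min(requested_len, remaining)

# The fuzzed input above produced a ~1.3 GB length; with a typical fw_cfg
# item of a few KiB, a clamp like this would reduce it drastically.
print(clamp_dma_len(1378422272, item_size=4096, item_offset=0))
```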
+Found by libfuzzer \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1880722 b/results/classifier/gemma3:12b/performance/1880722 new file mode 100644 index 00000000..30179753 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1880722 @@ -0,0 +1,15 @@ + +Problems related to checking page crossing in use_goto_tb() + +The discussion that led to this bug discovery can be found in this +mailing list thread: +https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg05426.html + +A workaround for this problem would be to check for page crossings for +both the user and system modes in the use_goto_tb() function across +targets. Some targets like "hppa" already implement this fix but others +don't. + +To solve the root cause of this problem, the linux-user/mmap.c should +be fixed to do all the invalidations required. By doing so, up to 6.93% +performance improvements will be achieved. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1882 b/results/classifier/gemma3:12b/performance/1882 new file mode 100644 index 00000000..356a85c6 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1882 @@ -0,0 +1,10 @@ + +Test suite hangs on FreeBSD 13.2 +Description of problem: +The 80 minute timeout for the x64-freebsd-13-build CI job is insufficient: +https://gitlab.com/qemu-project/qemu/-/jobs/5058610599 + +``` +672/832 qemu:block / io-qcow2-041 OK 39.77s 1 subtests passed +Timed out! +``` diff --git a/results/classifier/gemma3:12b/performance/1882497 b/results/classifier/gemma3:12b/performance/1882497 new file mode 100644 index 00000000..ef5ac397 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1882497 @@ -0,0 +1,8 @@ + +Missing 'cmp' utility makes build take 10 times as long + +I have been doing some work cross compiling qemu for Windows using a minimal Fedora container. 
Recently I started hitting some timeouts on the CI service and noticed a build of all targets was going over 1 hour. + +It seems like the 'cmp' utility from diffutils is used somewhere in the process and if it's missing, either a configure or a make gets run way too many times - I'll try to pull logs from the CI system at some stage soon. + +Could a warning or error be added if cmp is missing? \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1883400 b/results/classifier/gemma3:12b/performance/1883400 new file mode 100644 index 00000000..93a79d09 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1883400 @@ -0,0 +1,19 @@ + +Windows 10 extremely slow and unresponsive + +Hi, + +Fedora 32, x64 +qemu-5.0.0-2.fc32.x86_64 + +https://www.microsoft.com/en-us/software-download/windows10ISO +Win10_2004_English_x64.iso + +Windows 10 is excruciatingly slow since upgrading to 5.0.0-2.fc32. Disabling your repo and downgrading to 2:4.2.0-7.fc32 (the package in the Fedora repo) corrects the issue. + +You can duplicate this off of the Windows 10 ISO (see above) and do not even have to install Windows 10 itself. + +Please fix, + +Many thanks, +-T \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1886602 b/results/classifier/gemma3:12b/performance/1886602 new file mode 100644 index 00000000..082c88e1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1886602 @@ -0,0 +1,13 @@ + +Windows 10 very slow with OVMF + +Debian Buster + +Kernel 4.19.0-9-amd64 +qemu-kvm 1:3.1+dfsg-8+deb10u5 +ovmf 0~20181115.85588389-3+deb10u1 + +Machine: Thinkpad T470, i7-7500u, 20GB RAM +VM: 4 CPUs, 8GB RAM, Broadwell-noTSX CPU Model + +Windows 10, under this VM, seems to be exceedingly slow with all operations. This is a clean install with very few services running. Task Manager can take 30% CPU looking at an idle system. 
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1888714 b/results/classifier/gemma3:12b/performance/1888714 new file mode 100644 index 00000000..5d5f66fb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1888714 @@ -0,0 +1,56 @@ + +Memory Leak in hpet_timer results in unusable machine + +Fair warning: this might be specific to QTest (specifically its clock_step) command. This reproducer only works with -accel qtest. Build with --enable-sanitizers to exit once we hit 1G RSS. + +export ASAN_OPTIONS=hard_rss_limit_mb=1000 +cat << EOF | ./i386-softmmu/qemu-system-i386 -nographic \ +-nodefaults -qtest stdio -accel qtest +writeq 0xfed0000e 0x15151515151515f1 +clock_step +clock_step +clock_step +clock_step +writeq 0xfed00100 0x5e90c5be00ff5e9e +writeq 0xfed00109 0xffffe0ff5cfec0ff +clock_step +EOF + +On my machine it takes around 10 seconds to reach the RSS limit. + +Unfortunately, I can't find a way to tell ASAN to log each malloc to figure out whats going on, but running the original fuzzing test case with the libfuzzer -trace_malloc=2 flag, I found that the allocations happen here: + +MALLOC[130968] 0x60300069ac90 32 + #0 0x55fa3f615851 in __sanitizer_print_stack_trace (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x2683851) + #1 0x55fa3f55fe88 in fuzzer::PrintStackTrace() (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x25cde88) + #2 0x55fa3f5447d6 in fuzzer::MallocHook(void const volatile*, unsigned long) (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x25b27d6) + #3 0x55fa3f61bbb7 in __sanitizer::RunMallocHooks(void const*, unsigned long) (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x2689bb7) + #4 0x55fa3f596d75 in __asan::Allocator::Allocate(unsigned long, unsigned long, __sanitizer::BufferedStackTrace*, __asan::AllocType, bool) (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x2604d75) + #5 0x55fa3f596f7a in 
__asan::asan_calloc(unsigned long, unsigned long, __sanitizer::BufferedStackTrace*) (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x2604f7a) + #6 0x55fa3f60d173 in calloc (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-fuzz-i386+0x267b173) + #7 0x7fb300737548 in g_malloc0 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x54548) + #8 0x55fa40157689 in async_run_on_cpu /home/alxndr/Development/qemu/cpus-common.c:163:10 + #9 0x55fa409fab83 in hpet_timer /home/alxndr/Development/qemu/hw/timer/hpet.c:376:9 + #10 0x55fa416a5751 in timerlist_run_timers /home/alxndr/Development/qemu/util/qemu-timer.c:572:9 + #11 0x55fa3fcfdac4 in qtest_clock_warp /home/alxndr/Development/qemu/softmmu/cpus.c:507:9 + #12 0x55fa3fd65c35 in qtest_process_command /home/alxndr/Development/qemu/softmmu/qtest.c:665:9 + #13 0x55fa3fd5e128 in qtest_process_inbuf /home/alxndr/Development/qemu/softmmu/qtest.c:710:9 + #14 0x55fa3fd5de67 in qtest_server_inproc_recv /home/alxndr/Development/qemu/softmmu/qtest.c:817:9 + #15 0x55fa4142b64b in qtest_sendf /home/alxndr/Development/qemu/tests/qtest/libqtest.c:424:5 + #16 0x55fa4142c482 in qtest_clock_step_next /home/alxndr/Development/qemu/tests/qtest/libqtest.c:864:5 + #17 0x55fa414b12d1 in general_fuzz /home/alxndr/Development/qemu/tests/qtest/fuzz/general_fuzz.c:581:17 + +It doesn't look like we ever exit out of the loop in timerlist_run_timers, ie timer_list->active_timers is always True. 
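The suspected behaviour - a callback that keeps the timer list non-empty, so the run loop never drains - can be modelled abstractly. This is a pure sketch, deliberately unrelated to QEMU's real data structures:

```python
# Abstract model of a timer list where the callback re-arms its own timer at
# a deadline that never moves past the clock, so each pass over the list
# fires (and would allocate for) the same timer again, forever.

def run_timers(timers, now, max_iterations=1000):
    """Fire all timers with deadline <= now; return how many firings happen.
    max_iterations is the guard that the real loop lacks."""
    fired = 0
    while timers and timers[0] <= now:
        deadline = timers.pop(0)
        fired += 1
        # Re-arming at the same deadline models active_timers staying truthy.
        timers.insert(0, deadline)
        if fired >= max_iterations:
            break
    return fired

print(run_timers([5], now=10))
```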
+ + +Info From GDB: +#0 0x0000555558070d31 in address_space_stl_internal (as=0x55555f0e8f20 <address_space_memory>, addr=0x0, val=0x0, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at /home/alxndr/Development/qemu/memory_ldst.inc.c:323 +#1 0x0000555558071339 in address_space_stl_le (as=0x55555f0e8f20 <address_space_memory>, addr=0x0, val=0x0, attrs=..., result=0x0) at /home/alxndr/Development/qemu/memory_ldst.inc.c:357 +#2 0x000055555a6a6f95 in update_irq (timer=0x61f0000005b8, set=0x1) at /home/alxndr/Development/qemu/hw/timer/hpet.c:210 +#3 0x000055555a6ae55f in hpet_timer (opaque=0x61f0000005b8) at /home/alxndr/Development/qemu/hw/timer/hpet.c:386 +#4 0x000055555c03d178 in timerlist_run_timers (timer_list=0x60b0000528f0) at /home/alxndr/Development/qemu/util/qemu-timer.c:572 +#5 0x000055555c03d6b5 in qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at /home/alxndr/Development/qemu/util/qemu-timer.c:586 +#6 0x0000555558c3d0c4 in qtest_clock_warp (dest=0x3461864) at /home/alxndr/Development/qemu/softmmu/cpus.c:507 + + +-Alex \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1888923 b/results/classifier/gemma3:12b/performance/1888923 new file mode 100644 index 00000000..dbaaee4e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1888923 @@ -0,0 +1,58 @@ + +Configured Memory access latency and bandwidth not taking effect + +I was trying to configure latencies and bandwidths between nodes in a NUMA emulation using QEMU 5.0.0. + +Host : Ubuntu 20.04 64 bit +Guest : Ubuntu 18.04 64 bit + +The machine configured has 2 nodes. Each node has 2 CPUs and has been allocated 3GB of memory. The memory access latencies and bandwidths for a local access (i.e from initiator 0 to target 0, and from initiator 1 to target 1) are set as 40ns and 10GB/s respectively. The memory access latencies and bandwidths for a remote access (i.e from initiator 1 to target 0, and from initiator 0 to target 1) are set as 80ns and 5GB/s respectively. 
+ +The command line launch is as follows. + +sudo x86_64-softmmu/qemu-system-x86_64 \ +-machine hmat=on \ +-boot c \ +-enable-kvm \ +-m 6G,slots=2,maxmem=7G \ +-object memory-backend-ram,size=3G,id=m0 \ +-object memory-backend-ram,size=3G,id=m1 \ +-numa node,nodeid=0,memdev=m0 \ +-numa node,nodeid=1,memdev=m1 \ +-smp 4,sockets=4,maxcpus=4 \ +-numa cpu,node-id=0,socket-id=0 \ +-numa cpu,node-id=0,socket-id=1 \ +-numa cpu,node-id=1,socket-id=2 \ +-numa cpu,node-id=1,socket-id=3 \ +-numa dist,src=0,dst=1,val=20 \ +-net nic \ +-net user \ +-hda testing.img \ +-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=40 \ +-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G \ +-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=80 \ +-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \ +-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=80 \ +-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \ +-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=40 \ +-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G \ + +Then the latencies and bandwidths between the nodes were tested using the Intel Memory Latency Checker v3.9 (https://software.intel.com/content/www/us/en/develop/articles/intelr-memory-latency-checker.html). But the obtained results did not match the configuration. The following are the results obtained. + +Latency_matrix with idle latencies (in ns) + +Numa +Node 0 1 +0 36.2 36.4 +1 34.9 35.4 + +Bandwidth_matrix with memory bandwidths (in MB/s) + +Numa +Node 0 1 +0 15167.1 15308.9 +1 15226.0 15234.0 + +A test was also conducted with the tool “lat_mem_rd” from lmbench to measure the memory read latencies. This also gave results which did not match the config. 
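For comparison, the mismatch can be quantified directly from the numbers above: the configuration implies a 2x remote/local latency ratio, while the measured matrix is essentially flat. A small sketch using the values quoted above:

```python
# Expected vs. measured remote/local latency ratios for the HMAT config above.

configured_latency = {"local": 40, "remote": 80}    # ns, from the -numa hmat-lb options
measured_latency = {"local": 36.2, "remote": 36.4}  # ns, node 0 row of the Intel MLC matrix

expected_ratio = configured_latency["remote"] / configured_latency["local"]
measured_ratio = measured_latency["remote"] / measured_latency["local"]

print(f"expected remote/local ratio: {expected_ratio:.2f}")
print(f"measured remote/local ratio: {measured_ratio:.2f}")
```

(If these hmat-lb options only populate the guest's ACPI HMAT tables rather than emulating slower memory, a flat measured matrix would be expected; but that is an assumption, not a confirmed explanation.)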
+ +Any information on why the configured latency and bandwidth values are not applied would be appreciated. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1889 b/results/classifier/gemma3:12b/performance/1889 new file mode 100644 index 00000000..2095036d --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1889 @@ -0,0 +1,48 @@ + +IO delays on live migration lv initialization +Description of problem: +Hi, + +When I live migrate a VM via Proxmox and the destination is an LVM thin pool, I see that at the start of copying the disk it's first initialized. + +This leads to the thin volume being directly 100% allocated, which needs to be discarded afterwards. Not ideal, but .... + +The more annoying thing is that this initialization step uses 100% of disk IO. In iotop I see it writing over 1000MB/sec. The nasty side effect is that other VMs on that host are negatively affected. It's not completely locked up - I can ssh in and look around - but storage-intensive things see more delay, with e.g. http requests timing out. Even a simple ls command can take 10+ seconds when it is normally instant. + + +I've previously reported it on the [proxmox forum](https://forum.proxmox.com/threads/io-delays-on-live-migration-lv-initialization.132296/#post-582050) but the call was made that this is behavior from Qemu. + +> The zeroing happens, because of what QEMU does when the discard option is enabled: + + +When I disable discard for the VM disk I can see that it's not pre-initialized during migration, but not having that defeats the purpose of having an lvm thin pool. + +For the (disk) migration itself I can set a bandwidth limit ... could we do something similar for initialization? + + +Even better would be to not initialize at all when using LVM thin. As far as I understand it, the new blocks allocated by lvm thin should always be empty. +Steps to reproduce: +1. Migrate a vm with a large disk +2. 
look in iotop on the new host; you will see more write IO than the network could handle, just before the disk content is transferred. +3. look in another VM on the destination host; reading from disk will be significantly slower than normal. +Additional information: +An example VM config +``` +agent: 1,fstrim_cloned_disks=1 +balloon: 512 +bootdisk: scsi0 +cores: 6 +ide2: none,media=cdrom +memory: 8196 +name: ... +net0: virtio=...,bridge=... +numa: 0 +onboot: 1 +ostype: l26 +scsi0: thin_pool_hwraid:vm-301-disk-0,discard=on,format=raw,size=16192M +scsi1: thin_pool_hwraid:vm-301-disk-1,discard=on,format=raw,size=26G +scsihw: virtio-scsi-pci +serial0: socket +smbios1: uuid=... +sockets: 1 +``` diff --git a/results/classifier/gemma3:12b/performance/1890069 b/results/classifier/gemma3:12b/performance/1890069 new file mode 100644 index 00000000..bd22b5e0 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1890069 @@ -0,0 +1,24 @@ + +QEMU is not allowing multiple cores with mips architecture + +I may have found a bug: when trying to boot up a QEMU VM with the following command: "qemu-system-mips -M malta -m 512 -hda hda.img -kernel vmlinux-4.19.0-10-4kc-malta -initrd initrd.img-4.19.0-10-4kc-malta -append "root=/dev/sda1 console=ttyS0" -nographic -device e1000,netdev=user.0 -netdev user,id=user.0,hostfwd=tcp::5555-:22 -smp cores=12,threads=1,sockets=1", it uses up all of the CPU cores on the host but does not boot up. 
+ +Kernel log also shows: +[ 100.303136] perf: interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 79750 +[ 107.656869] perf: interrupt took too long (3195 > 3132), lowering kernel.perf_event_max_sample_rate to 62500 +[ 117.668390] perf: interrupt took too long (4033 > 3993), lowering kernel.perf_event_max_sample_rate to 49500 +[ 217.166763] perf: interrupt took too long (5126 > 5041), lowering kernel.perf_event_max_sample_rate to 39000 +[ 231.910132] perf: interrupt took too long (6445 > 6407), lowering kernel.perf_event_max_sample_rate to 31000 +[ 250.170677] perf: interrupt took too long (8087 > 8056), lowering kernel.perf_event_max_sample_rate to 24500 +[ 285.391451] perf: interrupt took too long (10126 > 10108), lowering kernel.perf_event_max_sample_rate to 19750 +[ 778.588911] perf: interrupt took too long (12770 > 12657), lowering kernel.perf_event_max_sample_rate to 15500 +[ 1554.825129] perf: interrupt took too long (15982 > 15962), lowering kernel.perf_event_max_sample_rate to 12500 +[ 1622.838910] hrtimer: interrupt took 14758063 ns +[ 1712.932777] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs +[ 1712.932780] perf: interrupt took too long (59793 > 19977), lowering kernel.perf_event_max_sample_rate to 3250 + + +System details: + +OS: Ubuntu 20.04 +QEMU emulator version 4.2.0 (Debian 1:4.2-3ubuntu6.3) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1892081 b/results/classifier/gemma3:12b/performance/1892081 new file mode 100644 index 00000000..d7d16639 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1892081 @@ -0,0 +1,15 @@ + +Performance improvement when using "QEMU_FLATTEN" with softfloat type conversions + +Attached below is a matrix multiplication program for double data +types. The program performs the casting operation "(double)rand()" +when generating random numbers. 
+ +This operation calls the integer to float softfloat conversion +function "int32_to_float_64". + +Adding the "QEMU_FLATTEN" attribute to the function definition +decreases the instructions per call of the function by about 63%. + +Attached are before and after performance screenshots from +KCachegrind. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1895703 b/results/classifier/gemma3:12b/performance/1895703 new file mode 100644 index 00000000..0bfcbacc --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1895703 @@ -0,0 +1,19 @@ + +performance degradation in tcg since Meson switch + +The buildsys conversion to Meson (1d806cef0e3..7fd51e68c34) +introduced a degradation in performance in some TCG targets: + +-------------------------------------------------------- +Test Program: matmult_double +-------------------------------------------------------- +Target Instructions Previous Latest + 1d806cef 7fd51e68 +---------- -------------------- ---------- ---------- +alpha 3 233 957 639 ----- +7.472% +m68k 3 919 110 506 ----- +18.433% +-------------------------------------------------------- + +Original report from Ahmed Karaman with further testing done +by Aleksandar Markovic: +https://<email address hidden>/msg740279.html \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1896298 b/results/classifier/gemma3:12b/performance/1896298 new file mode 100644 index 00000000..2781f258 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1896298 @@ -0,0 +1,20 @@ + +TCG memory leak with FreeDOS 'edit' + +qemu trunk as of today leaks memory FAST when freedos' edit is running. 
+ +To reproduce, download: + +https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/repositories/1.3/cdrom.iso + +Then run: + +$ qemu-system-i386 -cdrom cdrom.iso + +Select your language, then select "return to DOS", then type + +> edit + +It will consume memory at ~10MB/s. + +This does NOT happen when adding -enable-kvm \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1896754 b/results/classifier/gemma3:12b/performance/1896754 new file mode 100644 index 00000000..5be9f356 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1896754 @@ -0,0 +1,6 @@ + +Performance degradation for WinXP boot time after b55f54bc + +QEMU 5.1 loads Windows XP in TCG mode 5-6 times slower (~2 minutes) than 4.2 (25 seconds). I bisected it, and it appears that commit b55f54bc965607c45b5010a107a792ba333ba654 causes this issue. Probably similar to an older, fixed bug: https://bugs.launchpad.net/qemu/+bug/1672383 + +Command line is trivial: qemu-system-x86_64 -nodefaults -vga std -m 4096M -hda WinXP.qcow2 -monitor stdio -snapshot \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1898 b/results/classifier/gemma3:12b/performance/1898 new file mode 100644 index 00000000..d69fc702 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1898 @@ -0,0 +1,33 @@ + +Ninja makeserver support +Description of problem: +Building `qemu` with a version of `ninja` patched to use `make`'s jobserver feature[0] doesn't work. Usually, when a jobserver is used to control the number of jobs built in parallel across multiple builds (i.e. when building with `open-embedded` or `buildroot`), the `-j$(nproc)` argument is left out. In this case, the QEMU `Makefile` interprets the absent `-j` argument as a wish for a single process only, and adds a `-j1` argument to the `ninja` call. +Steps to reproduce: +1. Build/install the patched `ninja` from [0]: `export PATH=<path/to/ninja>:$PATH` +2.
Start the attached [jobserver.py](/uploads/8215e8a470c97cd456d2d14e2c71c6a5/jobserver.py) script: `python jobserver.py /tmp/jobserver 4` +3. Configure `qemu`: `mkdir build; ../configure` +4. Build `qemu`: `MAKEFLAGS="--jobserver-auth=fifo:/tmp/jobserver" make` +5. Observe that only a single CPU/core is being used. + +Now, to avoid passing `-j1` to `ninja`, remove filtering of `-j` arguments from the `Makefile`: + +```patch +diff --git a/Makefile b/Makefile +index bfc4b2c8e9..d66141787e 100644 +--- a/Makefile ++++ b/Makefile +@@ -142,7 +142,6 @@ MAKE.k = $(findstring k,$(firstword $(filter-out --%,$(MAKEFLAGS)))) + MAKE.q = $(findstring q,$(firstword $(filter-out --%,$(MAKEFLAGS)))) + MAKE.nq = $(if $(word 2, $(MAKE.n) $(MAKE.q)),nq) + NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \ +- $(filter-out -j, $(lastword -j1 $(filter -l% -j%, $(MAKEFLAGS)))) \ + -d keepdepfile + ninja-cmd-goals = $(or $(MAKECMDGOALS), all) + ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g)) +``` + +Run the build again, and see four jobs being run in parallel: + +`make clean; MAKEFLAGS="--jobserver-auth=fifo:/tmp/jobserver" make` +Additional information: +[0] https://github.com/stefanb2/ninja/tree/topic-issue-1139-part-3-jobserver-fifo diff --git a/results/classifier/gemma3:12b/performance/1904259 b/results/classifier/gemma3:12b/performance/1904259 new file mode 100644 index 00000000..016a538e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1904259 @@ -0,0 +1,30 @@ + +include/qemu/atomic.h:495:5: error: misaligned atomic operation may incur significant performance penalty (Clang 11; Ubuntu 16 i686) + +Hello. +I haven't found any "official" executables, for emulating RISC-V (32bit; 64bit) so I had to compile those myself. 
+ +I found that auto-generated build scripts, for Ninja, contained some warnings interpreted as errors: + + +oceanfish81@gollvm:~/Desktop/qemu/build$ ninja -j 1 +[2/1977] Compiling C object libqemuutil.a.p/util_qsp.c.o +FAILED: libqemuutil.a.p/util_qsp.c.o +clang-11 -Ilibqemuutil.a.p -I. -I.. -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/glib-2.0 -I/usr/lib/i386-linux-gnu/glib-2.0/include -I/usr/include/gio-unix-2.0/ -Xclang -fcolor-diagnostics -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -O2 -g -m32 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -Wno-psabi -fstack-protector-strong -isystem /home/oceanfish81/Desktop/qemu/linux-headers -isystem linux-headers -iquote /home/oceanfish81/Desktop/qemu/tcg/i386 -iquote . 
-iquote /home/oceanfish81/Desktop/qemu -iquote /home/oceanfish81/Desktop/qemu/accel/tcg -iquote /home/oceanfish81/Desktop/qemu/include -iquote /home/oceanfish81/Desktop/qemu/disas/libvixl -pthread -Wno-unused-function -fPIC -MD -MQ libqemuutil.a.p/util_qsp.c.o -MF libqemuutil.a.p/util_qsp.c.o.d -o libqemuutil.a.p/util_qsp.c.o -c ../util/qsp.c +In file included from ../util/qsp.c:62: +In file included from /home/oceanfish81/Desktop/qemu/include/qemu/thread.h:4: +In file included from /home/oceanfish81/Desktop/qemu/include/qemu/processor.h:10: +/home/oceanfish81/Desktop/qemu/include/qemu/atomic.h:495:5: error: misaligned atomic operation may incur significant performance penalty [-Werror,-Watomic-alignment] + qatomic_set__nocheck(ptr, val); + ^ +/home/oceanfish81/Desktop/qemu/include/qemu/atomic.h:138:5: note: expanded from macro 'qatomic_set__nocheck' + __atomic_store_n(ptr, i, __ATOMIC_RELAXED) + ^ +/home/oceanfish81/Desktop/qemu/include/qemu/atomic.h:485:12: error: misaligned atomic operation may incur significant performance penalty [-Werror,-Watomic-alignment] + return qatomic_read__nocheck(ptr); + ^ +/home/oceanfish81/Desktop/qemu/include/qemu/atomic.h:129:5: note: expanded from macro 'qatomic_read__nocheck' + __atomic_load_n(ptr, __ATOMIC_RELAXED) + ^ +2 errors generated. +ninja: build stopped: subcommand failed. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1906184 b/results/classifier/gemma3:12b/performance/1906184 new file mode 100644 index 00000000..971a0466 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1906184 @@ -0,0 +1,29 @@ + +Lots of stuttering/crackling in guest sound + +When listening to music (e.g. with VLC) or watching Youtube on the guest, there's lots of stuttering and crackling in the sound. 
+ + +Tested with the following QEMU start commands: + +qemu-system-x86_64 -enable-kvm -m 6G -cpu host -smp 3 -cdrom ./linux/kubuntu-20.04-desktop-amd64.iso -boot d -vga virtio -soundhw hda -display sdl,gl=on + +qemu-system-x86_64 -enable-kvm -m 6G -cpu host -smp 3 -cdrom ./linux/kubuntu-20.04-desktop-amd64.iso -boot d -vga qxl -soundhw hda -display sdl + +qemu-system-x86_64 -enable-kvm -m 6G -cpu host -smp 3 -cdrom ./linux/kubuntu-20.04-desktop-amd64.iso -boot d -vga qxl -soundhw hda -display gtk + + +If I use the following command (QXL graphics, "remote" access via SPICE over unix socket), stuttering is not completely gone but MUCH less annoying: + +qemu-system-x86_64 -enable-kvm -m 6G -cpu host -smp 3 -cdrom ./linux/kubuntu-20.04-desktop-amd64.iso -boot d -soundhw hda -vga qxl -device virtio-serial-pci -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -chardev spicevmc,id=spicechannel0,name=vdagent -spice unix,addr=/tmp/vm_spice.socket,disable-ticketing + +and this command for accessing the VM: +remote-viewer spice+unix:///tmp/vm_spice.socket + + + +Guest: Kubuntu 20.04 64-bit (live), but occurs with many other as well +Host: Arch Linux, with KDE desktop +CPU: Intel Xeon E3-1230v2 (4 cores + hyperthreading) +RAM: 16 GB +GPU: Nvidia GTX 980 Ti \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1906536 b/results/classifier/gemma3:12b/performance/1906536 new file mode 100644 index 00000000..c5c0d91f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1906536 @@ -0,0 +1,31 @@ + +Unable to set SVE VL to 1024 bits or above since 7b6a2198 + +Prior to 7b6a2198e71794c851f39ac7a92d39692c786820, the QEMU option sve-max-vq could be used to set the vector length of the implementation. This is useful (among other reasons) for testing software compiled with a fixed SVE vector length. Since this commit, the vector length is capped at 512 bits. 
+ +To reproduce the issue: + +$ cat rdvl.s +.global _start +_start: + rdvl x0, #1 + asr x0, x0, #4 + mov x8, #93 // exit + svc #0 +$ aarch64-linux-gnu-as -march=armv8.2-a+sve rdvl.s -o rdvl.o +$ aarch64-linux-gnu-ld rdvl.o +$ for vl in 1 2 4 8 16; do ../build-qemu/aarch64-linux-user/qemu-aarch64 -cpu max,sve-max-vq=$vl a.out; echo $?; done +1 +2 +4 +4 +4 + +For a QEMU built prior to the above revision, we get the output: +1 +2 +4 +8 +16 + +as expected. It seems that either the old behavior should be restored, or there should be an option to force a higher vector length? \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1908 b/results/classifier/gemma3:12b/performance/1908 new file mode 100644 index 00000000..094346ea --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1908 @@ -0,0 +1,50 @@ + +8.1.0 regression: abnormal segfault in qemu-riscv64-static +Description of problem: +loading_from_clipboard_test of Cockatrice segfaults in qemu-riscv64-static. +Steps to reproduce: +1. Setup an Arch Linux riscv64 qemu-user container: https://github.com/felixonmars/archriscv-packages/wiki/Setup-Arch-Linux-RISC-V-Development-Environment +2. Start the container: `sudo systemd-nspawn -D ./archriscv -a -U` +3. Build cockatrice 2.9.0 with tests in the container: https://github.com/Cockatrice/Cockatrice/releases/tag/2023-09-14-Release-2.9.0 +4. Run tests/loading_from_clipboard/loading_from_clipboard_test in the container +Additional information: +I have done bisection and find out that this commit caused the regression: 2d708164e0475064e0e2167bd73e8570e22df1e0 + +qemu built from HEAD(494a6a2) is still affected by this bug. 
+ +Backtrace: + +``` +#0 0x00007fffe849f133 in code_gen_buffer () +#1 0x00007ffff7b3a433 in cpu_tb_exec (cpu=0x7ffff7f71010, itb=0x7fffe849f040 <code_gen_buffer+4845587>, +tb_exit=0x7fffffffde20) at ../qemu/accel/tcg/cpu-exec.c:457 +#2 0x00007ffff7b3aeac in cpu_loop_exec_tb (cpu=0x7ffff7f71010, tb=0x7fffe849f040 <code_gen_buffer+4845587>, +pc=46912625654024, last_tb=0x7fffffffde30, tb_exit=0x7fffffffde20) at ../qemu/accel/tcg/cpu-exec.c:919 +#3 0x00007ffff7b3b0e0 in cpu_exec_loop (cpu=0x7ffff7f71010, sc=0x7fffffffdeb0) at ../qemu/accel/tcg/cpu-exec.c:1040 +#4 0x00007ffff7b3b19e in cpu_exec_setjmp (cpu=0x7ffff7f71010, sc=0x7fffffffdeb0) +at ../qemu/accel/tcg/cpu-exec.c:1057 +#5 0x00007ffff7b3b225 in cpu_exec (cpu=0x7ffff7f71010) at ../qemu/accel/tcg/cpu-exec.c:1083 +#6 0x00007ffff7a53707 in cpu_loop (env=0x7ffff7f71330) at ../qemu/linux-user/riscv/cpu_loop.c:37 +#7 0x00007ffff7b5d0e0 in main (argc=4, argv=0x7fffffffe768, envp=0x7fffffffe790) at ../qemu/linux-user/main.c:999 +``` + +``` +0x7fffe849f105 <code_gen_buffer+4845784> jl 0x7fffe849f265 <code_gen_buffer+4846136> │ +│ 0x7fffe849f10b <code_gen_buffer+4845790> mov 0x50(%rbp),%rbx │ +│ 0x7fffe849f10f <code_gen_buffer+4845794> mov %rbx,%r12 │ +│ 0x7fffe849f112 <code_gen_buffer+4845797> mov %r12,0x70(%rbp) │ +│ 0x7fffe849f116 <code_gen_buffer+4845801> movabs $0x2aaaaf9bb000,%r13 │ +│ 0x7fffe849f120 <code_gen_buffer+4845811> mov %r13,0x38(%rbp) │ +│ 0x7fffe849f124 <code_gen_buffer+4845815> movq $0xffffffffaf9bb000,0x60(%rbp) │ +│ 0x7fffe849f12c <code_gen_buffer+4845823> mov $0xffffffffaf9bb4e0,%r13 │ +│ > 0x7fffe849f133 <code_gen_buffer+4845830> movzwl 0x0(%r13),%r13d │ +│ 0x7fffe849f138 <code_gen_buffer+4845835> and $0x7f,%ebx │ +│ 0x7fffe849f13b <code_gen_buffer+4845838> shl $0x7,%r13 │ +│ 0x7fffe849f13f <code_gen_buffer+4845842> add %r13,%rbx │ +│ 0x7fffe849f142 <code_gen_buffer+4845845> mov %rbx,0x50(%rbp) │ +│ 0x7fffe849f146 <code_gen_buffer+4845849> shl %rbx │ +│ 0x7fffe849f149 <code_gen_buffer+4845852> mov 
%rbx,0x38(%rbp) │ +│ 0x7fffe849f14d <code_gen_buffer+4845856> movabs $0x2aaaaf9a88e0,%r13 │ +│ 0x7fffe849f157 <code_gen_buffer+4845866> add %r13,%rbx │ +│ 0x7fffe849f15a <code_gen_buffer+4845869> mov %rbx,0x60(%rbp) +``` diff --git a/results/classifier/gemma3:12b/performance/1913505 b/results/classifier/gemma3:12b/performance/1913505 new file mode 100644 index 00000000..7e362a32 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1913505 @@ -0,0 +1,13 @@ + +Windows XP slow on Apple M1 + +Qemu installed by using brew install qemu -s on M1 + +QEMU emulator version 5.2.0 +XP image from: https://archive.org/details/WinXPProSP3x86 + +Commands run: +$ qemu-img create -f qcow2 xpsp3.img 10G +$ qemu-system-i386 -m 512 -hda xpsp3.img -cdrom WinXPProSP3x86/en_windows_xp_professional_with_service_pack_3_x86_cd_vl_x14-73974.iso -boot d + +It's taken 3 days now with qemu running at around 94% CPU and installation hasn't finished. The mouse pointer moves and occasionally changes between the pointer and hourglass so it doesn't seem to have frozen. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1914282 b/results/classifier/gemma3:12b/performance/1914282 new file mode 100644 index 00000000..25454df1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1914282 @@ -0,0 +1,73 @@ + +block copy job sometimes hangs on the last block for minutes + +In openstack nova we use the block copy API to copy disks for volume swap requests. In our CI gate we observed that sometimes the block copy job progress will reach the last or next to last block and hang there for several minutes at a time, causing CI jobs to fail as jobs timeout. 
+ +On the client (nova-compute) side, using the python bindings we see the following in the openstack nova logs: + +--------------- + +Jan 21 05:31:02.207785 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 0 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +Jan 21 05:31:55.688227 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 1049624576 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +[...] + +Jan 21 05:31:55.706698 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 1049624576 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +Jan 21 05:31:56.175248 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 1073741823 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +[...] 
+ +~2.5 minutes later, it's still going at current cursor: 1073741823 final cursor: 1073741824 + +Jan 21 05:34:30.952371 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 1073741823 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +then current cursor == final cursor at 05:34:31.460595 + +Jan 21 05:34:31.460595 ubuntu-focal-vexxhost-ca-ymq-1-0022641000 nova-compute[93823]: DEBUG nova.virt.libvirt.guest [None req-d6170fbb-e023-4cdb-93dc-a2e9ae9b0a56 tempest-TestVolumeSwap-1117975117 tempest-TestVolumeSwap-1117975117] COPY block job progress, current cursor: 1073741824 final cursor: 1073741824 {{(pid=93823) is_job_complete /opt/stack/nova/nova/virt/libvirt/guest.py:873}} + +--------------- + +In this excerpt the cursor reaches the next to last block at Jan 21 05:31:56.175248 and hangs there repeating status at the next to last block until Jan 21 05:34:30.952371 (~2.5 minutes) and then the job shows current cursor == final cursor at Jan 21 05:34:31.460595. 
+ +In the corresponding qemu log, we see the block copy job report being on the last block for minutes: + +--------------- + +021-01-21 05:31:02.206+0000: 60630: debug : qemuMonitorJSONIOProcessLine:220 : Line [{"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": true, "len": 1073741824, "offset": 0, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}], "id": "libvirt-501"}] +2021-01-21 05:31:02.206+0000: 60630: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7fd07813ec80 reply={"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": true, "len": 1073741824, "offset": 0, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}], "id": "libvirt-501"} + +[...] + +len == offset at 05:31:56.174 + +2021-01-21 05:31:56.174+0000: 60630: debug : qemuMonitorJSONIOProcessLine:220 : Line [{"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": true, "len": 1073741824, "offset": 1073741824, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}], "id": "libvirt-581"}] +2021-01-21 05:31:56.174+0000: 60630: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7fd07813ec80 reply={"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": true, "len": 1073741824, "offset": 1073741824, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}], "id": "libvirt-581"} + +[...] 
+ +~2.5 minutes later, still len == offset but it's still going + +2021-01-21 05:34:31.459+0000: 60630: debug : qemuMonitorJSONIOProcessLine:220 : Line [{"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": false, "len": 1073741824, "offset": 1073741824, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}], "id": "libvirt-855"}] +2021-01-21 05:34:31.459+0000: 60630: info : qemuMonitorJSONIOProcessLine:239 : QEMU_MONITOR_RECV_REPLY: mon=0x7fd07813ec80 reply={"return": [{"auto-finalize": true, "io-status": "ok", "device": "copy-vdb-libvirt-5-format", "auto-dismiss": false, "busy": false, "len": 1073741824, "offset": 1073741824, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}], "id": "libvirt-855"} + +and then the job finishes soon after + +2021-01-21 05:34:31.467+0000: 60630: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-5-format'(domain: 0x7fd070075720,instance-00000013) state changed to 'waiting'(6) + +2021-01-21 05:34:31.467+0000: 60630: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-5-format'(domain: 0x7fd070075720,instance-00000013) state changed to 'pending'(7) + +2021-01-21 05:34:31.467+0000: 60630: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-5-format'(domain: 0x7fd070075720,instance-00000013) state changed to 'concluded'(9) + +2021-01-21 05:34:31.468+0000: 60630: debug : qemuProcessHandleJobStatusChange:1002 : job 'copy-vdb-libvirt-5-format'(domain: 0x7fd070075720,instance-00000013) state changed to 'null'(11) + +2021-01-21 05:34:31.468+0000: 60634: debug : qemuBlockJobProcessEventConcludedCopyPivot:1221 : copy job 'copy-vdb-libvirt-5-format' on VM 'instance-00000013' pivoted + +2021-01-21 05:34:32.292+0000: 60634: debug : qemuDomainObjEndJob:9746 : Stopping job: modify (async=none vm=0x7fd070075720 name=instance-00000013) + +--------------- + +Is this 
normal for a block copy job to hang on the last block like this for minutes at a time? Why doesn't the job close out once offset == len? + +Thanks for any help you can offer. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1914667 b/results/classifier/gemma3:12b/performance/1914667 new file mode 100644 index 00000000..461602bb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1914667 @@ -0,0 +1,21 @@ + +High cpu usage when guest is idle on qemu-system-i386 + +When running Windows XP in qemu-system-i386, the CPU usage of QEMU is about 100% even when the guest CPU usage is close to 2%. The host CPU usage should be low when the guest CPU usage is low. + +Command: qemu-system-i386 -hda <Windows XP HD image> + +Using this command also shows around 100% host CPU usage: +qemu-system-i386 -m 700 -hda <Windows XP HD image> -usb -device usb-audio -net nic,model=rtl8139 -net user -hdb mountable.img -soundhw pcspk + +Using the Penryn CPU option also shows this problem: +qemu-system-i386 -hda <Windows XP HD image> -m 700 -cpu Penryn-v1 + +Using "-cpu pentium2" shows the same high host CPU usage. + + +My Info: +M1 MacBook Air +Mac OS 11.1 +qemu-system-i386 version 5.2 (1ba089f2255bfdb071be3ce6ac6c3069e8012179) +Windows XP SP3 Build 2600 \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1915531 b/results/classifier/gemma3:12b/performance/1915531 new file mode 100644 index 00000000..fa29c2bc --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1915531 @@ -0,0 +1,55 @@ + +qemu-user child process hangs when forking due to glib allocation + +I and others have recently been using qemu-user for RISCV64 extensively. We have had many hangs. We have found that the hangs happen in processes that have multiple threads and fork, for example +`cargo` (the build tool for the Rust compiler). + +It does not matter if there are a lot of calls to fork. What seems to matter most is that there are many threads running.
So this happens more often on a CPU with a massive number of cores, and if nothing else is really running. The hang happens in the child process of the fork. + +To reproduce the problem, I have attached an example of C++ program to run through qemu-user. + +Here are the stacks of the child processes that hanged. This is for qemu c973f06521b07af0f82893b75a1d55562fffb4b5 with glib 2.66.4 + +------- +Thread 1: +#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38 +#1 0x00007f54e190c77c in g_mutex_lock_slowpath (mutex=mutex@entry=0x7f54e1dc7600 <allocator+96>) at ../glib/gthread-posix.c:1462 +#2 0x00007f54e190d222 in g_mutex_lock (mutex=mutex@entry=0x7f54e1dc7600 <allocator+96>) at ../glib/gthread-posix.c:1486 +#3 0x00007f54e18e39f2 in magazine_cache_pop_magazine (countp=0x7f54280e6638, ix=2) at ../glib/gslice.c:769 +#4 thread_memory_magazine1_reload (ix=2, tmem=0x7f54280e6600) at ../glib/gslice.c:845 +#5 g_slice_alloc (mem_size=mem_size@entry=40) at ../glib/gslice.c:1058 +#6 0x00007f54e18f06fa in g_tree_node_new (value=0x7f54d4066540 <code_gen_buffer+419091>, key=0x7f54d4066560 <code_gen_buffer+419123>) at ../glib/gtree.c:517 +#7 g_tree_insert_internal (tree=0x555556aed800, key=0x7f54d4066560 <code_gen_buffer+419123>, value=0x7f54d4066540 <code_gen_buffer+419091>, replace=0) at ../glib/gtree.c:517 +#8 0x00007f54e186b755 in tcg_tb_insert (tb=0x7f54d4066540 <code_gen_buffer+419091>) at ../tcg/tcg.c:534 +#9 0x00007f54e1820545 in tb_gen_code (cpu=0x7f54980b4b60, pc=274906407438, cs_base=0, flags=24832, cflags=-16252928) at ../accel/tcg/translate-all.c:2118 +#10 0x00007f54e18034a5 in tb_find (cpu=0x7f54980b4b60, last_tb=0x7f54d4066440 <code_gen_buffer+418835>, tb_exit=0, cf_mask=524288) at ../accel/tcg/cpu-exec.c:462 +#11 0x00007f54e1803bd9 in cpu_exec (cpu=0x7f54980b4b60) at ../accel/tcg/cpu-exec.c:818 +#12 0x00007f54e1735a4c in cpu_loop (env=0x7f54980bce40) at ../linux-user/riscv/cpu_loop.c:37 +#13 0x00007f54e1844b22 in clone_func (arg=0x7f5402f3b080) at 
../linux-user/syscall.c:6422 +#14 0x00007f54e191950a in start_thread (arg=<optimized out>) at pthread_create.c:477 +#15 0x00007f54e19a52a3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 + +Thread 2: +#1 0x00007f54e18a8d6e in qemu_futex_wait (f=0x7f54e1dc7038 <rcu_call_ready_event>, val=4294967295) at /var/home/valentin/repos/qemu/include/qemu/futex.h:29 +#2 0x00007f54e18a8f32 in qemu_event_wait (ev=0x7f54e1dc7038 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:460 +#3 0x00007f54e18c0196 in call_rcu_thread (opaque=0x0) at ../util/rcu.c:258 +#4 0x00007f54e18a90eb in qemu_thread_start (args=0x7f5428244930) at ../util/qemu-thread-posix.c:521 +#5 0x00007f54e191950a in start_thread (arg=<optimized out>) at pthread_create.c:477 +#6 0x00007f54e19a52a3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 +------- + +Thread 1 is the one that is actually hung. + +The problem is that glib is used in many places. Allocations are done through g_slice, and g_slice has global state that is not fork safe. + +So even though the cpu thread is set to exclusive before forking, that is not enough, because there are other uses of glib data structures that are not part of the cpu loop (I think), so they do not seem to be synchronized by `start_exclusive`/`end_exclusive`. + +So if a glib data structure is used during the fork, an allocation might lock a mutex in g_slice. + +When the cpu loop resumes in the forked process, any use of a glib data structure might then hang on a mutex in g_slice that was locked at fork time. + +As a work-around, we have started setting the environment variable `G_SLICE=always-malloc`. This resolves the hangs. + +I have opened an issue upstream: https://gitlab.gnome.org/GNOME/glib/-/issues/2326 + +As the fork documentation says, the child should only use async-signal-safe functions. However, glibc's malloc is safe in the fork child even though it is not async-signal-safe, so it is not obvious where the responsibility lies.
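The failure mode can be reproduced in miniature without glib at all (a hedged sketch: Python's `threading.Lock` stands in for the g_slice mutex; POSIX-only because of `os.fork`):

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    # Worker thread owns the lock across the fork window, playing the
    # role of another thread that is inside a g_slice allocation.
    with lock:
        time.sleep(0.5)

t = threading.Thread(target=hold_lock)
t.start()
time.sleep(0.1)  # make sure the worker has acquired the lock

pid = os.fork()
if pid == 0:
    # Child: the lock was copied in the *locked* state, and the thread
    # that would release it does not exist in this process. A blocking
    # acquire() would hang forever -- like the qemu-user child hang.
    inherited_locked = not lock.acquire(blocking=False)
    os._exit(0 if inherited_locked else 1)

_, status = os.waitpid(pid, 0)
t.join()
print("child inherited a locked mutex:", os.WEXITSTATUS(status) == 0)
```

With `G_SLICE=always-malloc`, slice allocations go through malloc instead, which glibc hardens against exactly this fork scenario; that is consistent with the workaround making the hangs disappear.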
Should glib handle this case like malloc does? Or should qemu not use glib in the fork child? \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1920767 b/results/classifier/gemma3:12b/performance/1920767 new file mode 100644 index 00000000..de9d4f25 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1920767 @@ -0,0 +1,8 @@ + +build-tools-and-docs-debian job waste cycles building pointless things + +The build-tools-and-docs-debian job waste CI cycles building softfloat: +https://gitlab.com/qemu-project/qemu/-/jobs/1117005759 + +Possible fix suggested on the list: +https://<email address hidden>/msg793872.html \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1923648 b/results/classifier/gemma3:12b/performance/1923648 new file mode 100644 index 00000000..ba5362be --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1923648 @@ -0,0 +1,26 @@ + +macOS App Nap feature gradually freezes QEMU process + +macOS version: 10.15.2 +QEMU versions: 5.2.0 (from MacPorts) + 5.2.92 (v6.0.0-rc2-23-g9692c7b037) + +If the QEMU window is not visible (hidden, minimized or another application is in full screen mode), the QEMU process gradually freezes: it still runs, but the VM does not respond to external requests such as Telnet or SSH until the QEMU window is visible on the desktop. + +This behavior is due to the work of the macOS App Nap function: +https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/AppNap.html#//apple_ref/doc/uid/TP40013929-CH2-SW1 + +It doesn't matter how the process is started -- as a background job or as a foreground shell process in case QEMU has a desktop window. + +My VM does not have a display output, only a serial line, most likely if the VM was using OpenGL, or playing sound (or any other App Nap triggers), then the problem would never have been detected. 
+ +In my case, only one way of starting QEMU avoids this problem: +sudo qemu-system-x86_64 -nodefaults \ +-cpu host -accel hvf -smp 1 -m 384 \ +-device virtio-blk-pci,drive=flash0 \ +-drive file=/vios-adventerprisek9-m.vmdk.SPA.156-1.T.vmdk,if=none,format=vmdk,id=flash0 \ +-device e1000,netdev=local -netdev tap,id=local,ifname=tap0,script=no,downscript=no \ +-serial stdio -display none + +The usual recipe from the internet for disabling App Nap doesn't work: +defaults write NSGlobalDomain NSAppSleepDisabled -bool YES \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/1945 b/results/classifier/gemma3:12b/performance/1945 new file mode 100644 index 00000000..9c41fe0e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1945 @@ -0,0 +1,2 @@ + +More than 8 cores for RISC-V generic `virt` machine diff --git a/results/classifier/gemma3:12b/performance/1946 b/results/classifier/gemma3:12b/performance/1946 new file mode 100644 index 00000000..367e5e97 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1946 @@ -0,0 +1,29 @@ + +High CPU Load after QEMU 8.1.1 +Description of problem: +Since the update there is a massive CPU load, and this affects the CPU load of the router. +The VMs are sporadically inaccessible for about 3 minutes at a time. +The VMs themselves were not adjusted. + +Using the VMM, I was able to see the message recorded below in the console. + +`watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [swapper/0:0]` + +I will also add some data, like an XML file of a VM. +Additional information: + +[webproxy.log](/uploads/1d428f4c59b2397b9343a62dd8c4bce2/webproxy.log) + +[webproxy.xml](/uploads/04221c88956c49d76b4896dd8f6fd1f0/webproxy.xml) +[Host_Kernel.log](/uploads/f145bf599bf2003b89c17daaabb07143/Host_Kernel.log) + +Unfortunately I can't revert to the old QEMU version in the router OS, but in the current state all my VMs are not really 100% usable anymore.
+ +I would be very grateful if you could take a look at my case. + +many thanks in advance. + + + + +Paul diff --git a/results/classifier/gemma3:12b/performance/1953 b/results/classifier/gemma3:12b/performance/1953 new file mode 100644 index 00000000..5e7098fc --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1953 @@ -0,0 +1,147 @@ + +Segmentation fault when compiling elixir app on qemu aarch64 on x86_64 host +Description of problem: +When I try to install an elixir escript using + +``` +mix escript.install github upmaru/pakman --force +``` + +I run into a segfault with the following output + +``` + + +Build and Deploy +failed Oct 22, 2023 in 1m 27s +2s +2s +22s +56s +remote: Compressing objects: 86% (144/167) +remote: Compressing objects: 87% (146/167) +remote: Compressing objects: 88% (147/167) +remote: Compressing objects: 89% (149/167) +remote: Compressing objects: 90% (151/167) +remote: Compressing objects: 91% (152/167) +remote: Compressing objects: 92% (154/167) +remote: Compressing objects: 93% (156/167) +remote: Compressing objects: 94% (157/167) +remote: Compressing objects: 95% (159/167) +remote: Compressing objects: 96% (161/167) +remote: Compressing objects: 97% (162/167) +remote: Compressing objects: 98% (164/167) +remote: Compressing objects: 99% (166/167) +remote: Compressing objects: 100% (167/167) +remote: Compressing objects: 100% (167/167), done. +remote: Total 2568 (delta 86), reused 188 (delta 58), pack-reused 2341 +origin/HEAD set to develop +Resolving Hex dependencies... 
+Resolution completed in 0.872s +New: + castore 1.0.4 + finch 0.16.0 + hpax 0.1.2 + jason 1.4.1 + mime 2.0.5 + mint 1.5.1 + nimble_options 1.0.2 + nimble_pool 1.0.0 + slugger 0.3.0 + telemetry 1.2.1 + tesla 1.7.0 + yamerl 0.10.0 + yaml_elixir 2.8.0 +* Getting tesla (Hex package) +* Getting jason (Hex package) +* Getting yaml_elixir (Hex package) +* Getting slugger (Hex package) +* Getting finch (Hex package) +* Getting mint (Hex package) +* Getting castore (Hex package) +* Getting hpax (Hex package) +* Getting mime (Hex package) +* Getting nimble_options (Hex package) +* Getting nimble_pool (Hex package) +* Getting telemetry (Hex package) +* Getting yamerl (Hex package) +Resolving Hex dependencies... +Resolution completed in 0.413s +Unchanged: + castore 1.0.4 + finch 0.16.0 + hpax 0.1.2 + jason 1.4.1 + mime 2.0.5 + mint 1.5.1 + nimble_options 1.0.2 + nimble_pool 1.0.0 + slugger 0.3.0 + telemetry 1.2.1 + tesla 1.7.0 + yamerl 0.10.0 + yaml_elixir 2.8.0 +All dependencies are up to date +==> mime +Compiling 1 file (.ex) +Generated mime app +==> nimble_options +Compiling 3 files (.ex) +qemu: uncaught target signal 11 (Segmentation fault) - core dumped +Segmentation fault (core dumped) +``` +Steps to reproduce: +1. Create a repo using the github action zacksiri/setup-alpine +2. Install elixir +3. run `mix escript.install github upmaru/pakman --force` +Additional information: +You can use the following github action config as an example / starting point. 
+ + +```yml +name: 'Deployment' + +on: + push: + branches: + - main + - master + - develop + +jobs: + build_and_deploy: + name: Build and Deploy + runs-on: ubuntu-latest + steps: + - name: 'Checkout' + uses: actions/checkout@v3 + with: + ref: ${{ github.event.workflow_run.head_branch }} + fetch-depth: 0 + + - name: 'Setup Alpine' + uses: zacksiri/setup-alpine@master + with: + branch: v3.18 + arch: aarch64 + qemu-repo: edge + packages: | + zip + tar + sudo + alpine-sdk + coreutils + cmake + elixir + + - name: 'Setup PAKman' + run: | + export MIX_ENV=prod + + mix local.rebar --force + mix local.hex --force + mix escript.install github upmaru/pakman --force + shell: alpine.sh {0} +``` + +I'm using alpine 3.18 which has otp25 with jit enabled so I suspect this is something to do with https://gitlab.com/qemu-project/qemu/-/issues/1034 diff --git a/results/classifier/gemma3:12b/performance/1966 b/results/classifier/gemma3:12b/performance/1966 new file mode 100644 index 00000000..7eabf086 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1966 @@ -0,0 +1,9 @@ + +windows xp - some VMs hang, some work (regression?) +Description of problem: +Some of my XP instances behave strangely: explorer.exe seems to be unresponsive for about half an hour after start, then works normally. +What is worse, there are instances which behave normally, i.e. after launch everything works as expected. +Steps to reproduce: +Unknown; that is what I would like to find out. +Additional information: +under qemu 8.0.4 all VMs work.
diff --git a/results/classifier/gemma3:12b/performance/1987 b/results/classifier/gemma3:12b/performance/1987 new file mode 100644 index 00000000..c64d4139 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/1987 @@ -0,0 +1,49 @@ + +snapshot: main thread hangs for a while after 'loadvm' +Description of problem: +When I was testing qemu snapshots, I found that after the `loadvm` command, the virtual machine would often get stuck for a while, and that it would **resume execution after I entered some characters into the terminal**. This is weird because my guest system doesn't need to accept input. + +After some debugging, I found that the guest kernel is executing a `wait` instruction in `__r4k_wait()`. + +I also found that the main thread usually does not sleep at `qemu_poll_ns()` during normal execution, but after `loadvm` the timeout is set to a large value (related to the interval between snapshot operations), which causes the main thread to get stuck in `qemu_poll_ns()`. + +After some analysis, I think this is because save/load_snapshot() does not handle timers related to QEMU_CLOCK_VIRTUAL well, such as `mips_timer_cb()`, resulting in an incorrect value when calculating the timeout. +Steps to reproduce: +1. Start emulation and connect the monitor +2. `savevm` and wait for a moment +3.
`loadvm` +Additional information: +Stack backtrace of the guest kernel: +``` +► 0 0x80104d40 __r4k_wait+32 + 1 0x80143cc4 cpu_startup_entry+284 + 2 0x80143cc4 cpu_startup_entry+284 + 3 0x80143cc4 cpu_startup_entry+284 + 4 0x80633fe0 kernel_init + 5 0x80825cb8 start_kernel+1072 +``` + +Stack backtrace of the main thread: +``` +0 0x7ffff74f0a96 ppoll+166 +1 0x555555ea4786 qemu_poll_ns+221 +2 0x555555e9fea7 os_host_main_loop_wait+93 +3 0x555555e9ffd6 main_loop_wait+187 +4 0x555555a644fd qemu_main_loop+46 +5 0x5555557d2b6a qemu_default_main+17 +6 0x5555557d2ba9 main+45 +7 0x7ffff7402083 __libc_start_main+243 +``` + +Stack backtrace of the vCPU thread: +``` +#0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556550848) at ../sysdeps/nptl/futex-internal.h:183 +#1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x5555564d0860 <qemu_global_mutex>, cond=0x555556550820) at pthread_cond_wait.c:508 +#2 __pthread_cond_wait (cond=0x555556550820, mutex=0x5555564d0860 <qemu_global_mutex>) at pthread_cond_wait.c:647 +#3 0x0000555555e85602 in qemu_cond_wait_impl (cond=0x555556550820, mutex=0x5555564d0860 <qemu_global_mutex>, file=0x5555560122ab "../system/cpus.c", line=431) at ../util/qemu-thread-posix.c:225 +#4 0x0000555555a5618f in qemu_wait_io_event (cpu=0x555556825200) at ../system/cpus.c:431 +#5 0x0000555555c8bcf1 in mttcg_cpu_thread_fn (arg=0x555556825200) at ../accel/tcg/tcg-accel-ops-mttcg.c:118 +#6 0x0000555555e85e50 in qemu_thread_start (args=0x555556550860) at ../util/qemu-thread-posix.c:541 +#7 0x00007ffff75d8609 in start_thread (arg=<optimized out>) at pthread_create.c:477 +#8 0x00007ffff74fd133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 +``` diff --git a/results/classifier/gemma3:12b/performance/2043 b/results/classifier/gemma3:12b/performance/2043 new file mode 100644 index 00000000..d255d06b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2043 @@ -0,0 +1,75 @@ + +QEMU hangs sometimes during TRIM 
command +Description of problem: +I encountered a virtual machine freeze when a map cache invalidation request was received while a TRIM command was executing. + +I did some research and I think I found the problem. + +1. `xen_invalidate_map_cache` calls `bdrv_drain_all` before invalidation +2. All BlockBackend devices enter quiesce mode (increment of `blk->quiesce_counter` in `blk_root_drained_begin`) +3. When processing another block of the TRIM command, the coroutine `blk_co_do_pdiscard` calls `blk_wait_while_drained` +4. In `blk_wait_while_drained` we take the drained branch, decrement the `in_flight` counter and yield the coroutine +5. After returning from `blk_aio_complete_bh`, the `in_flight` counter of the `BlockBackend` device remains at value 1, which prevents the `AIO_WAIT_WHILE_UNLOCKED(NULL, bdrv_drain_all_poll());` loop from exiting +6. So QEMU stays in the `bdrv_drain_all_begin` method + +Now, why does the `in_flight` counter not go to zero in point 5? + +Below is a call diagram for the TRIM command. For example, consider the processing of 2 blocks. + + + +As can be seen from the diagram, the `in_flight` counter of the BlockBackend is first incremented at the start of the command in `ide_issue_trim`, and next in `blk_aio_prwv` before the coroutine starts. But for the second and subsequent blocks we go through the BH method `blk_aio_complete_bh`, and before decrementing `in_flight` we call the `acb->common.cb` callback, which is in fact `ide_issue_trim_cb`, so we increment `in_flight` again to a value of 3. It is then decremented to a value of 2 before returning from `blk_aio_complete`. + +So, the value of `blk->in_flight` varies in the range [2..3] during block processing. + +Now consider the situation where a map cache invalidation request is received during block processing in a TRIM command. Below is a call diagram for this situation. + + + +In this example we get the invalidation request before the second block is processed.
Our BlockBackend device has entered quiesce mode, so we yield the coroutine in `blk_wait_while_drained`, decrementing the `in_flight` counter from 3 to 2. A second decrement is made in `blk_aio_complete` (2 to 1). + +Now we are in a situation where no block processing methods are scheduled, as they must be called later from `bdrv_drain_all_end`, while on the other hand `bdrv_drain_all_poll` always returns true, because one of the BlockBackend devices has a non-zero `in_flight` counter. + +As one possible solution I tried calling `blk_set_disable_request_queuing(s->blk, true);` in `ide_issue_trim` and the corresponding `blk_set_disable_request_queuing(blk, false);` in `ide_trim_bh_cb`. This seems to solve the problem, so the TRIM command always completes, as it ignores quiesce mode and does not yield the coroutine. But I think this is not optimal. + +I also tried removing the incrementing and decrementing of the `in_flight` counter in `ide_issue_trim` and `ide_trim_bh_cb`, so that the value of the counter varies in the range [1..2] during block processing. This also works, but I started to get warnings like `Locked DMA mapping while invalidating mapcache!`, as the TRIM command probably uses the map cache and is not completed before the actual map cache invalidation. +Steps to reproduce: +1. Run a virtual machine +2. Run programs, work with files, etc. +Additional information: +QEMU trace logs. Enabled trace events: handle_ioreq, ide_dma_cb, dma_blk_io, dma_blk_cb, dma_complete, qemu_coroutine_yield.
+ +Log of TRIM command without freeze excerpt: + +``` +… +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=0 df=0 ptr=0 port=0x1f4 data=0x0 count=1 size=1 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=0 df=0 ptr=0 port=0x1f5 data=0x0 count=1 size=1 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=0 df=0 ptr=0 port=0x1f7 data=0x6 count=1 size=1 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=0 df=0 ptr=0 port=0xc160 data=0x1 count=1 size=1 +ide_dma_cb IDEState 0x5559d513ff98; sector_num=0 n=1 cmd=DMA TRIM +dma_blk_io dbs=0x5559d5c6f350 bs=0x5559d513ff98 offset=0 to_dev=1 +dma_blk_cb dbs=0x5559d5c6f350 ret=0 +dma_blk_cb dbs=0x5559d5c6f350 ret=0 +dma_complete dbs=0x5559d5c6f350 ret=0 cb=0x5559d1585620 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=1 df=0 ptr=0 port=0xc162 data=0x0 count=1 size=1 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=1 df=0 ptr=0 port=0xc162 data=0x0 count=1 size=1 +handle_ioreq I/O=0x7ffc51d5e160 type=0 dir=0 df=0 ptr=0 port=0xc160 data=0x0 count=1 size=1 +… +``` + +Log of TRIM command with freeze: + +``` +… +handle_ioreq I/O=0x7ffc52722ae0 type=8 dir=0 df=0 ptr=0 port=0x0 data=0xffffffffffffffff count=0 size=4 +handle_ioreq I/O=0x7ffc52722ae0 type=8 dir=0 df=0 ptr=0 port=0x0 data=0xffffffffffffffff count=0 size=4 +handle_ioreq I/O=0x7ffc52722ae0 type=8 dir=0 df=0 ptr=0 port=0x0 data=0xffffffffffffffff count=0 size=4 +handle_ioreq I/O=0x7ffc52722ae0 type=0 dir=0 df=0 ptr=0 port=0xc160 data=0x1 count=1 size=1 +ide_dma_cb IDEState 0x55c76faccf98; sector_num=0 n=1 cmd=DMA TRIM +dma_blk_io dbs=0x55c770425b50 bs=0x55c76faccf98 offset=0 to_dev=1 +dma_blk_cb dbs=0x55c770425b50 ret=0 +handle_ioreq I/O=0x7ffc52722ae0 type=8 dir=0 df=0 ptr=0 port=0x0 data=0xffffffffffffffff count=0 size=4 +qemu_coroutine_yield from 0x55c76f4207f0 to 0x7f7fb099e0c0 +[end of log, no more events] +``` diff --git a/results/classifier/gemma3:12b/performance/2055003 b/results/classifier/gemma3:12b/performance/2055003 new file mode 100644 index 00000000..4be4a194 --- /dev/null +++ 
b/results/classifier/gemma3:12b/performance/2055003 @@ -0,0 +1,66 @@ + +Qemu cmdline core dumped with many (8193 or more) cpus + +---Problem Description--- + Qemu cmdline core dumped with many (8193 or more) cpus + +---Debugger--- +A debugger is not configured + +---Steps to Reproduce--- + QEMU core dumped when a large number of CPUs was given on the command line. + + +[root@ltcmihawk39 ~]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=9000 +** +ERROR:../tcg/region.c:782:tcg_region_init: assertion failed: (region_size >= 2 * page_size) +Bail out! ERROR:../tcg/region.c:782:tcg_region_init: assertion failed: (region_size >= 2 * page_size) +Aborted (core dumped) + +Expected Result: +Warning message like "Number of cpus requested exceeds the cpus supported" + +Actual Result: +core dumped + +Steps to Reproduce: +-------------------- + +1. Clone the upstream qemu from https://gitlab.com/qemu-project/qemu.git +2. Compile qemu with the steps below. + cd qemu/ + git submodule init + git submodule update --recursive + ./configure --target-list=ppc64-softmmu --prefix=/usr + make + make install +3. Set maxcpus=8193 or more + + +[root@ltcmihawk39 ~]# qemu-system-ppc64 --version +QEMU emulator version 8.0.94 (v8.1.0-rc4) +Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers + +NOTE: This behavior is observed only when qemu is built without disabling TCG. + +Contact Information = <email address hidden> + +Machine Type = x + +---uname output--- +x + +Action needed + +Our IBM developers want to include this patch in the latest Canonical distro. + +Need the distro to review and integrate the fix provided by IBM: + +https://github.com/qemu/qemu/commit/c4f91d7b7be76c47015521ab0109c6e998a369b0 + +Need to include this commit in the latest Canonical distro.
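The failing assertion can be illustrated with a back-of-the-envelope model. This is not QEMU's exact sizing code (that lives in `tcg/region.c`); the 1 GiB buffer and the 64 KiB page size (typical for ppc64 hosts) are illustrative assumptions. The point is that splitting a fixed translation buffer into one region per possible vCPU eventually drives `region_size` below the asserted minimum of two pages:

```python
# Toy model of the tcg_region_init() size check -- NOT QEMU's exact logic.
# Assumptions: one region per possible vCPU, a 1 GiB translation buffer,
# and 64 KiB pages (typical for ppc64 hosts).

PAGE_SIZE = 64 * 1024

def region_size(buffer_size, n_regions, page_size=PAGE_SIZE):
    """Split the buffer into n_regions, aligning each region down to a page."""
    return (buffer_size // n_regions) & ~(page_size - 1)

def region_init_ok(buffer_size, n_regions, page_size=PAGE_SIZE):
    # Mirrors the failing check: assert(region_size >= 2 * page_size)
    return region_size(buffer_size, n_regions, page_size) >= 2 * page_size

buf = 1024 * 1024 * 1024                  # 1 GiB (illustrative)
print(region_init_ok(buf, 8192))          # -> True  (128 KiB per region)
print(region_init_ok(buf, 8193))          # -> False (one page per region)
```

With these assumed numbers the flip happens right around 8193 regions, matching the reported threshold; a user-friendly behavior would be a clear error message instead of an assertion failure, as the reporter suggests.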
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/2063 b/results/classifier/gemma3:12b/performance/2063 new file mode 100644 index 00000000..08c96c89 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2063 @@ -0,0 +1,56 @@ + +Poor performance with -accel whpx on Server 2022 host, windows 10 guest - missing CPUID hypervisor ident data? +Description of problem: +**Performance of Windows 10 x64 QEMU virtual machine is essentially unusable, compared to same image running under Hyper-V on the same host system.** + +It appears the VM is not being provided the Hyper-V enlightenment hints while running under QEMU. The hv-XXX cpu options do not appear applicable to -accel WHPX. + +Below are dumps of the 0x40000000 cpuid values on the host, QEMU guest, and Hyper-V guest (exact same .VHD file used, nested virtualization not enabled for this VM). + +host: +- 0x40000000 eax=4000000c ebx=7263694d ecx=666f736f edx=76482074 +- 0x40000001 eax=31237648 ebx=0 ecx=0 edx=0 +- 0x40000002 eax=4f7c ebx=a0000 ecx=2 edx=85d +- 0x40000003 eax=bfff ebx=2bb9ff ecx=22 edx=71fffbf6 +- 0x40000004 eax=50d1c ebx=fff ecx=0 edx=0 +- 0x40000005 eax=400 ebx=400 ecx=ba00 edx=0 +- 0x40000006 eax=1e00be ebx=0 ecx=0 edx=0 +- 0x40000007 eax=80000007 ebx=3 ecx=0 edx=0 +- 0x40000008 eax=100001 ebx=1 ecx=aaaa edx=0 +- 0x40000009 eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000a eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000b eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000c eax=0 ebx=0 ecx=0 edx=0 + +qemu guest with -accel whpx : +- 0x40000000 eax=40000010 ebx=0 ecx=0 edx=0 +- 0x40000001 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000002 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000003 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000004 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000005 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000006 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000007 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000008 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000009 eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000a eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000b eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000c 
eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000d eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000e eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000f eax=0 ebx=0 ecx=0 edx=0 +- 0x40000010 eax=225519 ebx=30d40 ecx=0 edx=0 + +hyperv guest VM: (nested virtualization not enabled) +- 0x40000000 eax=4000000b ebx=7263694d ecx=666f736f edx=76482074 +- 0x40000001 eax=31237648 ebx=0 ecx=0 edx=0 +- 0x40000002 eax=4f7c ebx=a0000 ecx=2 edx=85d +- 0x40000003 eax=ae7f ebx=388030 ecx=22 edx=e0bed7b2 +- 0x40000004 eax=40c2c ebx=fff ecx=0 edx=0 +- 0x40000005 eax=f0 ebx=400 ecx=ba00 edx=0 +- 0x40000006 eax=e ebx=0 ecx=0 edx=0 +- 0x40000007 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000008 eax=0 ebx=0 ecx=0 edx=0 +- 0x40000009 eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000a eax=0 ebx=0 ecx=0 edx=0 +- 0x4000000b eax=0 ebx=0 ecx=0 edx=0 diff --git a/results/classifier/gemma3:12b/performance/2094 b/results/classifier/gemma3:12b/performance/2094 new file mode 100644 index 00000000..b19ba6d3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2094 @@ -0,0 +1,8 @@ + +Various record/replay avocado tests hang when run under gitlab CI +Description of problem: +While previous fixes have gone in including #2010 and #2013 we are still seeing +hangs on CI. 
Some examples: + + https://gitlab.com/thuth/qemu/-/jobs/5910241580#L227 + https://gitlab.com/thuth/qemu/-/jobs/5910241593#L396 diff --git a/results/classifier/gemma3:12b/performance/2137 b/results/classifier/gemma3:12b/performance/2137 new file mode 100644 index 00000000..a3b4befe --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2137 @@ -0,0 +1,2 @@ + +RISC-V Vector Slowdowns diff --git a/results/classifier/gemma3:12b/performance/2162 b/results/classifier/gemma3:12b/performance/2162 new file mode 100644 index 00000000..4ae2f740 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2162 @@ -0,0 +1,2 @@ + +Some subtests have over-optimistic timeouts and time out on the s390 runner diff --git a/results/classifier/gemma3:12b/performance/217 b/results/classifier/gemma3:12b/performance/217 new file mode 100644 index 00000000..d538d0a2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/217 @@ -0,0 +1,2 @@ + +Qemu does not force SSE data alignment diff --git a/results/classifier/gemma3:12b/performance/2181 b/results/classifier/gemma3:12b/performance/2181 new file mode 100644 index 00000000..b74ec212 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2181 @@ -0,0 +1,4 @@ + +-icount mips/gips/kips options on QEMU for more advanced icount option +Additional information: +Changing IPS in QEMU affects the frequency of VGA updates, the duration of time before a key starts to autorepeat, and the measurement of BogoMips and other benchmarks. diff --git a/results/classifier/gemma3:12b/performance/2183 b/results/classifier/gemma3:12b/performance/2183 new file mode 100644 index 00000000..acb77a9a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2183 @@ -0,0 +1,21 @@ + +aarch-64 emulation much slower since release 8.1.5 (issue also present on 8.2.1) +Description of problem: +Since QEMU 8.1.5 our aarch64 based emulation got much slower. We use a linux 5.4 kernel which we cross-compile with the ARM toolchain. 
Things that are noticeable: +- Boot time got a lot longer +- All memory accesses seem to take 3x longer (this can be verified by e.g. executing the script below; the address does not matter): +``` +date +for i in $(seq 0 1000); do + devmem 0x200000000 2>/dev/null +done +date +``` +Steps to reproduce: +Just boot an ARM-based kernel on the virt machine and execute the above script. +Additional information: +I've tried reproducing the issue on the master branch. There the issue is not present. It only seems to be present on releases 8.1.5 and 8.2.1. + +I've narrowed the problem down to the following commit on the 8.2 branch (@bonzini): ef74024b76bf285e247add8538c11cb3c7399a1a accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses. + +Let me know if any other information / tests are required. diff --git a/results/classifier/gemma3:12b/performance/2193 b/results/classifier/gemma3:12b/performance/2193 new file mode 100644 index 00000000..350ec9fd --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2193 @@ -0,0 +1,31 @@ + +qemu-system-mips64el 70 times slower than qemu -ppc64, -riscv64, -s390x +Description of problem: +I installed Debian 12 inside a `qemu-system-mips64el` virtual machine. Performance is awfully slow, roughly 70 times slower than other qemu targets on the same host, namely ppc64, riscv64, s390x. + +The idea is to recompile and test an open source project on various platforms. + +Using a command such as `time make path/to/bin/file.o`, I compiled one single source file on the host and within qemu for various targets. The same source file, inside the same project, is used in all cases. + +The results are shown below (the "x" number between parentheses is the time factor compared to the compilation on the host).
+ +- Host (native): 0m1.316s +- qemu-system-ppc64: 0m31.622s (x24) +- qemu-system-riscv64: 0m40.691s (x31) +- qemu-system-s390x: 0m43.459s (x33) +- qemu-system-mips64el: 48m33.587s (x2214) + +The compilation of the same source is 24 to 33 times slower on the first three emulated targets, compared to the same compilation on the host, which is understandable. However, the same compilation on the mips64el target is 2214 times slower than the host, roughly 70 times slower than the other emulated targets. + +Why do we have such a tremendous difference between qemu mips64el and other targets? +Additional information: +For reference, here are the qemu command lines used to boot the other targets. Guest OSes are Debian 12 or Ubuntu 22. +``` +qemu-system-ppc64 -smp 8 -m 8192 -nographic ... +qemu-system-riscv64 -machine virt -smp 8 -m 8192 -nographic ... +qemu-system-s390x -machine s390-ccw-virtio -cpu max,zpci=on -smp 8 -m 8192 -nographic ... +``` + +The other targets use `-smp 8` while qemu-system-mips64el does not support smp. However, the test compiles one single source file and does not (or only marginally) use more than one CPU. + +Arguably, each compilation addresses a different target, uses a different backend, and the compilation time is not necessarily identical. OK, but 70 times slower seems way too much for this.
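The factors quoted above can be re-derived from the raw `time` values; a small sanity-check script (timings copied from this report):

```python
# Sanity-check the slowdown factors quoted above from the raw `time` values.
import re

def to_seconds(t):
    """Parse a `time`-style duration like '48m33.587s' into seconds."""
    m = re.fullmatch(r'(?:(\d+)m)?([\d.]+)s', t)
    return int(m.group(1) or 0) * 60 + float(m.group(2))

host = to_seconds('0m1.316s')
timings = {
    'ppc64':    '0m31.622s',
    'riscv64':  '0m40.691s',
    's390x':    '0m43.459s',
    'mips64el': '48m33.587s',
}
for target, t in timings.items():
    print(f'{target}: x{to_seconds(t) / host:.0f}')
# prints x24, x31, x33 and x2214 -- matching the report's factors
```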
diff --git a/results/classifier/gemma3:12b/performance/2216 b/results/classifier/gemma3:12b/performance/2216 new file mode 100644 index 00000000..d3cc3cc9 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2216 @@ -0,0 +1,4 @@ + +Increased artifact generation speed with parallelized process +Additional information: +`parallel-jobs` was referenced `main` diff --git a/results/classifier/gemma3:12b/performance/2218 b/results/classifier/gemma3:12b/performance/2218 new file mode 100644 index 00000000..b7d3bca3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2218 @@ -0,0 +1,13 @@ + +MIDI playback issue on Windows 98 / 2000 / XP guest +Description of problem: +In a Windows 98 / 2000 / XP guest, playing back MIDI with Windows Media Player causes slow audio. + +In a Windows 98 / 2000 / XP guest, playing back MP3, WMA or WAV files with Windows Media Player works OK. +Steps to reproduce: +1. In Windows XP guest, open C:\WINDOWS\Media\Flourish.mid using Windows Media Player. +2. In Windows XP guest, open C:\WINDOWS\System32\OOBE\images\title.wma using Windows Media Player. +3. In Windows 98 guest, open C:\WINDOWS\Media\Passport.mid using Windows Media Player. +4. In Windows 98 guest, open C:\WINDOWS\Application Data\Microsoft\WELCOME\WELCOM98.WAV using Windows Media Player.
+Additional information: + diff --git a/results/classifier/gemma3:12b/performance/2221 b/results/classifier/gemma3:12b/performance/2221 new file mode 100644 index 00000000..1da5f056 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2221 @@ -0,0 +1,2 @@ + +CI timeouts on 'gcov' job: test-bufferiszero, test-crypto-tlscredsx509 diff --git a/results/classifier/gemma3:12b/performance/223 b/results/classifier/gemma3:12b/performance/223 new file mode 100644 index 00000000..a6c4b974 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/223 @@ -0,0 +1,2 @@ + +guest migration 100% cpu freeze bug diff --git a/results/classifier/gemma3:12b/performance/2237 b/results/classifier/gemma3:12b/performance/2237 new file mode 100644 index 00000000..28a80956 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2237 @@ -0,0 +1,40 @@ + +mirror block job memory leak +Description of problem: +After creating a background mirror job, if the connection to the mirror target storage is interrupted and writes cannot be performed, the qemu process memory increases significantly every time the mirror job performs a write. When the target storage is restored, the pending writes complete normally, but the memory is not freed. +Steps to reproduce: +1. start a virtual machine with libvirt(virsh start file) +2. add a target mirror block dev, configure io timeout to 2 sec(virsh qemu-monitor-command file --pretty '{"execute": "blockdev-add", "arguments": {"driver": "raw", "cache": {"direct": true}, "node-name": "node-target","file": {"driver": "rbd", "conf":"/etc/pve/ceph.node53.conf", "pool": "test", "image": "rbd1", "auth-client-required": ["none"], "server": [{"host": "10.0.12.53", "port": "6789"}]}}}') +3.
create a background mirror block job(virsh qemu-monitor-command file --pretty '{ "execute": "blockdev-mirror", "arguments": {"device": "libvirt-1-format", "target": "node-target", "sync": "full", "copy-mode": "background", "on-target-error": "ignore", "job-id": "job0"}}') +4. wait for the initial full synchronization to complete +5. write a large number of random I/Os in the virtual machine with the fio program(fio -filename=/dev/vdb -direct=1 -iodepth 1 -thread -rw=randwrite -ioengine=psync -bs=4k -size=4G -numjobs=1 -runtime=300 -group_reporting -name=sep) +6. break the connection to the remote storage or shut down the remote storage while the fio program is running (if the connection is interrupted first and I/O is written afterwards, the probability of reproducing the issue is very low) +7. qemu will report an error indicating that the I/O write failed, and retry the write(qemu-kvm: rbd request failed: cmd 1 offset 1421803520 bytes 1048576 flags 0 task.ret -110 (Connection timed out)) +8. use the numastat command to continuously observe the memory usage of the process and find that the heap memory has increased significantly.
+ +``` +Per-node process memory usage (in MBs) for PID 946492 (qemu-kvm) + Node 0 Total + --------------- --------------- +Huge 2048.00 2048.00 +Heap 2698.13 2698.13 +Stack 0.71 0.71 +Private 781.48 781.48 +---------------- --------------- --------------- +Total 5528.32 5528.32 + +after a while + +Per-node process memory usage (in MBs) for PID 1059068 (qemu-kvm) + Node 0 Total + --------------- --------------- +Huge 2048.00 2048.00 +Heap 21769.94 21769.94 +Stack 0.71 0.71 +Private 827.22 827.22 +---------------- --------------- --------------- +Total 24645.87 24645.87 +``` +Additional information: +libvirt xml: +[file.xml](/uploads/82ff2e410183f94fde7cbaf19e7911dc/file.xml) diff --git a/results/classifier/gemma3:12b/performance/2285 b/results/classifier/gemma3:12b/performance/2285 new file mode 100644 index 00000000..26a4a8ef --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2285 @@ -0,0 +1,2 @@ + +cross-i686-tci job intermittent timeouts diff --git a/results/classifier/gemma3:12b/performance/229 b/results/classifier/gemma3:12b/performance/229 new file mode 100644 index 00000000..5cf54b2a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/229 @@ -0,0 +1,2 @@ + +build-tools-and-docs-debian job wastes cycles building pointless things diff --git a/results/classifier/gemma3:12b/performance/2381 b/results/classifier/gemma3:12b/performance/2381 new file mode 100644 index 00000000..f3483c1c --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2381 @@ -0,0 +1,4 @@ + +Modern x86 TSC features under TCG +Additional information: +I may be able to find a volunteer to implement this. If this feature does not appear to be a good first task, please let me know.
diff --git a/results/classifier/gemma3:12b/performance/2398 b/results/classifier/gemma3:12b/performance/2398 new file mode 100644 index 00000000..a24a470f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2398 @@ -0,0 +1,63 @@ + +qemu stalls when taking LUKS encrypted snapshot +Description of problem: +We have been dealing with an issue recently, where qemu occasionally stalls when taking LUKS encrypted snapshots. We were able to take several core dumps (one example below) when the issue was happening and, upon analyzing those, we found out that the issue is that the function [qcrypto_pbkdf2_count_iters](https://github.com/qemu/qemu/blob/master/crypto/pbkdf.c#L88) reaches an iteration number high enough that the algorithm takes a long time to finish. + +Upon investigation, we were able to see that this is happening because [start_ms and end_ms](https://github.com/qemu/qemu/blob/master/crypto/pbkdf.c#L115) have the same value, giving a delta of zero, causing the number of iterations to always increase. 
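The failure mode can be reproduced with a toy model of the doubling loop. This is not the actual scaling logic from `crypto/pbkdf.c` (the real code derives the count from the measured delta differently); the injected clocks below are hypothetical stand-ins for getrusage(), but they show why a timer that never advances makes the iteration count grow without bound:

```python
# Toy model of the iteration-count search in qcrypto_pbkdf2_count_iters().
# NOT the real crypto/pbkdf.c logic -- the timers below are injected
# stand-ins for getrusage() so the delta_ms == 0 case is reproducible.

def count_iters(timer_ms, run, target_ms=1000, max_iters=1 << 40):
    iterations = 1 << 15              # same starting point as pbkdf.c
    while iterations < max_iters:
        start_ms = timer_ms()
        run(iterations)               # stand-in for qcrypto_pbkdf2()
        delta_ms = timer_ms() - start_ms
        if delta_ms >= 10:            # measurable: scale to target and stop
            return iterations * target_ms // delta_ms
        iterations *= 2               # zero/tiny delta: double and retry
    raise RuntimeError('benchmark clock never advanced')

# A stalled clock -- what the core dump shows (start_ms == end_ms) --
# can never leave the loop through the scaling branch:
try:
    count_iters(lambda: 35357141, run=lambda n: None)
except RuntimeError as e:
    print(e)                          # -> benchmark clock never advanced

# A clock with real resolution terminates quickly (here 1 ms of fake
# "CPU time" per 2**15 iterations, purely illustrative):
clock = [0]
def fake_run(n):
    clock[0] += n >> 15
print(count_iters(lambda: clock[0], fake_run))   # -> 32768000
```

One possible direction, offered only as a suggestion: when the getrusage() delta is zero after a doubling step, fall back to a monotonic wall clock (e.g. clock_gettime(CLOCK_MONOTONIC)) or bail out with an error rather than doubling again.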
+ +Here are the important parts of the coredump: + +``` +(gdb) bt +#0 0x00007fb00aba5489 in _gcry_sha256_transform_amd64_avx2 () at ../../cipher/sha256-avx2-bmi2-amd64.S:346 +#1 0x00007fb00aba3000 in sha256_final (context=0x55ab875d5028) at ../../cipher/sha256.c:591 +#2 0x00007fb00ab19dea in md_final (a=0x55ab82e1bf50) at ../../cipher/md.c:800 +#3 0x00007fb00ab19f89 in md_final (a=a@entry=0x55ab82e1bf50) at ../../cipher/md.c:1003 +#4 _gcry_md_ctl (hd=hd@entry=0x55ab82e1bf50, buflen=0, buffer=0x0, cmd=5) at ../../cipher/md.c:1012 +#5 0x00007fb00ab1a4d0 in _gcry_md_ctl (buflen=0, buffer=0x0, cmd=5, hd=0x55ab82e1bf50) at ../../cipher/md.c:1106 +#6 _gcry_md_read (hd=0x55ab82e1bf50, algo=algo@entry=0) at ../../cipher/md.c:1110 +#7 0x00007fb00ab1d9ef in _gcry_kdf_pkdf2 (passphrase=passphrase@entry=0x55ab8177f040, passphraselen=passphraselen@entry=64, hashalgo=hashalgo@entry=8, salt=salt@entry=0x55ab8397a1d4, saltlen=saltlen@entry=32, + iterations=iterations@entry=32768000000, keysize=20, keybuffer=0x55ab817693c0) at ../../cipher/kdf.c:213 +#8 0x00007fb00ab1de3c in _gcry_kdf_pkdf2 (keybuffer=0x55ab817693c0, keysize=20, iterations=32768000000, saltlen=32, salt=0x55ab8397a1d4, hashalgo=8, passphraselen=64, passphrase=0x55ab8177f040) at ../../cipher/kdf.c:144 +#9 _gcry_kdf_derive (passphrase=0x55ab8177f040, passphraselen=64, algo=34, subalgo=8, salt=0x55ab8397a1d4, saltlen=32, iterations=32768000000, keysize=20, keybuffer=0x55ab817693c0) at ../../cipher/kdf.c:286 +#10 0x00007fb00ab02299 in gcry_kdf_derive (passphrase=passphrase@entry=0x55ab8177f040, passphraselen=passphraselen@entry=64, algo=algo@entry=34, hashalgo=hashalgo@entry=8, salt=salt@entry=0x55ab8397a1d4, saltlen=saltlen@entry=32, + iterations=32768000000, keysize=20, keybuffer=0x55ab817693c0) at ../../src/visibility.c:1337 +#11 0x000055ab7f80ff83 in qcrypto_pbkdf2 (hash=hash@entry=QCRYPTO_HASH_ALG_SHA256, key=key@entry=0x55ab8177f040 "\b@\327\061\177F\f\345\200Bw#", nkey=nkey@entry=64, + 
salt=salt@entry=0x55ab8397a1d4 "\"ͧ\322+\201!\375\177\020\037\252Hg$\271\021\340\343T\021OKָ\234m\304\066g\024\276", nsalt=nsalt@entry=32, iterations=iterations@entry=32768000000, + out=0x55ab817693c0 "C[\210\003\332\017b\350\f\257\377UP\257\262\275\033\v\034(", nout=20, errp=0x7fa7565e5df8) at ./crypto/pbkdf-gcrypt.c:75 +#12 0x000055ab7f80fe66 in qcrypto_pbkdf2_count_iters (hash=hash@entry=QCRYPTO_HASH_ALG_SHA256, key=key@entry=0x55ab8177f040 "\b@\327\061\177F\f\345\200Bw#", nkey=64, + salt=salt@entry=0x55ab8397a1d4 "\"ͧ\322+\201!\375\177\020\037\252Hg$\271\021\340\343T\021OKָ\234m\304\066g\024\276", nsalt=nsalt@entry=32, nout=nout@entry=20, errp=0x7fa7565e5df8) at ./crypto/pbkdf.c:80 +#13 0x000055ab7f812930 in qcrypto_block_luks_create (block=0x55ab82944540, options=<optimized out>, optprefix=<optimized out>, initfunc=0x55ab7f7abad0 <qcow2_crypto_hdr_init_func>, writefunc=0x55ab7f7ac040 <qcow2_crypto_hdr_write_func>, + opaque=0x55ab83a32290, errp=0x55ab823873d0) at ./crypto/block-luks.c:1362 +#14 0x000055ab7f810d80 in qcrypto_block_create (options=options@entry=0x55ab818e1f40, optprefix=optprefix@entry=0x55ab7f99912b "encrypt.", initfunc=initfunc@entry=0x55ab7f7abad0 <qcow2_crypto_hdr_init_func>, + writefunc=writefunc@entry=0x55ab7f7ac040 <qcow2_crypto_hdr_write_func>, opaque=opaque@entry=0x55ab83a32290, errp=errp@entry=0x55ab823873d0) at ./crypto/block.c:106 +#15 0x000055ab7f7b0f79 in qcow2_set_up_encryption (errp=0x55ab823873d0, cryptoopts=0x55ab818e1f40, bs=0x55ab83a32290) at ./block/qcow2.c:2996 +#16 qcow2_co_create (create_options=<optimized out>, errp=0x55ab823873d0) at ./block/qcow2.c:3529 +#17 0x000055ab7f7e2fca in blockdev_create_run (job=0x55ab82387350, errp=0x55ab823873d0) at ./block/create.c:46 +#18 0x000055ab7f79cf6f in job_co_entry (opaque=0x55ab82387350) at ./job.c:878 +#19 0x000055ab7f87e09c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ./util/coroutine-ucontext.c:115 +#20 0x00007fb009a14680 in ?? 
() from /lib/x86_64-linux-gnu/libc.so.6 +#21 0x00007ffd40716530 in ?? () +#22 0x0000000000000000 in ?? () +(gdb) frame 12 +#12 0x000055ab7f80fe66 in qcrypto_pbkdf2_count_iters (hash=hash@entry=QCRYPTO_HASH_ALG_SHA256, key=key@entry=0x55ab8177f040 "\b@\327\061\177F\f\345\200Bw#", nkey=64, + salt=salt@entry=0x55ab8397a1d4 "\"ͧ\322+\201!\375\177\020\037\252Hg$\271\021\340\343T\021OKָ\234m\304\066g\024\276", nsalt=nsalt@entry=32, nout=nout@entry=20, errp=0x7fa7565e5df8) at ./crypto/pbkdf.c:80 +80 if (qcrypto_pbkdf2(hash, +(gdb) info locals +ret = 18446744073709551615 +out = 0x55ab817693c0 "C[\210\003\332\017b\350\f\257\377UP\257\262\275\033\v\034(" +iterations = 32768000000 +delta_ms = <optimized out> +start_ms = 35357141 +end_ms = 35357141 +``` + +We did some investigation on the getrusage system call, which is [used to calculate start_ms and end_ms](https://github.com/qemu/qemu/blob/master/crypto/pbkdf.c#L72) and found some patches which indicate that it might not be that accurate: + +https://github.com/torvalds/linux/commit/3dc167ba5729ddd2d8e3fa1841653792c295d3f1 + +https://lore.kernel.org/lkml/20221226031010.4079885-1-maxing.lan@bytedance.com/t/#m1c7f2fdc0ea742776a70fd1aa2a2e414c437f534 + +So far we have only seen this with Windows guests, but it might be a red herring. It happens maybe once a month and we were unable to get a reproducer. + +I'm open to proposing a fix for this, but how could we measure this without relying on getrusage which is causing us trouble? Any other suggestions or tips on this? diff --git a/results/classifier/gemma3:12b/performance/2426 b/results/classifier/gemma3:12b/performance/2426 new file mode 100644 index 00000000..dc2a4b54 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2426 @@ -0,0 +1,2 @@ + +How to determine which cpu microarchitecture is suitable for use on Windows 11? 
diff --git a/results/classifier/gemma3:12b/performance/2435 b/results/classifier/gemma3:12b/performance/2435 new file mode 100644 index 00000000..ecdc61a3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2435 @@ -0,0 +1,21 @@ + +CPU halted during fuzzing OHCI +Description of problem: +Is there a limit on the number of CPU cores that QEMU can use? I am running multiple sets of parallel fuzzing tests on a host machine. To prevent CPU contention, I have divided the running environments by using docker. The docker startup command is as follows: +`docker run --cpuset-cpus=8-15 --privileged --name qemu-container-ohci -it qemu-container bash` + +I found that the CPU is in a halted state and encountered the following error: +``` +#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=126899170563648) at ./nptl/pthread_kill.c:44 +#1 __pthread_kill_internal (signo=6, threadid=126899170563648) at ./nptl/pthread_kill.c:78 +#2 __GI___pthread_kill (threadid=126899170563648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 +#3 0x0000736a904a3476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 +#4 0x0000736a904897f3 in __GI_abort () at ./stdlib/abort.c:79 +#5 0x0000736a90dcbb57 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#6 0x0000736a90e2570f in g_assertion_message_expr () at /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#7 0x00005eca4aff5bad in mttcg_cpu_thread_fn (arg=0x62b000000200) at ../accel/tcg/tcg-accel-ops-mttcg.c:110 +#8 0x00005eca4b89d658 in qemu_thread_start (args=0x60300008b030) at ../util/qemu-thread-posix.c:541 +#9 0x0000736a904f5ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 +#10 0x0000736a90587850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 +``` +Can someone help analyze the reason? 
diff --git a/results/classifier/gemma3:12b/performance/2456 b/results/classifier/gemma3:12b/performance/2456 new file mode 100644 index 00000000..4ab9be62 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2456 @@ -0,0 +1,2 @@ + +check-tcg multi-threaded tests fail on ppc64 on clang-user CI job diff --git a/results/classifier/gemma3:12b/performance/2460 b/results/classifier/gemma3:12b/performance/2460 new file mode 100644 index 00000000..7118b90e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2460 @@ -0,0 +1,9 @@ + +Significant performance degradation of qemu-x86_64 starting from version 3 on aarch64 +Description of problem: +When I ran CoreMark with different qemu user-mode versions (guest x86-64 -> host arm64), I found that the performance was highest with QEMU 2.x versions, and there was a significant performance degradation starting from QEMU version 3. What is the reason? + +| qemu version | 2.5.1 | 2.8.0 | 2.9.0 | 2.9.1 | 3.0.0 | 4.0.0 | 5.2.0 | 6.2.0 | 7.2.13 | 8.2.6 | 9.0.1 | +|------------------------------------------|-------------|-------------|-------------|-------------|-------------|-------------|------------|-------------|-------------|-------------|-------------| +| coremark score | 3905.995703 | 4465.947153 | 4534.119247 | 4538.577912 | 1167.337886 | 1163.399453 | 928.348384 | 1327.051954 | 1301.659616 | 1034.714677 | 1085.304971 | diff --git a/results/classifier/gemma3:12b/performance/2491 b/results/classifier/gemma3:12b/performance/2491 new file mode 100644 index 00000000..cc628c94 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2491 @@ -0,0 +1,16 @@ + +Performance Regression in QEMU (amd64 Emulating LoongArch64) from 8.0.4 to 9.0.2 +Description of problem: +Previous Performance: In May 2023, we were using QEMU 8.0.4 for qemu-user emulation, and the performance was satisfactory. The setup did not include LSX (Loongson SIMD Extensions) support in either the system or QEMU. 
Performance results are shown in Figure A. + +Current Performance: Recently, we upgraded to QEMU 9.0.2. Both the system and QEMU now support vectorized instruction sets. However, we observed a performance decline to less than 60% of the previous benchmarks. + +We found that the slowdown is not caused by LSX. Disabling LSX compilation in the new version results in even worse performance. However, there are indeed significant differences between the two systems in terms of LSX support. +Additional information: +We use systemd-nspawn and qemu-binfmt for containerized operations. You can get a clean chroot from the lcpu release [here](https://github.com/lcpu-club/loongarchlinux-dockerfile/releases/download/20240806/base-devel-loong64-20240806.tar.zst) + +Figure A, performance in May 2023 + + +Figure B, performance nowadays + diff --git a/results/classifier/gemma3:12b/performance/2529 b/results/classifier/gemma3:12b/performance/2529 new file mode 100644 index 00000000..372ea388 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2529 @@ -0,0 +1,33 @@ + +`stack smashing detected` running arm64 image from amd64 machine +Description of problem: +When running a linux/arm64 `ubuntu:20.04` docker image on a linux/amd64 machine, a single command `apt-get update` will throw the error below +```sh +root@189bd36b9ae7:/# apt-get update +0% [Working]*** stack smashing detected ***: terminated +Reading package lists... Done +E: Method http has died unexpectedly! +E: Sub-process http received signal 6. + +``` + +So far, this has been reproduced with ubuntu:18.04, ubuntu:20.04, and ubuntu:22.04 + +If the same image is run directly on an ARM64 host, the issue is gone +Steps to reproduce: +1. install QEMU on an AMD64 host machine (Ubuntu20) + ```sh + docker run --rm --privileged multiarch/qemu-user-static --reset -p yes + ``` +2. run linux/arm64 docker image of ubuntu:20.04 + ```sh + docker run --platform linux/arm64 -it --entrypoint /bin/bash ubuntu:20.04 + ``` +3. 
from within the container, run `apt-get update`; it will throw the error below + ```sh + root@189bd36b9ae7:/# apt-get update + 0% [Working]*** stack smashing detected ***: terminated + Reading package lists... Done + E: Method http has died unexpectedly! + E: Sub-process http received signal 6. + ``` diff --git a/results/classifier/gemma3:12b/performance/2564 b/results/classifier/gemma3:12b/performance/2564 new file mode 100644 index 00000000..f86cfdc2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2564 @@ -0,0 +1,2 @@ + +ubuntu-22.04-s390x-all-system CI job often times out diff --git a/results/classifier/gemma3:12b/performance/2572 b/results/classifier/gemma3:12b/performance/2572 new file mode 100644 index 00000000..75c59482 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2572 @@ -0,0 +1,31 @@ + +Guest OS = Windows, qemu: shutdown very slow, memory allocation issue +Description of problem: +Simplifying - libvirt config: +``` +<memory unit='KiB'>33554432</memory> + <currentMemory unit='KiB'>131072</currentMemory> +``` +When `<currentMemory>` is set lower than `<memory>`, the cpu hangs at 100% at/after guest OS shutdown, and this lasts a long time - approximately 3-5 minutes. +If changed to +``` +<memory unit='KiB'>33554432</memory> + <currentMemory unit='KiB'>33554432</currentMemory> +``` +then shutdown takes only a few seconds. + +The problem does not occur (VM shutdown takes a few seconds) when the balloon device is not in use: +1. `<currentMemory>` equal to `<memory>` +2. memballoon driver disabled in Windows +3. memballoon disabled in libvirt with "model=none" (and therefore not passed to the qemu command line) +Additional information: +On the guest: + * used drivers from virtio-win-0.1.262.iso - memballoon ver 100.95.104.26200 + * possibly a combination of all or some components + +Monitored the following: +with `virsh dommemstat VMName` at shutdown time, "rss" grows up to MaxMem, but very slowly. 
+Also, on `virsh setmem VMName --live --size 32G`, +rss grows slowly - but takes about half the time it does at a simple shutdown (= at shutdown, memory allocation and deallocation seem to occur at the same time). + +So something in some or all of the libvirt/qemu/balloon parts is not quite right. diff --git a/results/classifier/gemma3:12b/performance/2632 b/results/classifier/gemma3:12b/performance/2632 new file mode 100644 index 00000000..4a3e255f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2632 @@ -0,0 +1,84 @@ + +tcg optimization breaking memory access ordering +Description of problem: +The following code creates a register dependency between 2 loads, which forces the first load to finish before the second: +``` +movz w0, #0x2 +str w0, [x1] +ldr w2, [x1] +eor w3, w2, w2 +ldr w4, [x5, w3, sxtw] +``` + +While translating it to TCG IR, QEMU keeps this dependency correctly. +But after running the TCG optimizations, the TCG sequence for `eor w3, w2, w2` at `0000000000000144` is optimized to `mov_i64 x3,$0x0`, which removes the dependency between the loads. + +This results in incorrect behavior on the host in a multi-threaded program. +Steps to reproduce: +1. +2. +3. 
+Additional information: +``` +OP: + ld_i32 loc0,env,$0xfffffffffffffff0 + brcond_i32 loc0,$0x0,lt,$L0 + st8_i32 $0x0,env,$0xfffffffffffffff4 + + ---- 0000000000000134 0000000000000000 0000000000000000 + add_i64 x28,x28,$0x2 + + ---- 0000000000000138 0000000000000000 0000000000000000 + mov_i64 x0,$0x2 + + ---- 000000000000013c 0000000000000000 0000000000001c00 + mov_i64 loc3,x1 + mov_i64 loc4,loc3 + qemu_st_a64_i64 x0,loc4,w16+un+leul,2 + + ---- 0000000000000140 0000000000000000 0000000000001c10 + mov_i64 loc5,x1 + mov_i64 loc6,loc5 + qemu_ld_a64_i64 x2,loc6,w16+un+leul,2 + + ---- 0000000000000144 0000000000000000 0000000000000000 + and_i64 loc7,x2,$0xffffffff + xor_i64 x3,x2,loc7 + and_i64 x3,x3,$0xffffffff + + ---- 0000000000000148 0000000000000000 0000000000001c20 + mov_i64 loc9,x5 + mov_i64 loc10,x3 + ext32s_i64 loc10,loc10 + add_i64 loc9,loc9,loc10 + mov_i64 loc11,loc9 + qemu_ld_a64_i64 x4,loc11,w16+un+leul,2 + st8_i32 $0x1,env,$0xfffffffffffffff4 +``` + + +``` +OP after optimization and liveness analysis: + ld_i32 tmp0,env,$0xfffffffffffffff0 pref=0xffffffff + brcond_i32 tmp0,$0x0,lt,$L0 dead: 0 + st8_i32 $0x0,env,$0xfffffffffffffff4 dead: 0 + + ---- 0000000000000134 0000000000000000 0000000000000000 + add_i64 x28,x28,$0x2 sync: 0 dead: 0 1 pref=0xffffffff + + ---- 0000000000000138 0000000000000000 0000000000000000 + mov_i64 x0,$0x2 sync: 0 dead: 0 pref=0xffffffff + + ---- 000000000000013c 0000000000000000 0000000000001c00 + qemu_st_a64_i64 $0x2,x1,w16+un+leul,2 dead: 0 + + ---- 0000000000000140 0000000000000000 0000000000001c10 + qemu_ld_a64_i64 x2,x1,w16+un+leul,2 sync: 0 dead: 0 1 pref=0xffffffff + + ---- 0000000000000144 0000000000000000 0000000000000000 + mov_i64 x3,$0x0 sync: 0 dead: 0 1 pref=0xffffffff + + ---- 0000000000000148 0000000000000000 0000000000001c20 + qemu_ld_a64_i64 x4,x5,w16+un+leul,2 sync: 0 dead: 0 1 pref=0xffffffff + st8_i32 $0x1,env,$0xfffffffffffffff4 dead: 0 +``` diff --git a/results/classifier/gemma3:12b/performance/2690 
b/results/classifier/gemma3:12b/performance/2690 new file mode 100644 index 00000000..314259f1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2690 @@ -0,0 +1,21 @@ + +"Guest says index 40947 is available" +Description of problem: +As discussed [here](https://github.com/danobi/vmtest/issues/96) I have been running several instances of QEMU in parallel at `SCHED_IDLE`, and I've been getting QGA setup failures. +Steps to reproduce: +1. Install [vmtest](https://github.com/danobi/vmtest) +2. Run lots of copies of the command in the [github issues](https://github.com/danobi/vmtest/issues/96) via `chrt --idle 0`. +3. Unclear if this is the cause, but then I use the computer in the meantime so probably starve the `SCHED_IDLE` QEMU threads running from 2. + +This leads to failures to connect to the guest agent and then at the end I see this: + +``` +Guest says index 40947 is available + qemu-system-x86_64: Guest says index 40947 is available + qemu-system-x86_64: Guest says index 40947 is available +``` + + +The developer of vmtest seemed to think this may be of interest to QEMU developers based on the tone of the [comment they found](https://github.com/danobi/vmtest/issues/96#issuecomment-2483860554) in the QEMU code. + +I've now installed QEMU from Git master so I can report back whether the bug still appeared. diff --git a/results/classifier/gemma3:12b/performance/2737 b/results/classifier/gemma3:12b/performance/2737 new file mode 100644 index 00000000..e19f1ad6 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2737 @@ -0,0 +1,2 @@ + +Plans for Adding RISC-V Vector (RVV) Backend Support? 
diff --git a/results/classifier/gemma3:12b/performance/2773 b/results/classifier/gemma3:12b/performance/2773 new file mode 100644 index 00000000..a75d4725 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2773 @@ -0,0 +1,61 @@ + +qemu-system-sparc64 sometimes generates endless loops +Description of problem: +Sometimes emulation "stops" in a busy loop hogging 1 cpu completely. +gdb says: + +``` +0x00007d5805460ac5 in code_gen_buffer () +(gdb) info thread + Id Target Id Frame +* 1 LWP 9166 of process 12669 "" 0x00007d5805460ac5 in code_gen_buffer () + 2 LWP 19293 of process 12669 "" 0x00007d584680803a in ____sigtimedwait50 + () from /usr/lib/libc.so.12 + 3 LWP 20202 of process 12669 "" 0x00007d58468249ba in ___lwp_park60 () + from /usr/lib/libc.so.12 + 4 LWP 12669 of process 12669 "" 0x00007d58467b72ca in _sys___pollts50 () + from /usr/lib/libc.so.12 +(gdb) up +#1 0x00000000007b3a0f in cpu_tb_exec (cpu=cpu@entry=0x7d58041ac680, + itb=<optimized out>, tb_exit=tb_exit@entry=0x7d58037ffde8) + at ../accel/tcg/cpu-exec.c:458 +458 ret = tcg_qemu_tb_exec(cpu_env(cpu), tb_ptr); + +(gdb) down +#0 0x00007d5805460ac5 in code_gen_buffer () +(gdb) x/16i $pc +=> 0x7d5805460ac5 <code_gen_buffer+19401368>: mov %r15,0x68(%rbp) + 0x7d5805460ac9 <code_gen_buffer+19401372>: xor %r12,%r14 + 0x7d5805460acc <code_gen_buffer+19401375>: mov %r14,0x80(%rbp) + 0x7d5805460ad3 <code_gen_buffer+19401382>: mov %r12,%rbx + 0x7d5805460ad6 <code_gen_buffer+19401385>: mov %rbx,0x70(%rbp) + 0x7d5805460ada <code_gen_buffer+19401389>: mov %r12,0x78(%rbp) + 0x7d5805460ade <code_gen_buffer+19401393>: mov %r14,%r12 + 0x7d5805460ae1 <code_gen_buffer+19401396>: shr $0x20,%r12 + 0x7d5805460ae5 <code_gen_buffer+19401400>: and $0x1,%r12d + 0x7d5805460ae9 <code_gen_buffer+19401404>: dec %r12 + 0x7d5805460aec <code_gen_buffer+19401407>: and %rbx,%r12 + 0x7d5805460aef <code_gen_buffer+19401410>: mov %r12d,%ebx + 0x7d5805460af2 <code_gen_buffer+19401413>: movb $0x1,-0x4(%rbp) + 0x7d5805460af6 
 <code_gen_buffer+19401417>: cmp %r13,%rbx + 0x7d5805460af9 <code_gen_buffer+19401420>: + je 0x7d5805460b20 <code_gen_buffer+19401459> + 0x7d5805460aff <code_gen_buffer+19401426>: + jmp 0x7d5805460b04 <code_gen_buffer+19401431> +(gdb) list +453 if (qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC)) { +454 log_cpu_exec(log_pc(cpu, itb), cpu, itb); +455 } +456 +457 qemu_thread_jit_execute(); +458 ret = tcg_qemu_tb_exec(cpu_env(cpu), tb_ptr); +459 cpu->neg.can_do_io = true; +460 qemu_plugin_disable_mem_helpers(cpu); +461 /* +462 * TODO: Delay swapping back to the read-write region of the TB +``` +Steps to reproduce: +Unfortunately I have not been able to find a way to reliably reproduce this. +Happens "often" to me, but not always. + +If you have any idea (like: what traces to enable) how to debug this, I'll try to gather more information diff --git a/results/classifier/gemma3:12b/performance/2797 b/results/classifier/gemma3:12b/performance/2797 new file mode 100644 index 00000000..77c78646 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2797 @@ -0,0 +1,4 @@ + +arm/raspi.c - increase memory limit +Additional information: +I can attempt to make a PR that increases this limit, but not sure if others would find it useful. diff --git a/results/classifier/gemma3:12b/performance/2821 b/results/classifier/gemma3:12b/performance/2821 new file mode 100644 index 00000000..1bd60379 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2821 @@ -0,0 +1,24 @@ + +Emulated newer x86 chipsets are noticeably slower on cpu-bound loads than "-cpu qemu64" +Description of problem: +I noticed that "-cpu qemu64" is much faster than "-cpu max" or "-cpu Icelake-Server-noTSX" for cpu-bound loads, and with more than one cpu under load. +Steps to reproduce: +1. Run a guest as per the "qemu-system-x86_64 -cpu max [..]" command from above. Any linux distro should do. +2. 
run through the setup questions if you use Fedora-Server-KVM-41-1.4.x86_64.qcow2 from the example command line above +3. log into the guest via ssh, i.e. "ssh chris@amd64" here +4. cd /dev/shm; wget http://archive.apache.org/dist/httpd/httpd-2.4.57.tar.bz2; wget https://fluxcoil.net/files/tmp/job_httpd_extract_cpu.sh +6. bash ./job_httpd_extract_cpu.sh 4 300 +8. cat /tmp/counter + +Step 6 is executing a script which simply uses 4 parallel loops, where each loop runs "bzcat httpd-2.4.57.tar.bz2" constantly. After 300sec, the successful uncompressions over all 4 loops are summed up and stored in /tmp/counter. + +- result with "-cpu qemu64": 96 +- result with "-cpu max": 84 +- result with "-cpu Icelake-Server-noTSX": 44 +Additional information: +- For "-cpu Icelake-Server-noTSX" on this Thinkpad T590 I get these warnings, I think they are not relevant: + qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.pcid [bit 17] + qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24] + [..] +- I also looked at Broadwell etc, and all of them seem in the same ballpark. + Graph over some emulated architectures: https://fluxcoil.net/files/tmp/gnuplot_cpu-performance-emulated-only.png diff --git a/results/classifier/gemma3:12b/performance/2839 b/results/classifier/gemma3:12b/performance/2839 new file mode 100644 index 00000000..4cb2f4b3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2839 @@ -0,0 +1,36 @@ + +Physical memory usage spikes after migration for a VM using memory-backend-memfd memory +Description of problem: +When starting a virtual machine using the memory-backend-memfd type memory, configuring the virtual machine memory to 256GB or any other size, the QEMU process initially allocates only a little over 4GB of physical memory. However, after migrating the virtual machine, the physical memory occupied by the QEMU process almost equals 256GB. 
In an overcommitted memory environment, the increase in physical memory usage by the virtual machine can lead to insufficient host memory, triggering Out-Of-Memory (OOM). +Steps to reproduce: +1. start vm +./qemu-system-x86_64 -accel kvm -cpu SandyBridge -object memory-backend-memfd,id=mem1,size=256G -machine memory-backend=mem1 -smp 4 -drive file=/nvme0n1/luzhipeng/fusionos.qcow2,if=none,id=drive0,cache=none -device virtio-blk,drive=drive0,bootindex=1 -monitor stdio -vnc :0 +2. start vm on another host +./qemu-system-x86_64 -accel kvm -cpu SandyBridge -object memory-backend-memfd,id=mem1,size=256G -machine memory-backend=mem1 -smp 4 -drive file=/nvme0n1/luzhipeng/fusionos.qcow2,if=none,id=drive0,cache=none -device virtio-blk,drive=drive0,bootindex=1 -monitor stdio -vnc :0 -incoming tcp:0.0.0.0:4444 +3. migrate vm +migrate -d tcp:xx.xx.xx.xx:4444 +4. Check QEMU process memory usage with the top command + +``` +top - 14:01:05 up 35 days, 20:16, 2 users, load average: 0.22, 0.23, 0.18 +Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie +%Cpu(s): 0.2 us, 0.1 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st +MiB Mem : 514595.3 total, 2642.6 free, 401703.3 used, 506435.3 buff/cache +MiB Swap: 0.0 total, 0.0 free, 0.0 used. 112892.0 avail Mem + + PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND +3865345 root 20 0 257.7g 256.1g 256.0g S 1.3 51.0 3:14.44 qemu-system-x86 +``` +Additional information: +The relevant code: +``` +void ram_handle_zero(void *host, uint64_t size) +{ + if (!buffer_is_zero(host, size)) { + memset(host, 0, size); + } +} +``` + +During memory migration, for each zero page the destination side calls buffer_is_zero to check whether the corresponding page is entirely zero; if it is not, it explicitly clears the whole page with memset. For memfd-backed memory, the first access to a page allocates physical memory, so this check ends up allocating physical memory for all of the virtual machine's zero pages. 
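The allocation hazard just described can be modeled with a small sketch. The names are hypothetical and `populated` merely stands in for whatever populated-page tracking a destination could keep; it is not an existing QEMU flag. On a lazily allocated memfd backend, even the read performed by the zero check faults a page in, so the only way to keep an untouched page unallocated is to skip it entirely:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same job as buffer_is_zero(): reads every byte of the page.
 * On a memfd backend this read alone would fault the page in. */
static bool page_is_zero(const unsigned char *host, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (host[i]) {
            return false;
        }
    }
    return true;
}

/* Returns true if the page was touched (i.e. physical memory would be
 * allocated on a memfd backend). A page that was never populated can be
 * left alone: it already reads as zero and stays a hole. */
static bool handle_zero_page(unsigned char *host, size_t size, bool populated)
{
    if (!populated) {
        return false;               /* leave the hole alone: no fault */
    }
    if (!page_is_zero(host, size)) {
        memset(host, 0, size);      /* previously written data must be cleared */
    }
    return true;
}
```

The design point is that correctness only requires clearing pages the destination has already written; deciding "never populated" cheaply (rather than by reading the page) is the hard part.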
diff --git a/results/classifier/gemma3:12b/performance/2841 b/results/classifier/gemma3:12b/performance/2841 new file mode 100644 index 00000000..a4caa8e7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2841 @@ -0,0 +1,12 @@ + +QEMU keeps increasing memory swap; the only solution is to reboot after a freeze. +Description of problem: +Swap starts increasing suddenly and gets to around 60GB before the laptop freezes and "dies". +Steps to reproduce: +Seemingly random; I didn't notice any pattern, it just started happening more often. + diff --git a/results/classifier/gemma3:12b/performance/286 b/results/classifier/gemma3:12b/performance/286 new file mode 100644 index 00000000..fd8e3eb7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/286 @@ -0,0 +1,2 @@ + +Performance degradation for WinXP boot time after b55f54bc diff --git a/results/classifier/gemma3:12b/performance/287 b/results/classifier/gemma3:12b/performance/287 new file mode 100644 index 00000000..1816f59f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/287 @@ -0,0 +1,2 @@ + +block copy job sometimes hangs on the last block for minutes diff --git a/results/classifier/gemma3:12b/performance/2878 b/results/classifier/gemma3:12b/performance/2878 new file mode 100644 index 00000000..c361383a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2878 @@ -0,0 +1,2 @@ + +Support for avx512 in qemu user space emulation. 
diff --git a/results/classifier/gemma3:12b/performance/2885 b/results/classifier/gemma3:12b/performance/2885 new file mode 100644 index 00000000..e59de5e9 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2885 @@ -0,0 +1,2 @@ + +Unable to increase CPUs for existing RISCV VM diff --git a/results/classifier/gemma3:12b/performance/290 b/results/classifier/gemma3:12b/performance/290 new file mode 100644 index 00000000..0c82c908 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/290 @@ -0,0 +1,2 @@ + +mmap MAP_NORESERVE of 2^42 bytes consumes 16Gb of actual RAM diff --git a/results/classifier/gemma3:12b/performance/2906 b/results/classifier/gemma3:12b/performance/2906 new file mode 100644 index 00000000..106ff63f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2906 @@ -0,0 +1,14 @@ + +x86 (32-bit) multicore very slow, but x86-64 is fast (on macOS arm64 host) +Description of problem: +Adding more cores doesn't slow down an x86-32 guest on an x86-64 host, nor an x86-64 guest on an arm64 host. However, extra cores massively slow down an x86-32 guest on an arm64 host. +Steps to reproduce: +1. Run 32-bit guest or 32-bit installer +2. +3. + +I have replicated this over several OSes using homebrew qemu, source-built qemu and UTM. This is not to be confused with a different bug in UTM that caused its version of QEMU to be slow. + +This also seems to apply to 32-bit processes in an x86-64 guest. 
+Additional information: +https://github.com/utmapp/UTM/issues/5468 diff --git a/results/classifier/gemma3:12b/performance/2946 b/results/classifier/gemma3:12b/performance/2946 new file mode 100644 index 00000000..7fa840ba --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2946 @@ -0,0 +1,11 @@ + +crypto/aes.c (used for emulating aes instructions) has a timing side-channel +Description of problem: +https://gitlab.com/qemu-project/qemu/-/blob/a9cd5bc6399a80fcf233ed0fffe6067b731227d8/crypto/aes.c#L1021 + +much of the code in crypto/aes.c accesses memory arrays where the array index is based on the secret data being encrypted/decrypted. because of cpu caches and other things that can delay memory accesses based on their address, this is a timing side-channel, potentially allowing leaking secrets over a network based on timing how long cryptography operations take. + +compare to openssl which uses an algorithm where its execution time doesn't depend on the data being processed: +https://github.com/openssl/openssl/commit/0051746e03c65f5970d8ca424579d50f58a877e0 + +I initially reported this as a security issue, but was told that since it's only used by TCG, it isn't a security issue, since TCG isn't considered secure. diff --git a/results/classifier/gemma3:12b/performance/2985 b/results/classifier/gemma3:12b/performance/2985 new file mode 100644 index 00000000..94f855b1 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/2985 @@ -0,0 +1,9 @@ + +throttle group limit feature request for discard +Additional information: +- Need to add particular options in [ThrottleGroupProperties](https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#object-QMP-block-core.ThrottleGroupProperties) which like this +```txt +x-discard-iops-total +x-discard-iops-total-max +.... 
+``` diff --git a/results/classifier/gemma3:12b/performance/334 b/results/classifier/gemma3:12b/performance/334 new file mode 100644 index 00000000..307fbf04 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/334 @@ -0,0 +1,2 @@ + +macOS App Nap feature gradually freezes QEMU process diff --git a/results/classifier/gemma3:12b/performance/353 b/results/classifier/gemma3:12b/performance/353 new file mode 100644 index 00000000..88c49b03 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/353 @@ -0,0 +1,2 @@ + +video capture, slowness diff --git a/results/classifier/gemma3:12b/performance/393 b/results/classifier/gemma3:12b/performance/393 new file mode 100644 index 00000000..d9077d2e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/393 @@ -0,0 +1,2 @@ + +tests/vm: Warn when cross-build VM is run with TCG accelerator diff --git a/results/classifier/gemma3:12b/performance/404 b/results/classifier/gemma3:12b/performance/404 new file mode 100644 index 00000000..729de88a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/404 @@ -0,0 +1,2 @@ + +Windows XP takes much longer to boot in TCG mode since 5.0 diff --git a/results/classifier/gemma3:12b/performance/435 b/results/classifier/gemma3:12b/performance/435 new file mode 100644 index 00000000..78b940d6 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/435 @@ -0,0 +1,2 @@ + +RISC-V: Support more cores diff --git a/results/classifier/gemma3:12b/performance/466 b/results/classifier/gemma3:12b/performance/466 new file mode 100644 index 00000000..57dab00c --- /dev/null +++ b/results/classifier/gemma3:12b/performance/466 @@ -0,0 +1,2 @@ + +3x 100% host CPU core usage while virtual machine is in idle diff --git a/results/classifier/gemma3:12b/performance/498417 b/results/classifier/gemma3:12b/performance/498417 new file mode 100644 index 00000000..586c5db6 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/498417 @@ -0,0 +1,15 @@ + +cache=writeback on disk image 
doesn't do write-back + +I noticed that qemu seems to have poor disk performance. Here's a test that has miserable performance but which should be really fast: + +- Configure qemu to use the disk image with cache=writeback +- Configure a 2GiB Linux VM on an 8GiB Linux host +- In the VM, write a 4GiB file (dd if=/dev/zero of=/tmp/x bs=4K count=1M) +- In the VM, read it back (dd if=/tmp/x of=/dev/null bs=4K count=1M) + +With writeback, the whole file should end up in the host pagecache. So when I read it back, there should be no I/O to the real disk, and it should be really fast. Instead, I see disk activity through the duration of the test, and the performance is roughly the native hard disk throughput (somewhat slower). + +I'm using version 0.11.1, and this is my command line: + +qemu-system-x86_64 -drive cache=writeback,index=0,media=disk,file=ubuntu.img -k en-us -m 2048 -smp 2 -vnc :3100 -usbdevice tablet -enable-kvm & \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/524447 b/results/classifier/gemma3:12b/performance/524447 new file mode 100644 index 00000000..7302300b --- /dev/null +++ b/results/classifier/gemma3:12b/performance/524447 @@ -0,0 +1,10 @@ + +virsh save is very slow + +As reported here: http://www.redhat.com/archives/libvir-list/2009-December/msg00203.html + +"virsh save" is very slow - it writes the image at around 1MB/sec on my test system. + +(I think I saw a bug report for this issue on Fedora's bugzilla, but I can't find it now...) + +Confirmed under Karmic. 
\ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/547 b/results/classifier/gemma3:12b/performance/547 new file mode 100644 index 00000000..c006764a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/547 @@ -0,0 +1,2 @@ + +e1000: Loop blocking QEMU with high CPU usage diff --git a/results/classifier/gemma3:12b/performance/568445 b/results/classifier/gemma3:12b/performance/568445 new file mode 100644 index 00000000..c899dc94 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/568445 @@ -0,0 +1,12 @@ + +LVM backed drives should default to cache='none' + +Binary package hint: virt-manager + +KVM guests using LVM backed drives appear to experience fairly high iowait times on the host system if the guest has even a moderate amount of disk I/O. This translates to poor performance for the host and all guests running on the host, and appears to be due to caching, as KVM defaults to writethrough caching when nothing is specified. Explicitly disabling KVM's caching appears to result in significantly better host and guest performance. + +This is recommended in at least a few places: +http://<email address hidden>/msg17492.html +http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/48471 +http://<email address hidden>/msg30425.html +http://virt.kernelnewbies.org/XenVsKVM \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/589231 b/results/classifier/gemma3:12b/performance/589231 new file mode 100644 index 00000000..eae5c039 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/589231 @@ -0,0 +1,10 @@ + +cirrus vga is very slow in qemu-kvm-0.12 + +As has been reported multiple times (*), there was a regression in qemu-kvm from 0.11 to 0.12 which causes a significant slowdown in cirrus vga emulation. For Windows guests, where the "standard VGA" driver works reasonably well, -vga std is a good workaround. But for e.g.
Linux guests, where the VESA driver is painfully slow on its own, that's not a solution. + +(*) + debian qemu-kvm bug report #574988: http://bugs.debian.org/574988#17 + debian qemu bugreport (might be related): http://bugs.debian.org/575720 + kvm mailinglist thread: http://<email address hidden>/msg33459.html + another kvm ml thread: http://<email address hidden>/msg32744.html \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/601946 b/results/classifier/gemma3:12b/performance/601946 new file mode 100644 index 00000000..b49c5b4f --- /dev/null +++ b/results/classifier/gemma3:12b/performance/601946 @@ -0,0 +1,7 @@ + +[Feature request] qemu-img multi-threaded compressed image conversion + +Feature request: +qemu-img multi-threaded compressed image conversion + +Suppose I want to convert a raw image to compressed qcow2. Multi-threaded conversion would be much faster, because the bottleneck is compressing the data. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/603872 b/results/classifier/gemma3:12b/performance/603872 new file mode 100644 index 00000000..93e89d76 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/603872 @@ -0,0 +1,4 @@ + +[Feature request] qemu-img image conversion does not show percentage + +It would be nice if qemu-img could show the percentage of completion, the average conversion speed, and the compression ratio (if converting to compressed qcow or qcow2) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/612 b/results/classifier/gemma3:12b/performance/612 new file mode 100644 index 00000000..498ab703 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/612 @@ -0,0 +1,2 @@ + +Much larger traces with qemu-6.1 than qemu-6.0 diff --git a/results/classifier/gemma3:12b/performance/642 b/results/classifier/gemma3:12b/performance/642 new file mode 100644 index 00000000..e66a891a --- /dev/null +++ b/results/classifier/gemma3:12b/performance/642 @@ -0,0
+1,5 @@ + +Slow QEMU I/O on macOS host +Description of problem: +QEMU on a macOS host gives very low I/O speed. Tested with the fio tool and compared to a Linux host. +Tested on QEMU v6.1.0 and the recent master diff --git a/results/classifier/gemma3:12b/performance/672 b/results/classifier/gemma3:12b/performance/672 new file mode 100644 index 00000000..c690f7b3 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/672 @@ -0,0 +1,4 @@ + +Slow emulation of mac99 (PowerPC G4) due to being single-threaded. +Additional information: +None diff --git a/results/classifier/gemma3:12b/performance/693 b/results/classifier/gemma3:12b/performance/693 new file mode 100644 index 00000000..b90e799e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/693 @@ -0,0 +1,11 @@ + +Qemu increased memory usage with TCG +Description of problem: +The issue is that instances that are supposed to use only a small amount of memory (like 256MB) suddenly use a much higher amount of RSS when running with accel=tcg, around 512MB in the above example. This was not happening with qemu-4.2 (on Ubuntu 20.04). This is also not happening when using accel=kvm instead. The issue was first noticed on Debian 11 (Bullseye) with the versions above, but it is happening in the same way on Centos 8 Stream, Ubuntu 21.10 and a pre-release version of Ubuntu 22.04. It is also seen when testing with qemu-6.1 built from source. +Steps to reproduce: +1. Deploy devstack (https://opendev.org/openstack/devstack) with VIRT_TYPE=qemu on a VM +2. Start an instance with a cirros image and a flavor allocating 256MB +3. Do a ps and see an RSS size of about 512MB being used after the instance has finished booting +4. Expected result (seen with qemu-4.2 or VIRT_TYPE=kvm): RSS stays < 256MB +Additional information: +I can try to find a smaller commandline for manual reproduction if needed. The above sample is generated by OpenStack Nova via libvirt.
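The RSS check in step 3 of the report above (bug 693) can be scripted instead of eyeballing `ps` output. A minimal sketch, assuming a Linux host where the kernel exposes `VmRSS` in `/proc/<pid>/status`:

```python
import os

def rss_kib(pid: int) -> int:
    """Return the resident set size of `pid` in KiB.

    Assumes a Linux host: the kernel reports VmRSS in kB in
    /proc/<pid>/status.
    """
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise ValueError(f"no VmRSS entry for pid {pid}")

if __name__ == "__main__":
    # Checks our own process here; for the report, substitute the PID of
    # the qemu-system-x86_64 process and compare against the 256MB flavor.
    print(f"RSS: {rss_kib(os.getpid())} KiB")
```

With a 256MB guest flavor, an RSS around 512MB from this check would reproduce the reported regression.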
diff --git a/results/classifier/gemma3:12b/performance/719 b/results/classifier/gemma3:12b/performance/719 new file mode 100644 index 00000000..8e525eed --- /dev/null +++ b/results/classifier/gemma3:12b/performance/719 @@ -0,0 +1,20 @@ + +live migration's performance with compression enabled is much worse than with compression disabled +Description of problem: + +Steps to reproduce: +1. Run the guests with a 1Gbps network on the source host and destination host with the QEMU command line +2. Run some memory workloads on the guest, for example, ./memtester 1G 1 +3. Set migration parameters in the QEMU monitor. On source and destination, + execute: #migrate_set_capability compress on + Other compression parameters are all default. +4. Run the migrate command, # migrate -d tcp:10.156.208.154:4000 +5. The results: + - without compression: total time: 197366 ms throughput: 937.81 mbps transferred Ram: 22593703 kbytes + - with compression: total time: 281711 ms throughput: 90.24 mbps transferred Ram: 3102898 kbytes + +When compression is enabled, the transferred RAM is reduced a lot, but the throughput drops badly. +The total time of live migration with compression is longer than without compression. +I also tried with 100G network bandwidth; it has the same problem. +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/721793 b/results/classifier/gemma3:12b/performance/721793 new file mode 100644 index 00000000..860a45d7 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/721793 @@ -0,0 +1,23 @@ + +QEMU freezes on startup (100% CPU utilization) + +0.12.5 was the last version of QEMU that runs OK and boots any OS image. + +0.13.0-0.14.0 just freeze, and the only thing I see is a black screen; both of them also use 100% of the CPU. +Both kernels 2.6.35.11 and 2.6.37.1 with and without PAE support.
+ +tested commands: + +W2000: +$ qemu -m 256 -localtime -net nic,model=rtl8139 -net tap -usbdevice host:0e21:0750 /var/opt/vm/w2000.img +W2000: +$ qemu /var/opt/vm/w2000.img +OpenBSD 4.8: +$ qemu -cdrom ~/cd48.iso -boot d empty-qcow2.img + +tried to use `-M pc-0.12` selector, different audio cards (I've found it caused infinite loop on startup once) -- no luck. +tried to use recent seabios from git -- still no luck. + +attached strace log of 0.14.0. + +everything was tested on HP mini 311C with Intel Atom N270. \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/740 b/results/classifier/gemma3:12b/performance/740 new file mode 100644 index 00000000..0682cdda --- /dev/null +++ b/results/classifier/gemma3:12b/performance/740 @@ -0,0 +1,28 @@ + +on single core Raspberry Pi, qemu-system-sparc appears to hang in bios +Description of problem: +I suspect it to be a race condition related to running on the slow single core Raspberry Pi, as I haven't managed to reproduce on x86 even when using taskset to tie qemu to a single core. + +The problem occurs about 4 out of 5 runs on qemu 5.2 (raspbian bullseye) and so far 100% of the time on qemu 6.1. + +About five seconds after start the sparc bios gets as far as `ttya initialized` and then appears to hang indefinitely. + +Instead, it should continue after about 3 more seconds with: +``` +Probing Memory Bank #0 32 Megabytes +Probing Memory Bank #1 Nothing there +Probing Memory Bank #2 Nothing there +Probing Memory Bank #3 Nothing there +``` + +See below for workaround. +Steps to reproduce: +1. Need a single core Raspberry Pi running raspbian, such as Raspberry Pi 1 or Zero +2. Download ss5.bin from https://github.com/andarazoroflove/sparc/raw/master/ss5.bin +3. 
Run the command: +``` +qemu-system-sparc -m 32 -bios ss5.bin -nographic +``` +After about 5 seconds of output it hangs at `ttya initialized` +Additional information: +## diff --git a/results/classifier/gemma3:12b/performance/741887 b/results/classifier/gemma3:12b/performance/741887 new file mode 100644 index 00000000..964768b9 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/741887 @@ -0,0 +1,80 @@ + +virsh snapshot-create too slow (kvm, qcow2, savevm) + +Action +====== +# time virsh snapshot-create 1 + +* Taking snapshot of a running KVM virtual machine + +Result +====== +Domain snapshot 1300983161 created +real 4m46.994s +user 0m0.000s +sys 0m0.010s + +Expected result +=============== +* Snapshot taken after few seconds instead of minutes. + +Environment +=========== +* Ubuntu Natty Narwhal upgraded from Lucid and Meerkat, fully updated. + +* Stock natty packages of libvirt and qemu installed (libvirt-bin 0.8.8-1ubuntu5; libvirt0 0.8.8-1ubuntu5; qemu-common 0.14.0+noroms-0ubuntu3; qemu-kvm 0.14.0+noroms-0ubuntu3). + +* Virtual machine disk format is qcow2 (debian 5 installed) +image: /storage/debian.qcow2 +file format: qcow2 +virtual size: 10G (10737418240 bytes) +disk size: 1.2G +cluster_size: 65536 +Snapshot list: +ID TAG VM SIZE DATE VM CLOCK +1 snap01 48M 2011-03-24 09:46:33 00:00:58.899 +2 1300979368 58M 2011-03-24 11:09:28 00:01:03.589 +3 1300983161 57M 2011-03-24 12:12:41 00:00:51.905 + +* qcow2 disk is stored on ext4 filesystem, without RAID or LVM or any special setup. 
+ +* running guest VM takes about 40M RAM from inside, from outside 576M are given to that machine + +* host has fast dual-core pentium cpu with virtualization support, around 8G of RAM and 7200rpm harddrive (dd from urandom to file gives about 20M/s) + +* running processes: sshd, atd (empty), crond (empty), libvirtd, tmux, bash, rsyslogd, upstart-socket-bridge, udevd, dnsmasq, iotop (python) + +* networking is done by bridging and bonding + + +Detailed description +==================== + +* Under root, the command 'virsh create-snapshot 1' is issued on a booted and running KVM machine with debian inside. + +* After about four minutes, the process is done. + +* 'iotop' shows two 'kvm' processes reading/writing to disk. The first one has IO around 1500 K/s, the second one has around 400 K/s. That takes about three minutes. Then the first process grabs about 3 M/s of IO and suddenly disappears (1-2 sec). Then the second process does about 7.5 M/s of IO for around 1-2 minutes. + +* The snapshot is successfully created and is usable for reverting or extracting. + +* Pretty much the same behaviour occurs when the command 'savevm' is issued directly from the qemu monitor, without using libvirt at all (actually, virsh snapshot-create just calls 'savevm' on the monitor socket). + +* This behaviour was observed on lucid, meerkat, natty and even with a git version of libvirt (f44bfb7fb978c9313ce050a1c4149bf04aa0a670). Also the slowsave packages from https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/524447 gave this issue. + + +Thank you for helping to solve this issue!
+ +ProblemType: Bug +DistroRelease: Ubuntu 11.04 +Package: libvirt-bin 0.8.8-1ubuntu5 +ProcVersionSignature: Ubuntu 2.6.38-7.38-server 2.6.38 +Uname: Linux 2.6.38-7-server x86_64 +Architecture: amd64 +Date: Thu Mar 24 12:19:41 2011 +InstallationMedia: Ubuntu-Server 10.04.2 LTS "Lucid Lynx" - Release amd64 (20110211.1) +ProcEnviron: + LANG=en_US.UTF-8 + SHELL=/bin/bash +SourcePackage: libvirt +UpgradeStatus: No upgrade log present (probably fresh install) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/753916 b/results/classifier/gemma3:12b/performance/753916 new file mode 100644 index 00000000..a01302f2 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/753916 @@ -0,0 +1,5 @@ + +performance bug with SeaBios 0.6.x + +In my tests SeaBIOS 0.5.1 has the best performance (100% faster). +I ran the qemu port on Windows XP (Phenom II X4 945, 4 GB DDR3 RAM) and on Windows XP (Pentium 4, 1 GB DDR RAM) \ No newline at end of file diff --git a/results/classifier/gemma3:12b/performance/765 b/results/classifier/gemma3:12b/performance/765 new file mode 100644 index 00000000..863049bb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/765 @@ -0,0 +1,66 @@ + +Issue with Docker on M1 Mac +Description of problem: +I'm trying to run a docker container using the following command: + +``` +docker run --platform=linux/amd64 --rm uphold/litecoin-core \ + -printtoconsole \ + -regtest=1 \ + -rpcallowip=172.17.0.0/16 \ + -rpcauth='foo:1e72f95158becf7170f3bac8d9224$957a46166672d61d3218c167a223ed5290389e9990cc57397d24c979b4853f8e' +``` + +It should run the docker container; instead it throws the following error: +``` +/entrypoint.sh: assuming arguments for litecoind +/entrypoint.sh: setting data directory to /home/litecoin/.litecoin +runtime: failed to create new OS thread (have 2 already; errno=22) +fatal error: newosproc + +runtime stack: +runtime.throw(0x4cb21f, 0x9) + /usr/local/go/src/runtime/panic.go:566 +0x95
+runtime.newosproc(0xc420028000, 0xc420037fc0) + /usr/local/go/src/runtime/os_linux.go:160 +0x194 +runtime.newm(0x4d6db8, 0x0) + /usr/local/go/src/runtime/proc.go:1572 +0x132 +runtime.main.func1() + /usr/local/go/src/runtime/proc.go:126 +0x36 +runtime.systemstack(0x53ae00) + /usr/local/go/src/runtime/asm_amd64.s:298 +0x79 +runtime.mstart() + /usr/local/go/src/runtime/proc.go:1079 + +goroutine 1 [running]: +runtime.systemstack_switch() + /usr/local/go/src/runtime/asm_amd64.s:252 fp=0xc420022768 sp=0xc420022760 +runtime.main() + /usr/local/go/src/runtime/proc.go:127 +0x6c fp=0xc4200227c0 sp=0xc420022768 +runtime.goexit() + /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4200227c8 sp=0xc4200227c0 +``` +Steps to reproduce: +1. Run the following in a terminal window on a Mac with an M1 chip: +``` +docker run --platform=linux/amd64 --rm uphold/litecoin-core \ + -printtoconsole \ + -regtest=1 \ + -rpcallowip=172.17.0.0/16 \ + -rpcauth='foo:1e72f95158becf7170f3bac8d9224$957a46166672d61d3218c167a223ed5290389e9990cc57397d24c979b4853f8e' +``` +Additional information: +I increased the limits using ``ulimit`` as follows: + +``` +clemens@M1-MacBook-Pro ~ % ulimit -a +-t: cpu time (seconds) unlimited +-f: file size (blocks) unlimited +-d: data seg size (kbytes) unlimited +-s: stack size (kbytes) 8176 +-c: core file size (blocks) 0 +-v: address space (kbytes) unlimited +-l: locked-in-memory size (kbytes) unlimited +-u: processes 5333 +-n: file descriptors 256 +``` diff --git a/results/classifier/gemma3:12b/performance/767 b/results/classifier/gemma3:12b/performance/767 new file mode 100644 index 00000000..fe79c516 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/767 @@ -0,0 +1,4 @@ + +Improve softmmu TLB utilisation by improving tlb_flush usage on PPC64 +Additional information: + diff --git a/results/classifier/gemma3:12b/performance/773 b/results/classifier/gemma3:12b/performance/773 new file mode 100644 index 00000000..950d06ef --- /dev/null +++ 
b/results/classifier/gemma3:12b/performance/773 @@ -0,0 +1,28 @@ + +TCG profiler build fails +Description of problem: +Attempting to build with --enable-profiler fails +Steps to reproduce: +1. ../../configure --enable-profiler +2. make +Additional information: +[975/3221] Compiling C object libcommon.fa.p/monitor_qmp-cmds.c.o + FAILED: libcommon.fa.p/monitor_qmp-cmds.c.o + cc -m64 -mcx16 -Ilibcommon.fa.p -I../../dtc/libfdt -I/usr/include/capstone -I/usr/include/pixman-1 -I/usr/include/spice-server -I/usr/include/spice-1 -I/usr/include/libpng16 + -I/usr/include/p11-kit-1 -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/gio-unix-2.0 -I/us + r/include/slirp -I/usr/include/virgl -I/usr/include/libusb-1.0 -I/usr/include/cacard -I/usr/include/nss -I/usr/include/nspr -I/usr/include/PCSC -I/usr/include/gtk-3.0 -I/usr + /include/at-spi2-atk/2.0 -I/usr/include/at-spi-2.0 -I/usr/include/dbus-1.0 -I/usr/lib/x86_64-linux-gnu/dbus-1.0/include -I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/ + include/fribidi -I/usr/include/harfbuzz -I/usr/include/atk-1.0 -I/usr/include/uuid -I/usr/include/freetype2 -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/vte-2.91 -fdiagnosti + cs-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /home/alex/lsrc/qemu.git/linux-headers -isystem linux-headers -iquote . 
-iquote /home/alex/lsrc/qemu.git + -iquote /home/alex/lsrc/qemu.git/include -iquote /home/alex/lsrc/qemu.git/disas/libvixl -iquote /home/alex/lsrc/qemu.git/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOUR + CE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-co + mmon -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wend + if-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -D_DEFAULT_SOURCE -D_ + XOPEN_SOURCE=600 -DNCURSES_WIDECHAR=1 -D_REENTRANT -DSTRUCT_IOVEC_DEFINED -MD -MQ libcommon.fa.p/monitor_qmp-cmds.c.o -MF libcommon.fa.p/monitor_qmp-cmds.c.o.d -o libcommon. + fa.p/monitor_qmp-cmds.c.o -c ../../monitor/qmp-cmds.c + ../../monitor/qmp-cmds.c: In function ‘qmp_x_query_profile’: + ../../monitor/qmp-cmds.c:369:21: error: implicit declaration of function ‘tcg_cpu_exec_time’ [-Werror=implicit-function-declaration] + 369 | cpu_exec_time = tcg_cpu_exec_time(); + | ^~~~~~~~~~~~~~~~~ + ../../monitor/qmp-cmds.c:369:21: error: nested extern declaration of ‘tcg_cpu_exec_time’ [-Werror=nested-externs] + cc1: all warnings being treated as errors diff --git a/results/classifier/gemma3:12b/performance/80 b/results/classifier/gemma3:12b/performance/80 new file mode 100644 index 00000000..353e58bc --- /dev/null +++ b/results/classifier/gemma3:12b/performance/80 @@ -0,0 +1,2 @@ + +[Feature request] qemu-img multi-threaded compressed image conversion diff --git a/results/classifier/gemma3:12b/performance/815 b/results/classifier/gemma3:12b/performance/815 new file mode 100644 index 00000000..cef8f4bb --- /dev/null +++ b/results/classifier/gemma3:12b/performance/815 @@ -0,0 +1,2 @@ + +Using spdk Vhost to accelerate QEMU, which QEMU version is the 
most appropriate? diff --git a/results/classifier/gemma3:12b/performance/819 b/results/classifier/gemma3:12b/performance/819 new file mode 100644 index 00000000..ff697676 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/819 @@ -0,0 +1,76 @@ + +watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0] +Description of problem: +During virtual disk live move/migration, VMs get severe stuttering and even cpu soft lockups, as described here: + +https://bugzilla.kernel.org/show_bug.cgi?id=199727 + +This also happens on some of our virtual machines when the i/o load inside the VM is high or the workload is fsync centric. + +I'm searching for a solution to mitigate this problem, i.e. I can live with the stuttering/delays of several seconds, but getting cpu soft lockups of 22s or higher is unacceptable. + +I have searched the web for a long time now, but did not find a solution, nor did I find a way to troubleshoot this more in depth to find the real root cause. + +If this issue report will not be accepted because of "non-native qemu" (i.e. the Proxmox platform), please tell me which qemu/distro I can/should use instead (one with an easily usable live migration feature) to try reproducing the problem. +Steps to reproduce: +1. do a live migration of one or more virtual machine disks +2. watch "ioping -WWWYy test.dat" inside the VM (being moved) for disk latency +3. your disk latency varies heavily; from time to time it goes up to values of tens of seconds, even leading to kernel messages like " kernel:[ 2155.520846] watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
[swapper/1:0]" + +``` +4 KiB >>> test.dat (ext4 /dev/sda1): request=55 time=1.07 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=56 time=1.24 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=57 time=567.4 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=58 time=779.0 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=59 time=589.0 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=60 time=1.57 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=61 time=847.7 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=62 time=933.0 ms +4 KiB >>> test.dat (ext4 /dev/sda1): request=63 time=891.4 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=64 time=820.8 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=65 time=1.02 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=66 time=2.44 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=67 time=620.7 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=68 time=1.03 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=69 time=1.24 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=70 time=1.42 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=71 time=1.36 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=72 time=1.41 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=73 time=1.33 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=74 time=2.36 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=75 time=1.46 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=76 time=1.45 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=77 time=1.28 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=78 time=1.41 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=79 time=2.33 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=80 time=1.39 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=81 time=1.35 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=82 time=1.54 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=83 time=1.52 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=84 time=1.50 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=85 time=2.00 s +4 KiB >>> test.dat 
(ext4 /dev/sda1): request=86 time=1.47 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=87 time=1.26 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=88 time=1.29 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=89 time=2.05 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=90 time=1.44 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=91 time=1.43 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=92 time=1.72 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=93 time=1.77 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=94 time=2.56 s + +Message from syslogd@iotest2 at Jan 14 14:51:12 ... + kernel:[ 2155.520846] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0] +4 KiB >>> test.dat (ext4 /dev/sda1): request=95 time=22.5 s (slow) +4 KiB >>> test.dat (ext4 /dev/sda1): request=96 time=3.56 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=97 time=1.52 s (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=98 time=1.69 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=99 time=1.90 s +4 KiB >>> test.dat (ext4 /dev/sda1): request=100 time=1.15 s (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=101 time=890.0 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=102 time=959.6 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=103 time=926.5 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=104 time=791.5 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=105 time=577.8 ms (fast) +4 KiB >>> test.dat (ext4 /dev/sda1): request=106 time=867.7 ms (fast) +``` diff --git a/results/classifier/gemma3:12b/performance/83 b/results/classifier/gemma3:12b/performance/83 new file mode 100644 index 00000000..9cbabe23 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/83 @@ -0,0 +1,2 @@ + +QEMU x87 emulation of trig and other complex ops is only at 64-bit precision, not 80-bit diff --git a/results/classifier/gemma3:12b/performance/852 b/results/classifier/gemma3:12b/performance/852 new file mode 100644 index 00000000..bb191b6e --- /dev/null +++ 
b/results/classifier/gemma3:12b/performance/852 @@ -0,0 +1,32 @@ + +ppc64le: possible SIMD issue casting double to int +Description of problem: +Working with numpy in a ppc64le VM, I ran into a strange double-to-int casting issue, specifically when casting an array of 1.0 values to 1 values. The numpy folks guided me to a small reproducible test case. + +The attached [convert.c](/uploads/2dd7936f4defccf816ffee7c7c002e77/convert.c) creates double and int arrays of length `1 <= n <= 16`. The double array is filled with the value 1.0, and both arrays are passed to a function that converts the values. + +With `-O2`, output is as expected (truncated here): + +``` +i = 1: 1 +i = 2: 1 1 +i = 3: 1 1 1 +i = 4: 1 1 1 1 +i = 5: 1 1 1 1 1 +i = 6: 1 1 1 1 1 1 +``` + +With `-O3`, all values that fit into blocks of four become zero: +``` +i = 1: 1 +i = 2: 1 1 +i = 3: 1 1 1 +i = 4: 0 0 0 0 +i = 5: 0 0 0 0 1 +i = 6: 0 0 0 0 1 1 +``` + +I tested this with executables compiled on a physical ppc64le host, where the issue is not reproducible. +Steps to reproduce: +1. `gcc -O2 -o convert convert.c && ./convert` +2. `gcc -O3 -o convert convert.c && ./convert` diff --git a/results/classifier/gemma3:12b/performance/856 b/results/classifier/gemma3:12b/performance/856 new file mode 100644 index 00000000..f3bfd77e --- /dev/null +++ b/results/classifier/gemma3:12b/performance/856 @@ -0,0 +1,62 @@ + +Occasional deadlock in linux-user (sh4) when running threadcount test +Description of problem: + +Steps to reproduce: +1. docker run --rm -it -u (id -u) -v $HOME:$HOME -w (pwd) qemu/debian-all-test-cross /bin/bash +2. '../../configure' '--cc=clang' '--cxx=clang++' '--disable-system' '--target-list-exclude=microblazeel-linux-user,aarch64_be-linux-user,i386-linux-user,m68k-linux-user,mipsn32el-linux-user,xtensaeb-linux-user' '--extra-cflags=-fsanitize=undefined' '--extra-cflags=-fno-sanitize-recover=undefined' +3. make; make build-tcg +4.
retry.py -n 400 -c -- timeout --foreground 90 ./qemu-sh4 -plugin ./tests/plugin/libinsn.so -d plugin ./tests/tcg/sh4-linux-user/threadcount + +Failure rate on hackbox: + +``` +Results summary: +0: 397 times (99.25%), avg time 0.686 (0.00 varience/0.01 deviation) +124: 3 times (0.75%), avg time 90.559 (0.00 varience/0.01 deviation) +``` + +It seems to fail more frequently on Gitlabs CI +Additional information: +Without the timeout you end up with a deadlock. The following backtrace was found, stepping in gdb unwedges the hang: + +``` +(gdb) info threads + Id Target Id Frame +* 1 LWP 15894 "qemu-sh4" safe_syscall_base () at ../../common-user/host/x86_64/safe-syscall.inc.S:75 + 2 LWP 15994 "qemu-sh4" 0x00007f956b800f59 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6 + 3 LWP 15997 "qemu-sh4" safe_syscall_base () at ../../common-user/host/x86_64/safe-syscall.inc.S:75 +(gdb) bt +#0 safe_syscall_base () at ../../common-user/host/x86_64/safe-syscall.inc.S:75 +#1 0x0000560ee17196e4 in safe_futex (uaddr=0x58e8, op=-513652411, val=<optimized out>, timeout=0xf0, uaddr2=<optimized out>, val3=582) at ../../linux-user/syscall.c:681 +#2 do_safe_futex (uaddr=0x58e8, op=-513652411, val=<optimized out>, timeout=0xf0, uaddr2=<optimized out>, val3=582) at ../../linux-user/syscall.c:7757 +#3 0x0000560ee170c8d9 in do_syscall1 (cpu_env=<optimized out>, num=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=22760, arg4=<optimized out>, arg5=<optimized out>, arg6=240, arg7=0, arg8=0) at /home/alex.bennee/lsrc/qemu.git/include/exec/cpu_ldst.h:90 +#4 0x0000560ee170220c in do_syscall (cpu_env=<optimized out>, num=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=<optimized out>, arg5=<optimized out>, arg6=<optimized out>, arg7=<optimized out>, arg8=<optimized out>) at ../../linux-user/syscall.c:13239 +#5 0x0000560ee1626111 in cpu_loop (env=0x560ee294b028) at ../../linux-user/sh4/cpu_loop.c:43 +#6 0x0000560ee16ee37d in main 
(argc=-493657104, argv=0x7ffdcaf52028, envp=<optimized out>) at ../../linux-user/main.c:883 +(gdb) thread 2 +[Switching to thread 2 (LWP 15994)] +#0 0x00007f956b800f59 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6 +(gdb) bt +#0 0x00007f956b800f59 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6 +#1 0x0000560ee1847bd6 in qemu_futex_wait (f=<optimized out>, val=<optimized out>) at /home/alex.bennee/lsrc/qemu.git/include/qemu/futex.h:29 +#2 qemu_event_wait (ev=0x560ee2738974 <rcu_call_ready_event>) at ../../util/qemu-thread-posix.c:481 +#3 0x0000560ee18539a2 in call_rcu_thread (opaque=<optimized out>) at ../../util/rcu.c:261 +#4 0x0000560ee1847f17 in qemu_thread_start (args=0x560ee2933eb0) at ../../util/qemu-thread-posix.c:556 +#5 0x00007f956b8f6fa3 in start_thread () from target:/lib/x86_64-linux-gnu/libpthread.so.0 +#6 0x00007f956b8064cf in clone () from target:/lib/x86_64-linux-gnu/libc.so.6 +(gdb) thread 3 +[Switching to thread 3 (LWP 15997)] +#0 safe_syscall_base () at ../../common-user/host/x86_64/safe-syscall.inc.S:75 +75 cmp $-4095, %rax +(gdb) bt +#0 safe_syscall_base () at ../../common-user/host/x86_64/safe-syscall.inc.S:75 +#1 0x0000560ee17196e4 in safe_futex (uaddr=0x2, op=-513652411, val=<optimized out>, timeout=0x3f7fcdc4, uaddr2=<optimized out>, val3=582) at ../../linux-user/syscall.c:681 +#2 do_safe_futex (uaddr=0x2, op=-513652411, val=<optimized out>, timeout=0x3f7fcdc4, uaddr2=<optimized out>, val3=582) at ../../linux-user/syscall.c:7757 +#3 0x0000560ee170c8d9 in do_syscall1 (cpu_env=<optimized out>, num=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=2, arg4=<optimized out>, arg5=<optimized out>, arg6=1065340356, arg7=0, arg8=0) at /home/alex.bennee/lsrc/qemu.git/include/exec/cpu_ldst.h:90 +#4 0x0000560ee170220c in do_syscall (cpu_env=<optimized out>, num=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=<optimized out>, arg5=<optimized out>, arg6=<optimized out>, 
arg7=<optimized out>, arg8=<optimized out>) at ../../linux-user/syscall.c:13239 +#5 0x0000560ee1626111 in cpu_loop (env=0x560ee2a2c2d8) at ../../linux-user/sh4/cpu_loop.c:43 +#6 0x0000560ee171728f in clone_func (arg=<optimized out>) at ../../linux-user/syscall.c:6608 +#7 0x00007f956b8f6fa3 in start_thread () from target:/lib/x86_64-linux-gnu/libpthread.so.0 +#8 0x00007f956b8064cf in clone () from target:/lib/x86_64-linux-gnu/libc.so.6 +``` diff --git a/results/classifier/gemma3:12b/performance/861 b/results/classifier/gemma3:12b/performance/861 new file mode 100644 index 00000000..a51da8db --- /dev/null +++ b/results/classifier/gemma3:12b/performance/861 @@ -0,0 +1,2 @@ + +Using qemu+kvm is slower than using qemu in rv6(xv6 rust porting) diff --git a/results/classifier/gemma3:12b/performance/862 b/results/classifier/gemma3:12b/performance/862 new file mode 100644 index 00000000..5b9ca236 --- /dev/null +++ b/results/classifier/gemma3:12b/performance/862 @@ -0,0 +1,50 @@ + +Using qemu+kvm is slower than using qemu in rv6(xv6 rust porting) +Description of problem: +Using qemu+kvm is slower than using qemu in rv6(xv6 rust porting) +Steps to reproduce: +``` +git clone https://github.com/kaist-cp/rv6 +cd rv6 +make clean +TARGET=arm GIC_VERSION=3 KVM=yes make qemu +``` +Additional information: +We are currently working on the [rv6 project](https://github.com/kaist-cp/rv6) which is porting MIT's educational operating system [xv6](https://github.com/mit-pdos/xv6-public) to Rust.<br> Our code is located [here](https://github.com/kaist-cp/rv6/tree/main/kernel-rs). +We use qemu and [qemu's virt platform](https://qemu.readthedocs.io/en/latest/system/arm/virt.html) to execute rv6, and it works well with using qemu. 
+The command we run on the arm machine is:
+```
+RUST_MODE=release TARGET=arm KVM=yes GIC_VERSION=3
+qemu-system-aarch64 -machine virt -kernel kernel/kernel -m 128M -smp 80 -nographic -drive file=fs.img,if=none,format=raw,id=x0,copy-on-read=off -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -cpu cortex-a53 -machine gic-version=3 -net none
+```
+To experiment with a KVM speed boost, we made rv6 support the arm architecture on an arm machine. The arm architecture's driver code is located [here](https://github.com/kaist-cp/rv6/tree/main/kernel-rs/src/arch/arm).
+The problem is that when we use qemu with kvm, performance is significantly reduced.
+The command with KVM enabled on the arm machine is:
+```
+qemu-system-aarch64 -machine virt -kernel kernel/kernel -m 128M -smp 80 -nographic -drive file=fs.img,if=none,format=raw,id=x0,copy-on-read=off -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -cpu host -enable-kvm -machine gic-version=3 -net none
+```
+We repeated the following benchmarks:
+1. Write-500-bytes syscall, 10,000 times: kvm disabled: 4,500,000 us, kvm enabled: 29,000,000 us (> 5 times slower).
+2. Open/close syscall, 10,000 times: kvm disabled: 12,000,000 us, kvm enabled: 29,000,000 us (> 2 times slower).
+3. Getppid syscall, 10,000 times: kvm disabled: 735,000 us, kvm enabled: 825,000 us (almost the same).
+4. Simple calculation (a = a * 1664525 + 1013904223), 100 million times: kvm disabled: 2,800,000 us, kvm enabled: 65,000,000 us (> 20 times slower).
+
+The elapsed time was measured with rv6's [uptime_as_micro](https://github.com/kaist-cp/rv6/blob/90b84b60931327ae8635875b788b10280e47b99c/kernel-rs/src/arch/arm/timer.rs#L17) syscall.
+These results were hard to understand. <br>So we first tried to find the bottleneck in rv6's boot process, because finding the bottleneck while a user program runs was much harder.
+We found that the first noticeable bottleneck in rv6's boot process was [here](https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/kalloc.rs#L107-L108):
+```
+run.as_mut().init();
+self.runs().push_front(run.as_ref());
+```
+As far as we know, this part is just a "list initialization and push element" step. So we suspected that, for some reason, KVM is not actually working, which makes the result worse. This part also runs before any interrupts are turned on, so we thought [arm's GIC](https://developer.arm.com/documentation/dai0492/b/) and other interrupt-related machinery are unrelated to the problem.
+
+So, how can we get better performance when using kvm with qemu?
+
+To solve this problem, we already tried the following:
+1. Changing the qemu version (4.2, 6.2) and virt version, and changing [qemu-kvm options](https://linux.die.net/man/1/qemu-kvm) such as cpu, drive cache, copy-on-read, kernel_irqchip, cpu cores, etc.
+2. Looking for a kvm hypercall to use - but none exists on arm64.
+3. Running [lmbench](http://lmbench.sourceforge.net/) on Ubuntu under qemu with kvm to check that KVM itself is okay. - We found that KVM with Ubuntu is much faster than qemu alone.
+4. Checking whether the [16550a UART print code](https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/arch/arm/uart.rs) is really slow with KVM enabled, which would skew the benchmark results. - Without the bottleneck code, rv6 boot times were almost the same whether KVM was enabled or not.
+5. Looking for other people in the same situation - but [this superuser page](https://superuser.com/questions/1317948/qemu-enable-kvm-slower-than-pure-emulation-for-x86-64) did not help. Our clocksource is arch_sys_counter.
+
+Thank you for your help.
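The per-syscall timings quoted in the report above all follow the same pattern: timestamp, run the call N times, timestamp again, divide by N. As a purely illustrative sketch (not from the report, which measures inside the guest via rv6's uptime_as_micro syscall), the same method looks like this in Python using getppid:

```python
import os
import time

def mean_ns_per_call(n: int = 100_000) -> float:
    """Time n getppid() calls and return the mean cost in nanoseconds.

    Mirrors the report's method: one timestamp before the loop, one
    after, divided by the iteration count. In CPython the interpreter's
    call overhead dominates, so this illustrates the measurement
    technique rather than raw syscall latency.
    """
    t0 = time.perf_counter_ns()
    for _ in range(n):
        os.getppid()
    t1 = time.perf_counter_ns()
    return (t1 - t0) / n
```

Run inside a guest with and without `-enable-kvm`, the ratio between the two numbers is the quantity the report's experiments 1-4 compare; only the relative change is meaningful.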
diff --git a/results/classifier/gemma3:12b/performance/866 b/results/classifier/gemma3:12b/performance/866
new file mode 100644
index 00000000..9bcdd806
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/866
@@ -0,0 +1,54 @@
+
+linux-user: substantial memory leak when threads are created and destroyed
+Description of problem:
+Substantial memory leak when the following simple program is executed on `qemu-arm`:
+```c
+// compile with `arm-none-linux-gnueabihf-gcc test_qemu.c -o test_qemu.out -pthread`
+
+#include <assert.h>
+#include <pthread.h>
+
+#define MAGIC_RETURN ((void *)42)
+
+void *thread_main(void *arg)
+{
+    return MAGIC_RETURN;
+}
+
+int main(int argc, char *argv[])
+{
+    size_t i;
+    for (i = 0;; i++)
+    {
+        pthread_t thread;
+        assert(pthread_create(&thread, NULL, thread_main, NULL) == 0);
+        void *ret;
+        assert(pthread_join(thread, &ret) == 0);
+        assert(ret == MAGIC_RETURN);
+    }
+
+    return 0;
+}
+```
+Steps to reproduce:
+1.
+```
+export TOOLCHAIN_PREFIX=arm-none-linux-gnueabihf
+export ARMSDK=/${TOOLCHAIN_PREFIX}
+export SYSROOT=${ARMSDK}/${TOOLCHAIN_PREFIX}/libc
+export CC=${ARMSDK}/bin/${TOOLCHAIN_PREFIX}-gcc
+```
+2. Download the arm toolchain: `curl --output ${TOOLCHAIN_PREFIX}.tar.xz -L 'https://developer.arm.com/-/media/Files/downloads/gnu-a/10.2-2020.11/binrel/gcc-arm-10.2-2020.11-x86_64-arm-none-linux-gnueabihf.tar.xz?revision=d0b90559-3960-4e4b-9297-7ddbc3e52783&la=en&hash=985078B758BC782BC338DB947347107FBCF8EF6B'`
+3. `mkdir -p ${ARMSDK} && tar xf ${TOOLCHAIN_PREFIX}.tar.xz -C ${ARMSDK} --strip-components=1`
+4. `$CC test_qemu.c -o test_qemu.out -pthread`
+5. `qemu-arm -L $SYSROOT ./test_qemu.out`
+6. Observe that memory usage keeps ramping up until the process crashes once it is out of memory.
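A bounded variant of the create/join loop, paired with a peak-RSS readout, makes the leak measurable instead of waiting for the OOM crash. The Python sketch below is only an illustration of that pattern (Python's own threads do not leak; to observe the actual bug you would run the C program above under qemu-arm and watch the emulator's RSS grow between samples):

```python
import resource
import threading

def peak_rss_kb() -> int:
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS),
    # so only compare samples taken on the same platform.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def churn_threads(iterations: int = 200) -> list:
    """Bounded version of the reporter's infinite create/join loop.

    Each thread records a magic value so the join can be verified,
    mirroring the MAGIC_RETURN check in the C program.
    """
    results = []
    for _ in range(iterations):
        t = threading.Thread(target=lambda: results.append(42))
        t.start()
        t.join()
    return results
```

Sampling `peak_rss_kb()` before and after each `churn_threads()` round quantifies growth per thread created, which is the number a fix for this report would drive to zero.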
+Additional information:
+Valgrind annotation logs [annot.log](/uploads/f8d05d8f216d5a589e8da0758a345de6/annot.log) generated by a local build on master@0a301624c2f4ced3331ffd5bce85b4274fe132af from
+```bash
+valgrind --xtree-memory=full --xtree-memory-file=xtmemory.kcg bin/debug/native/qemu-arm -L $SYSROOT /mnt/f/test_qemu3.out
+# Send CTRL-C before the process crashes due to oom
+callgrind_annotate --auto=yes --inclusive=yes --sort=curB:100,curBk:100,totB:100,totBk:100,totFdB:100,totFdBk:100 xtmemory.kcg > annot.log
+```
+
+#
diff --git a/results/classifier/gemma3:12b/performance/874 b/results/classifier/gemma3:12b/performance/874
new file mode 100644
index 00000000..958d12d7
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/874
@@ -0,0 +1,2 @@
+
+New Python QMP library races on NetBSD
diff --git a/results/classifier/gemma3:12b/performance/878019 b/results/classifier/gemma3:12b/performance/878019
new file mode 100644
index 00000000..ccd39c53
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/878019
@@ -0,0 +1,11 @@
+
+0.15.1 black screen and 100% cpu on start
+
+Trying the freshly compiled 0.15.1 (command line: "qemu"), the window stays black, it uses 100% cpu, and can't be killed with ctrl-c; it has to be killed with killall -9.
+
+Strace shows it's waiting on a futex forever?
+
+Build config:
+./configure --prefix=/usr/local --interp-prefix=/usr/local/share/qemu \
+--enable-mixemu --disable-brlapi --enable-io-thread --audio-drv-list="oss alsa sdl" \
+--disable-opengl
\ No newline at end of file
diff --git a/results/classifier/gemma3:12b/performance/897 b/results/classifier/gemma3:12b/performance/897
new file mode 100644
index 00000000..7e14876e
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/897
@@ -0,0 +1,2 @@
+
+Warning with "qemu-s390x -cpu max"
diff --git a/results/classifier/gemma3:12b/performance/919 b/results/classifier/gemma3:12b/performance/919
new file mode 100644
index 00000000..517d3754
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/919
@@ -0,0 +1,6 @@
+
+Slow in Windows
+Description of problem:
+E.g. Win8.1 in QEMU on Windows is very slow, and other OSes are also very slow.
+Steps to reproduce:
+Just run a qemu instance.
diff --git a/results/classifier/gemma3:12b/performance/950692 b/results/classifier/gemma3:12b/performance/950692
new file mode 100644
index 00000000..822ed2ed
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/950692
@@ -0,0 +1,77 @@
+
+High CPU usage in Host (revisited)
+
+Hi,
+
+the last time QEMU (KVM) worked flawlessly for us was with the 2.6.35 kernel.
+
+Actually it still works flawlessly on that one single machine that still has this kernel. The qemu version there is meanwhile 1.0-r3, so the problem seems to depend on the kernel version and not the qemu version.
+
+We have several other machines where the "high CPU usage in host" problem is present in various degrees of annoyingness.
+
+Both host and guest are Gentoo linux, at least that's what we test with. Several tested systems with other linux distributions and FreeBSD show similar - if not worse - behaviour.
+I will talk about 3 hosts: machine A, machine B and machine C.
+
+A:
+
+2.6.35-gentoo-r9 #2 SMP Sat Nov 6 22:32:28 CET 2010 x86_64 Intel(R) Xeon(R) CPU L5410 @ 2.33GHz
+32GB, runs about 15 KVM guests (all Gentoo, some 32bit, some 64bit, all SMP)
+no problems whatsoever, host CPU usage corresponds to guest CPU usage + 1-2%, that's how we like it
+qemu 1.0-r3
+
+B:
+
+3.0.6-gentoo #1 SMP Sun Oct 16 18:57:31 CEST 2011 x86_64 Intel(R) Xeon(R) CPU L5630 @ 2.13GHz
+144GB, runs 1(!) KVM guest (Debian 6.x)
+/usr/bin/qemu-system-x86_64 --enable-kvm -daemonize -cpu host -k de -net tap -tdf -hda /data/vm/disk.raw -m 768 -smp 1 -vnc :5 -net nic,model=e1000,macaddr=...
+100% host CPU load always, therefore it got only "smp 1"; if we gave it smp 2, it would have 200%, smp 4 400% and so on.
+qemu 1.0-r3
+
+C:
+
+3.1.6-gentoo #5 SMP Tue Mar 6 20:34:44 CET 2012 x86_64 Intel(R) Xeon(R) CPU 5148 @ 2.33GHz
+16GB, runs 1-4 KVM guests (mostly Gentoo machines from A, plus some SuSE, RedHat etc.)
+X00% CPU usage, where X corresponds to the smp X parameter, at startup as well as when someone "touches" the VM, like logging in or doing an "ls". If the machine is ABSOLUTELY IDLE, the process also exhibits 1-2% CPU load in the host, but as soon as you do a simple ls, usage goes to - say - 400%, where it remains for some seconds, then slowly falls to 280%, 120%, 60%, ... back to 1-2%.
+qemu 1.0-r3
+
+
+B is a no-go, C tries to behave well but ultimately fails, A is golden.
+
+There seems to be REAL high CPU usage and not just an error in displaying it. Other processes get less CPU power and definitely exhibit slower runtimes. On B, one CPU core is definitely hogged all the time.
+
+
+Some years ago we experienced something similar with ~2.6.26 and, after a long and woeful period, we found out that compiling the host kernel as a tickless system caused the problem. Enabling high resolution timers made the problem go away, and that is the situation on machine A until today.
+Since then no one has dared to touch this production server. Unfortunately, this recipe didn't help with the other machines.
+
+I have scanned the net for similar problems and there are people complaining about high CPU usage. Unfortunately, very often the devs or maintainers cannot reproduce it and the issue is dropped. Well - we cannot reproduce the "good behaviour"(tm) on any but one machine with any recent (read: post-2.6.35) linux kernel.
+
+Summary of what we tried so far:
+
+* different linux kernels @ host, and @ guest
+
+-> no difference; in particular there are guests @ A that run newer kernels, and there are guests at B and C that run older kernels than the host kernel
+
+* smp and non-smp, 32bit and 64bit guests
+
+-> 32/64bit in the guest makes no difference whatsoever. The smp setting just limits how much of the host CPU the guest hogs on non-well-behaving systems (smp X -> X * 100%)
+
+* various linux guest OSes, as well as FreeBSD
+
+-> no difference whatsoever
+
+* various option parameters in the host kernel (other schedulers, HRT, tickless, ...)
+
+-> no difference whatsoever
+
+* various versions of qemu/kvm since 0.13
+
+-> no difference whatsoever
+
+* various qemu/kvm options, virtio and non-virtio configurations (most of the VMs @ A run blk-virtio but emulate an e1000)
+
+-> no difference whatsoever
+
+
+You could say we've reached wits' end. We could try 2.6.35 @ machine C with the same configuration as A (they are identical except CPU and RAM size, but same RAID, mainboard, etc.; plus A once had also the 5148 Xeons and an upgrade luckily made no difference in good behaviour, so I would exclude the CPU factor), but honestly that is not the way I'd like to go. The goal is to update A to something recent and not to lose its well-behaved VM hosting. Ideally to propagate this good behaviour to the other machines.
+
+
+Arjan Minski
+ PetaMem IT
\ No newline at end of file
diff --git a/results/classifier/gemma3:12b/performance/959 b/results/classifier/gemma3:12b/performance/959
new file mode 100644
index 00000000..18de7d65
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/959
@@ -0,0 +1,10 @@
+
+100% CPU utilization when the guest is idle (FreeBSD on M1 Mac)
+Description of problem:
+100% CPU utilization when the guest is idle.
+Steps to reproduce:
+1. Download the FreeBSD qcow2 image and decompress it: https://download.freebsd.org/releases/VM-IMAGES/13.0-RELEASE/aarch64/Latest/
+2. Execute the above command.
+3. The QEMU process consumes 100% CPU.
+4.
+
diff --git a/results/classifier/gemma3:12b/performance/965867 b/results/classifier/gemma3:12b/performance/965867
new file mode 100644
index 00000000..c5fd5e71
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/965867
@@ -0,0 +1,49 @@
+
+9p virtual file system on qemu slow
+
+Hi,
+The 9p virtual file system is slow. Several examples below:
+---------------------------------------------------------
+Host for the first time:
+$ time ls bam.unsorted/
+...........................
+real 0m0.084s
+user 0m0.000s
+sys 0m0.028s
+--------------------------------------------------
+Host second and following:
+
+real 0m0.009s
+user 0m0.000s
+sys 0m0.008s
+------------------------------------------------------
+VM for the first time:
+$ time ls bam.unsorted/
+................................
+real 0m23.141s
+user 0m0.064s
+sys 0m2.156s
+------------------------------------------------------
+VM for the second time:
+real 0m3.643s
+user 0m0.024s
+sys 0m0.424s
+----------------------------------------------------
+
+Copy on host:
+$ time cp 5173T.root.bak test.tmp
+real 0m30.346s
+user 0m0.004s
+sys 0m5.324s
+
+$ ls -lahs test.tmp
+2.7G -rw------- 1 oneadmin cloud 2.7G Mar 26 21:47 test.tmp
+
+---------------------------------------------------
+Copy on VM for the same file:
+
+$ time cp 5173T.root.bak test.tmp
+
+real 5m46.978s
+user 0m0.352s
+sys 1m38.962s
\ No newline at end of file
diff --git a/results/classifier/gemma3:12b/performance/966 b/results/classifier/gemma3:12b/performance/966
new file mode 100644
index 00000000..592c2b92
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/966
@@ -0,0 +1,59 @@
+
+Simple code line is much slower on rv6 (Rust OS) than on Ubuntu
+Description of problem:
+A [simple code line for getppid](https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/proc/procs.rs#L470) takes a long time (about 0.08 microsec, which is about 70% of the time of Ubuntu's whole getppid() syscall) in the kernel. So we wonder if there is a problem with the qemu or kvm side settings.
+Steps to reproduce:
+```
+git clone https://github.com/kaist-cp/rv6
+cd rv6
+make clean
+RUST_MODE=release TARGET=arm GIC_VERSION=3 KVM=yes make qemu
+```
+Additional information:
+We are currently working on the [rv6 project](https://github.com/kaist-cp/rv6), which ports MIT's educational operating system [xv6](https://github.com/mit-pdos/xv6-public) to Rust.<br> Our code is located [here](https://github.com/kaist-cp/rv6/tree/main/kernel-rs).
+We use qemu and [qemu's virt platform](https://qemu.readthedocs.io/en/latest/system/arm/virt.html) to execute rv6, and it works well with plain qemu.
+The command we run on the arm machine is:
+```
+RUST_MODE=release TARGET=arm KVM=yes GIC_VERSION=3; # compile
+qemu-system-aarch64 -machine virt -kernel kernel/kernel -m 128M -smp 1 -nographic -drive file=fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -cpu host -enable-kvm -machine gic-version=3
+```
+Now, we are comparing the speed (more precisely, the elapsed wall clock time) of system calls between qemu+rv6+kvm and qemu+ubuntu 18.04+kvm with [lmbench](http://lmbench.sourceforge.net/).
+For ubuntu, the qemu command is:
+```
+qemu-system-aarch64 -cpu host -enable-kvm -device rtl8139,netdev=net0 -device virtio-scsi-device -device scsi-cd,drive=cdrom -device virtio-blk-device,drive=hd0 -drive "file=${iso},id=cdrom,if=none,media=cdrom" -drive "if=none,file=${img_snapshot},id=hd0" -m 2G -machine "virt,gic-version=3,its=off" -netdev user,id=net0 -nographic -pflash "$flash0" -pflash "$flash1" -smp 1
+```
+Our goal is to make rv6 perform similarly to or faster than ubuntu for relatively simple system calls like getppid(). <br>
+By "relatively simple system call" we mean that, as in the case of getppid(), the actual system call execution part is trivial, so the benchmark mainly measures the user space -> kernel space -> user space round trip. <br>
+We thought that for the getppid() syscall, rv6 could show similar or better performance than ubuntu because it is such a simple system.<br>
+**The most important problem** is that, although it will be described later, a [simple code line for getppid](https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/proc/procs.rs#L470) takes a long time (about 0.08 microsec, which is about 70% of the time of Ubuntu's whole getppid() syscall) in the kernel. So we wonder if there is a problem with the qemu or kvm side settings.
+
+First, the measured performance results for lmbench's "lat_syscall null", which internally executes getppid(), are:
+ - rv6, Rust opt-level 1, smp 3 (qemu), gcc optimization level -O -> average 1.662 microsec
+ - ubuntu, smp 3, gcc optimization level -O -> average 0.126 microsec
+So rv6 is more than 10x slower than ubuntu.
+
+To find the bottleneck in rv6, we used [linux perf](https://perf.wiki.kernel.org/index.php/Main_Page) and divided the execution path into 4 stages. <br>
+Stage 1: from calling getppid in user space until just before the trap handler is called<br>
+Stage 2: from the end of stage 1 until just before the start of code specific to sys_getppid<br>
+Stage 3: from the end of stage 2 to the [end of the actual sys_getppid function](https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/proc/procs.rs#L468-L473)<br>
+Stage 4: from the end of stage 3 to the point where the getppid syscall returns to user space<br>
+The result with perf was:
+ - ubuntu: 0.042 microsec / 0.0744 microsec / 0.00985 microsec / 0 -> total 0.126 microsec
+ - rv6: ? / ? / 0.3687 microsec / ? -> total 1.662 microsec
+ - We assumed ubuntu's stage 4 time is zero.
+ - The question marks are because we couldn't use perf on rv6, so only the stage 3 time is measured right now; we checked the stage 3 part manually.
+
+So from the result, we can confirm that rv6's stage 3 alone already consumes more than 3 times ubuntu's total syscall time, and at least 30 times more than ubuntu's stage 3.
+This is very bad, so we tried several things to inspect the problem:
+ - Check whether rv6's timer interrupt affects the execution time: the interval is 100ms, which is large, so it seems unrelated.
+ - To check user-space execution speed, we made a simple quicksort program and checked whether rv6's user space is significantly slower than ubuntu's.
+   - When running 100,000 times, rv6 (smp 1, opt-level 1) took 3.2s vs ubuntu (smp 1) 2.7s.
+   - Although it is 20% slower, we judged this to be almost no difference compared to the lmbench result, so it is not a big problem.
+
+ - Next we checked the code of rv6's stage 3: https://github.com/kaist-cp/rv6/blob/main/kernel-rs/src/proc/procs.rs#L468
+   - The lock is held twice, at line 469 and line 472, whereas in ubuntu's equivalent code the lock is held only once. So we noted that restructuring the code to hold the lock only once should already improve speed.
+   - **There is also a big problem at line 470.** We measured the time spent on line 470 with the CNTPCT_EL0 register and found that at least 0.08 microsec is consumed on that line.
+   - So ubuntu's stage 3 consumes about 0.01 microsec, but line 470 of rv6 alone, which has no complicated logic (and holds no lock), consumes about 8 times ubuntu's entire stage 3.
+   - So we concluded that there may be a problem with the kvm settings on the kernel side, or with other settings.
+
+So, do you have any idea about this problem? Thank you for your help.
diff --git a/results/classifier/gemma3:12b/performance/973 b/results/classifier/gemma3:12b/performance/973
new file mode 100644
index 00000000..20f3d452
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/973
@@ -0,0 +1,20 @@
+
+qemu 6.2 memory leak when a guest fails to boot and reboots infinitely
+Description of problem:
+qemu allocates tons of memory (very likely a memory leak) in certain (rare) cases.
+
+When I misconfigured qemu so that a bigger linux kernel ran with insufficient memory (for example an 8M bzImage with 16M of RAM and no hdd), the kernel obviously failed to boot. In this case qemu reboots (likely the linux kernel reboots). However, rebooting does not solve the problem, so qemu reboots repeatedly.
+
+Memory usage of qemu rises sharply in the process.
+Steps to reproduce:
+1. Get any linux kernel (tested with 5.15.33)
+2. Run the kernel on qemu with less memory than necessary
+Additional information:
+A reproducing dockerfile:
+```
+FROM alpine:3.15
+
+RUN apk add qemu-system-x86_64 linux-virt
+
+CMD ["/usr/bin/qemu-system-x86_64", "-kernel", "/boot/vmlinuz-virt", "-nographic", "-net", "none", "-m", "16M"]
+```
diff --git a/results/classifier/gemma3:12b/performance/985 b/results/classifier/gemma3:12b/performance/985
new file mode 100644
index 00000000..560c69e0
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/985
@@ -0,0 +1,60 @@
+
+pkg_add is working very slowly on NetBSD
+Description of problem:
+pkg_add is working very slowly; it installs one package in ~30 minutes although network speed is normal.
+Steps to reproduce:
+1. `wget https://cdn.netbsd.org/pub/NetBSD/NetBSD-9.2/images/NetBSD-9.2-amd64.iso`
+2. `qemu-img create -f qcow2 disk.qcow2 15G`
+3. Install
+```
+qemu-system-x86_64 -m 2048 -enable-kvm \
+    -drive if=virtio,file=disk.qcow2,format=qcow2 \
+    -netdev user,id=mynet0,hostfwd=tcp::7722-:22 \
+    -device e1000,netdev=mynet0 \
+    -cdrom NetBSD-9.2-amd64.iso
+```
+   # Installation steps
+   - 1) Boot Normally
+   - a) Installation messages in English
+   - a) unchanged
+   - a) Install NetBSD to hard disk
+   - b) Yes
+   - a) 15G
+   - a) GPT
+   - a) This is the correct geometry
+   - b) Use default partition sizes
+   - x) Partition sizes are ok
+   - b) Yes
+   - a) Use BIOS console
+   - b) Installation without X11
+   - a) CD-ROM / DVD / install image media
+   - Hit enter to continue
+   - a) configure network (Select defaults here, perform autoconf)
+   - x) Finished configuring
+   - Hit enter to continue
+   - x) Exit Install System
+   - Close QEMU
+4. Run
+```
+qemu-system-x86_64 -m 2048 \
+    -drive if=virtio,file=disk.qcow2,format=qcow2 \
+    -enable-kvm \
+    -netdev user,id=mynet0,hostfwd=tcp:127.0.0.1:7722-:22 \
+    -device e1000,netdev=mynet0
+```
+5. Login as root
+6. In NetBSD
+```
+export PKG_PATH="http://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/$(uname -p)/$(uname -r)/All/" && \
+pkg_add pkgin
+```
+You should see that each package's installation takes ~30 minutes.
+Additional information:
+NetBSD 9.2 was also tested on Debian 11 with QEMU 6.2.0 and showed the same slowness.
+
+NetBSD 7.1 and 8.1 were tested on openSUSE Tumbleweed and showed the same slowness.
+
+OpenBSD's pkg_add is working correctly.
+
+I am not sure if it helps, but VirtualBox (at least 6.1) is working correctly.
diff --git a/results/classifier/gemma3:12b/performance/997 b/results/classifier/gemma3:12b/performance/997
new file mode 100644
index 00000000..19130d2b
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/997
@@ -0,0 +1,18 @@
+
+Iothread is stuck at 100% CPU usage with virtio-scsi on QEMU 7.0.0
+Description of problem:
+Starting with QEMU 7.0.0, the iothread attached to a virtio-scsi controller is stuck at 100% CPU usage. Bisected to: https://gitlab.com/qemu-project/qemu/-/commit/826cc32423db2a99d184dbf4f507c737d7e7a4ae
+
+- Works as expected without the iothread
+- No issue with virtio-blk + iothread
+- Same behavior regardless of io=threads/native/io_uring
+- Same behavior with default vs increased queue count
+- The issue is triggered when the guest OS initializes the virtio driver
+Steps to reproduce:
+1. Add a virtio-scsi controller with an iothread
+2. Boot the VM
+3. Check per-thread CPU usage, e.g. in htop
+Additional information:
+[fedora.log](/uploads/776fbf8e5b823d0ab326946684ef9022/fedora.log)
+
+[fedora.xml](/uploads/54879e5adfb227ddef79d382e86fc608/fedora.xml)
diff --git a/results/classifier/gemma3:12b/performance/997631 b/results/classifier/gemma3:12b/performance/997631
new file mode 100644
index 00000000..5de4ee00
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/997631
@@ -0,0 +1,18 @@
+
+Windows 2008R2 very slow cold boot with 4 CPUs
+
+Hi,
+
+well, I'm in a similar boat as the reporter of #992067, but regardless of any memory settings.
+A cold boot of Windows 2008R2 takes "ages" with qemu-1.0.1, qemu-1.0.50 and the latest-and-greatest from today (1.0.50 / qemu-1b3e76e). It eats up 400% host CPU load until the login prompt is shown on the console.
+
+Meanwhile I tried a couple of settings with "-cpu" features (hv_spinlocks, hv_relaxed and hv_vapic).
+Due to some clock glitches I start qemu-system-x86_64 with "-no-hpet".
+
+With 2 processors the system is up after 2 minutes, with 4 procs it takes almost 10 minutes... After a reset (warm start) the 4-proc system is up in a couple of 20 secs.
+
+Hints welcome, though once started, the system seems to operate "normally".
+
+Thnx in@vance,
+
+Oliver.
\ No newline at end of file
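Several of the reports above (997, 959, 878019, 950692) diagnose a busy qemu thread with top or htop. On Linux those tools derive per-thread CPU usage from `/proc/<pid>/task/<tid>/stat`; the short illustrative Python sketch below (not part of any report) parses the utime/stime tick counters from such a line, which is roughly what such tools read under the hood:

```python
def thread_cpu_ticks(stat_line: str) -> tuple:
    """Extract (utime, stime) clock ticks from a /proc/<pid>/task/<tid>/stat line.

    The comm field (field 2) may itself contain spaces or parentheses,
    so split at the LAST ')' first; utime and stime are fields 14 and 15
    overall, i.e. indices 11 and 12 of the remainder starting at field 3.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]), int(rest[12])
```

Sampling these counters twice over a known interval and dividing the delta by `os.sysconf('SC_CLK_TCK')` yields the per-thread CPU percentage that htop displays, which is how an iothread pinned at 100% (as in report 997) shows up.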