path: root/results/classifier/016/virtual
author Christian Krinitsin <mail@krinitsin.com> 2025-07-03 19:39:53 +0200
committer Christian Krinitsin <mail@krinitsin.com> 2025-07-03 19:39:53 +0200
commit dee4dcba78baf712cab403d47d9db319ab7f95d6 (patch)
tree 418478faf06786701a56268672f73d6b0b4eb239 /results/classifier/016/virtual
parent 4d9e26c0333abd39bdbd039dcdb30ed429c475ba (diff)
restructure results
Diffstat (limited to 'results/classifier/016/virtual')
-rw-r--r-- results/classifier/016/virtual/04472277 | 603
-rw-r--r-- results/classifier/016/virtual/16201167 | 127
-rw-r--r-- results/classifier/016/virtual/24190340 | 2083
-rw-r--r-- results/classifier/016/virtual/24930826 | 60
-rw-r--r-- results/classifier/016/virtual/25892827 | 1104
-rw-r--r-- results/classifier/016/virtual/35170175 | 548
-rw-r--r-- results/classifier/016/virtual/36568044 | 4608
-rw-r--r-- results/classifier/016/virtual/46572227 | 433
-rw-r--r-- results/classifier/016/virtual/53568181 | 105
-rw-r--r-- results/classifier/016/virtual/57231878 | 269
-rw-r--r-- results/classifier/016/virtual/67821138 | 226
-rw-r--r-- results/classifier/016/virtual/70021271 | 7475
-rw-r--r-- results/classifier/016/virtual/70416488 | 1206
-rw-r--r-- results/classifier/016/virtual/74466963 | 1905
-rw-r--r-- results/classifier/016/virtual/79834768 | 436
15 files changed, 0 insertions, 21188 deletions
diff --git a/results/classifier/016/virtual/04472277 b/results/classifier/016/virtual/04472277
deleted file mode 100644
index 307fd76c..00000000
--- a/results/classifier/016/virtual/04472277
+++ /dev/null
@@ -1,603 +0,0 @@
-virtual: 0.939
-KVM: 0.879
-x86: 0.774
-debug: 0.772
-files: 0.742
-hypervisor: 0.710
-user-level: 0.641
-operating system: 0.244
-boot: 0.068
-kernel: 0.045
-PID: 0.025
-performance: 0.024
-TCG: 0.022
-VMM: 0.019
-register: 0.018
-socket: 0.017
-semantic: 0.015
-device: 0.010
-risc-v: 0.010
-network: 0.007
-architecture: 0.006
-ppc: 0.006
-alpha: 0.005
-graphic: 0.003
-assembly: 0.003
-vnc: 0.003
-peripherals: 0.002
-permissions: 0.002
-i386: 0.001
-arm: 0.001
-mistranslation: 0.001
-
-[BUG][KVM_SET_USER_MEMORY_REGION] KVM_SET_USER_MEMORY_REGION failed
-
-Hi all,
-I start a VM in openstack, and openstack use libvirt to start qemu VM, but now log show this ERROR.
-Is there any one know this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log
-```
-2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-14 10:09:18.198+0000: shutting down, reason=crashed
-```
-The xml file
-```
-root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml
-<!--
-WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
-OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
-  virsh edit instance-0000000e
-or other application using the libvirt API.
--->
-<domain type='kvm'>
-  <name>instance-0000000e</name>
-  <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid>
-  <metadata>
-    <nova:instance xmlns:nova="
-http://openstack.org/xmlns/libvirt/nova/1.1
-">
-      <nova:package version="25.1.0"/>
-      <nova:name>provider-instance</nova:name>
-      <nova:creationTime>2023-03-14 10:09:13</nova:creationTime>
-      <nova:flavor name="cirros-os-dpu-test-1">
-        <nova:memory>64</nova:memory>
-        <nova:disk>1</nova:disk>
-        <nova:swap>0</nova:swap>
-        <nova:ephemeral>0</nova:ephemeral>
-        <nova:vcpus>1</nova:vcpus>
-      </nova:flavor>
-      <nova:owner>
-        <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user>
-        <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project>
-      </nova:owner>
-      <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/>
-      <nova:ports>
-        <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340">
-          <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/>
-        </nova:port>
-      </nova:ports>
-    </nova:instance>
-  </metadata>
-  <memory unit='KiB'>65536</memory>
-  <currentMemory unit='KiB'>65536</currentMemory>
-  <vcpu placement='static'>1</vcpu>
-  <sysinfo type='smbios'>
-    <system>
-      <entry name='manufacturer'>OpenStack Foundation</entry>
-      <entry name='product'>OpenStack Nova</entry>
-      <entry name='version'>25.1.0</entry>
-      <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='family'>Virtual Machine</entry>
-    </system>
-  </sysinfo>
-  <os>
-    <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type>
-    <boot dev='hd'/>
-    <smbios mode='sysinfo'/>
-  </os>
-  <features>
-    <acpi/>
-    <apic/>
-    <vmcoreinfo state='on'/>
-  </features>
-  <cpu mode='host-model' check='partial'>
-    <topology sockets='1' dies='1' cores='1' threads='1'/>
-  </cpu>
-  <clock offset='utc'>
-    <timer name='pit' tickpolicy='delay'/>
-    <timer name='rtc' tickpolicy='catchup'/>
-    <timer name='hpet' present='no'/>
-  </clock>
-  <on_poweroff>destroy</on_poweroff>
-  <on_reboot>restart</on_reboot>
-  <on_crash>destroy</on_crash>
-  <devices>
-    <emulator>/usr/bin/qemu-system-x86_64</emulator>
-    <disk type='file' device='disk'>
-      <driver name='qemu' type='qcow2' cache='none'/>
-      <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/>
-      <target dev='vda' bus='virtio'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
-    </disk>
-    <controller type='usb' index='0' model='piix3-uhci'>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
-    </controller>
-    <controller type='pci' index='0' model='pci-root'/>
-    <interface type='hostdev' managed='yes'>
-      <mac address='fa:16:3e:aa:d9:23'/>
-      <source>
-        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
-    </interface>
-    <serial type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='isa-serial' port='0'>
-        <model name='isa-serial'/>
-      </target>
-    </serial>
-    <console type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='serial' port='0'/>
-    </console>
-    <input type='tablet' bus='usb'>
-      <address type='usb' bus='0' port='1'/>
-    </input>
-    <input type='mouse' bus='ps2'/>
-    <input type='keyboard' bus='ps2'/>
-    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
-      <listen type='address' address='0.0.0.0'/>
-    </graphics>
-    <audio id='1' type='none'/>
-    <video>
-      <model type='virtio' heads='1' primary='yes'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
-    </video>
-    <hostdev mode='subsystem' type='pci' managed='yes'>
-      <source>
-        <address domain='0x0000' bus='0x01' slot='0x00' function='0x6'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
-    </hostdev>
-    <memballoon model='virtio'>
-      <stats period='10'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-    </memballoon>
-    <rng model='virtio'>
-      <backend model='random'>/dev/urandom</backend>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
-    </rng>
-  </devices>
-</domain>
-```
-----
-Simon Jones
-
-This is happened in ubuntu22.04.
-QEMU is install by apt like this:
-apt install -y qemu qemu-kvm qemu-system
-and QEMU version is 6.2.0
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Tue, 21 Mar 2023 at 08:40:
-Hi all,
-I start a VM in openstack, and openstack use libvirt to start qemu VM, but now log show this ERROR.
-Is there any one know this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log
-```
-2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-14 10:09:18.198+0000: shutting down, reason=crashed
-```
-The xml file
-```
-root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml
-<!--
-WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
-OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
-  virsh edit instance-0000000e
-or other application using the libvirt API.
--->
-<domain type='kvm'>
-  <name>instance-0000000e</name>
-  <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid>
-  <metadata>
-    <nova:instance xmlns:nova="
-http://openstack.org/xmlns/libvirt/nova/1.1
-">
-      <nova:package version="25.1.0"/>
-      <nova:name>provider-instance</nova:name>
-      <nova:creationTime>2023-03-14 10:09:13</nova:creationTime>
-      <nova:flavor name="cirros-os-dpu-test-1">
-        <nova:memory>64</nova:memory>
-        <nova:disk>1</nova:disk>
-        <nova:swap>0</nova:swap>
-        <nova:ephemeral>0</nova:ephemeral>
-        <nova:vcpus>1</nova:vcpus>
-      </nova:flavor>
-      <nova:owner>
-        <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user>
-        <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project>
-      </nova:owner>
-      <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/>
-      <nova:ports>
-        <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340">
-          <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/>
-        </nova:port>
-      </nova:ports>
-    </nova:instance>
-  </metadata>
-  <memory unit='KiB'>65536</memory>
-  <currentMemory unit='KiB'>65536</currentMemory>
-  <vcpu placement='static'>1</vcpu>
-  <sysinfo type='smbios'>
-    <system>
-      <entry name='manufacturer'>OpenStack Foundation</entry>
-      <entry name='product'>OpenStack Nova</entry>
-      <entry name='version'>25.1.0</entry>
-      <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='family'>Virtual Machine</entry>
-    </system>
-  </sysinfo>
-  <os>
-    <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type>
-    <boot dev='hd'/>
-    <smbios mode='sysinfo'/>
-  </os>
-  <features>
-    <acpi/>
-    <apic/>
-    <vmcoreinfo state='on'/>
-  </features>
-  <cpu mode='host-model' check='partial'>
-    <topology sockets='1' dies='1' cores='1' threads='1'/>
-  </cpu>
-  <clock offset='utc'>
-    <timer name='pit' tickpolicy='delay'/>
-    <timer name='rtc' tickpolicy='catchup'/>
-    <timer name='hpet' present='no'/>
-  </clock>
-  <on_poweroff>destroy</on_poweroff>
-  <on_reboot>restart</on_reboot>
-  <on_crash>destroy</on_crash>
-  <devices>
-    <emulator>/usr/bin/qemu-system-x86_64</emulator>
-    <disk type='file' device='disk'>
-      <driver name='qemu' type='qcow2' cache='none'/>
-      <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/>
-      <target dev='vda' bus='virtio'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
-    </disk>
-    <controller type='usb' index='0' model='piix3-uhci'>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
-    </controller>
-    <controller type='pci' index='0' model='pci-root'/>
-    <interface type='hostdev' managed='yes'>
-      <mac address='fa:16:3e:aa:d9:23'/>
-      <source>
-        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
-    </interface>
-    <serial type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='isa-serial' port='0'>
-        <model name='isa-serial'/>
-      </target>
-    </serial>
-    <console type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='serial' port='0'/>
-    </console>
-    <input type='tablet' bus='usb'>
-      <address type='usb' bus='0' port='1'/>
-    </input>
-    <input type='mouse' bus='ps2'/>
-    <input type='keyboard' bus='ps2'/>
-    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
-      <listen type='address' address='0.0.0.0'/>
-    </graphics>
-    <audio id='1' type='none'/>
-    <video>
-      <model type='virtio' heads='1' primary='yes'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
-    </video>
-    <hostdev mode='subsystem' type='pci' managed='yes'>
-      <source>
-        <address domain='0x0000' bus='0x01' slot='0x00' function='0x6'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
-    </hostdev>
-    <memballoon model='virtio'>
-      <stats period='10'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-    </memballoon>
-    <rng model='virtio'>
-      <backend model='random'>/dev/urandom</backend>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
-    </rng>
-  </devices>
-</domain>
-```
-----
-Simon Jones
-
-This is full ERROR log
-2023-03-23 08:00:52.362+0000: starting up libvirt version: 8.0.0, package: 1ubuntu7.4 (Christian Ehrhardt <
-christian.ehrhardt@canonical.com
-> Tue, 22 Nov 2022 15:59:28 +0100), qemu version: 6.2.0Debian 1:6.2+dfsg-2ubuntu6.6, kernel: 5.19.0-35-generic, hostname: c1c2
-LC_ALL=C \
-PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin \
-HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e \
-XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.local/share \
-XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.cache \
-XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-4-instance-0000000e/.config \
-/usr/bin/qemu-system-x86_64 \
--name guest=instance-0000000e,debug-threads=on \
--S \
--object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-instance-0000000e/master-key.aes"}' \
--machine pc-i440fx-6.2,usb=off,dump-guest-core=off,memory-backend=pc.ram \
--accel kvm \
--cpu Cooperlake,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,sha-ni=on,umip=on,waitpkg=on,gfni=on,vaes=on,vpclmulqdq=on,rdpid=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,avx-vnni=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,hle=off,rtm=off,avx512f=off,avx512dq=off,avx512cd=off,avx512bw=off,avx512vl=off,avx512vnni=off,avx512-bf16=off,taa-no=off \
--m 64 \
--object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":67108864}' \
--overcommit mem-lock=off \
--smp 1,sockets=1,dies=1,cores=1,threads=1 \
--uuid ff91d2dc-69a1-43ef-abde-c9e4e9a0305b \
--smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=25.1.0,serial=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,uuid=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,family=Virtual Machine' \
--no-user-config \
--nodefaults \
--chardev socket,id=charmonitor,fd=33,server=on,wait=off \
--mon chardev=charmonitor,id=monitor,mode=control \
--rtc base=utc,driftfix=slew \
--global kvm-pit.lost_tick_policy=delay \
--no-hpet \
--no-shutdown \
--boot strict=on \
--device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
--blockdev '{"driver":"file","filename":"/var/lib/nova/instances/_base/8b58db82a488248e7c5e769599954adaa47a5314","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
--blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \
--blockdev '{"driver":"file","filename":"/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
--blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \
--device virtio-blk-pci,bus=pci.0,addr=0x3,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \
--add-fd set=1,fd=34 \
--chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on \
--device isa-serial,chardev=charserial0,id=serial0 \
--device usb-tablet,id=input0,bus=usb.0,port=1 \
--audiodev '{"id":"audio1","driver":"none"}' \
--vnc
-0.0.0.0:0
-,audiodev=audio1 \
--device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x2 \
--device vfio-pci,host=0000:01:00.5,id=hostdev0,bus=pci.0,addr=0x4 \
--device vfio-pci,host=0000:01:00.6,id=hostdev1,bus=pci.0,addr=0x5 \
--device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
--object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
--device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 \
--device vmcoreinfo \
--sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
--msg timestamp=on
-char device redirected to /dev/pts/3 (label charserial0)
-2023-03-23T08:00:53.728550Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-23 08:00:54.201+0000: shutting down, reason=crashed
-2023-03-23 08:54:43.468+0000: starting up libvirt version: 8.0.0, package: 1ubuntu7.4 (Christian Ehrhardt <
-christian.ehrhardt@canonical.com
-> Tue, 22 Nov 2022 15:59:28 +0100), qemu version: 6.2.0Debian 1:6.2+dfsg-2ubuntu6.6, kernel: 5.19.0-35-generic, hostname: c1c2
-LC_ALL=C \
-PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin \
-HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e \
-XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.local/share \
-XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.cache \
-XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-5-instance-0000000e/.config \
-/usr/bin/qemu-system-x86_64 \
--name guest=instance-0000000e,debug-threads=on \
--S \
--object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-5-instance-0000000e/master-key.aes"}' \
--machine pc-i440fx-6.2,usb=off,dump-guest-core=off,memory-backend=pc.ram \
--accel kvm \
--cpu Cooperlake,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,sha-ni=on,umip=on,waitpkg=on,gfni=on,vaes=on,vpclmulqdq=on,rdpid=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,avx-vnni=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,hle=off,rtm=off,avx512f=off,avx512dq=off,avx512cd=off,avx512bw=off,avx512vl=off,avx512vnni=off,avx512-bf16=off,taa-no=off \
--m 64 \
--object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":67108864}' \
--overcommit mem-lock=off \
--smp 1,sockets=1,dies=1,cores=1,threads=1 \
--uuid ff91d2dc-69a1-43ef-abde-c9e4e9a0305b \
--smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=25.1.0,serial=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,uuid=ff91d2dc-69a1-43ef-abde-c9e4e9a0305b,family=Virtual Machine' \
--no-user-config \
--nodefaults \
--chardev socket,id=charmonitor,fd=33,server=on,wait=off \
--mon chardev=charmonitor,id=monitor,mode=control \
--rtc base=utc,driftfix=slew \
--global kvm-pit.lost_tick_policy=delay \
--no-hpet \
--no-shutdown \
--boot strict=on \
--device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
--blockdev '{"driver":"file","filename":"/var/lib/nova/instances/_base/8b58db82a488248e7c5e769599954adaa47a5314","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
--blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \
--blockdev '{"driver":"file","filename":"/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
--blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \
--device virtio-blk-pci,bus=pci.0,addr=0x3,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \
--add-fd set=1,fd=34 \
--chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on \
--device isa-serial,chardev=charserial0,id=serial0 \
--device usb-tablet,id=input0,bus=usb.0,port=1 \
--audiodev '{"id":"audio1","driver":"none"}' \
--vnc
-0.0.0.0:0
-,audiodev=audio1 \
--device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x2 \
--device vfio-pci,host=0000:01:00.5,id=hostdev0,bus=pci.0,addr=0x4 \
--device vfio-pci,host=0000:01:00.6,id=hostdev1,bus=pci.0,addr=0x5 \
--device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
--object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
--device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 \
--device vmcoreinfo \
--sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
--msg timestamp=on
-char device redirected to /dev/pts/3 (label charserial0)
-2023-03-23T08:54:44.755039Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-23 08:54:45.230+0000: shutting down, reason=crashed
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Thu, 23 Mar 2023 at 05:49:
-This is happened in ubuntu22.04.
-QEMU is install by apt like this:
-apt install -y qemu qemu-kvm qemu-system
-and QEMU version is 6.2.0
-----
-Simon Jones
-Simon Jones <
-batmanustc@gmail.com
-> wrote on Tue, 21 Mar 2023 at 08:40:
-Hi all,
-I start a VM in openstack, and openstack use libvirt to start qemu VM, but now log show this ERROR.
-Is there any one know this?
-The ERROR log from /var/log/libvirt/qemu/instance-0000000e.log
-```
-2023-03-14T10:09:17.674114Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xfffffffffe000000, size=0x2000: Invalid argument
-kvm_set_phys_mem: error registering slot: Invalid argument
-2023-03-14 10:09:18.198+0000: shutting down, reason=crashed
-```
-The xml file
-```
-root@c1c2:~# cat /etc/libvirt/qemu/instance-0000000e.xml
-<!--
-WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
-OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
-  virsh edit instance-0000000e
-or other application using the libvirt API.
--->
-<domain type='kvm'>
-  <name>instance-0000000e</name>
-  <uuid>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</uuid>
-  <metadata>
-    <nova:instance xmlns:nova="
-http://openstack.org/xmlns/libvirt/nova/1.1
-">
-      <nova:package version="25.1.0"/>
-      <nova:name>provider-instance</nova:name>
-      <nova:creationTime>2023-03-14 10:09:13</nova:creationTime>
-      <nova:flavor name="cirros-os-dpu-test-1">
-        <nova:memory>64</nova:memory>
-        <nova:disk>1</nova:disk>
-        <nova:swap>0</nova:swap>
-        <nova:ephemeral>0</nova:ephemeral>
-        <nova:vcpus>1</nova:vcpus>
-      </nova:flavor>
-      <nova:owner>
-        <nova:user uuid="ff627ad39ed94479b9c5033bc462cf78">admin</nova:user>
-        <nova:project uuid="512866f9994f4ad8916d8539a7cdeec9">admin</nova:project>
-      </nova:owner>
-      <nova:root type="image" uuid="9e58cb69-316a-4093-9f23-c1d1bd8edffe"/>
-      <nova:ports>
-        <nova:port uuid="77c1dc00-af39-4463-bea0-12808f4bc340">
-          <nova:ip type="fixed" address="172.1.1.43" ipVersion="4"/>
-        </nova:port>
-      </nova:ports>
-    </nova:instance>
-  </metadata>
-  <memory unit='KiB'>65536</memory>
-  <currentMemory unit='KiB'>65536</currentMemory>
-  <vcpu placement='static'>1</vcpu>
-  <sysinfo type='smbios'>
-    <system>
-      <entry name='manufacturer'>OpenStack Foundation</entry>
-      <entry name='product'>OpenStack Nova</entry>
-      <entry name='version'>25.1.0</entry>
-      <entry name='serial'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='uuid'>ff91d2dc-69a1-43ef-abde-c9e4e9a0305b</entry>
-      <entry name='family'>Virtual Machine</entry>
-    </system>
-  </sysinfo>
-  <os>
-    <type arch='x86_64' machine='pc-i440fx-6.2'>hvm</type>
-    <boot dev='hd'/>
-    <smbios mode='sysinfo'/>
-  </os>
-  <features>
-    <acpi/>
-    <apic/>
-    <vmcoreinfo state='on'/>
-  </features>
-  <cpu mode='host-model' check='partial'>
-    <topology sockets='1' dies='1' cores='1' threads='1'/>
-  </cpu>
-  <clock offset='utc'>
-    <timer name='pit' tickpolicy='delay'/>
-    <timer name='rtc' tickpolicy='catchup'/>
-    <timer name='hpet' present='no'/>
-  </clock>
-  <on_poweroff>destroy</on_poweroff>
-  <on_reboot>restart</on_reboot>
-  <on_crash>destroy</on_crash>
-  <devices>
-    <emulator>/usr/bin/qemu-system-x86_64</emulator>
-    <disk type='file' device='disk'>
-      <driver name='qemu' type='qcow2' cache='none'/>
-      <source file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/disk'/>
-      <target dev='vda' bus='virtio'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
-    </disk>
-    <controller type='usb' index='0' model='piix3-uhci'>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
-    </controller>
-    <controller type='pci' index='0' model='pci-root'/>
-    <interface type='hostdev' managed='yes'>
-      <mac address='fa:16:3e:aa:d9:23'/>
-      <source>
-        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x5'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
-    </interface>
-    <serial type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='isa-serial' port='0'>
-        <model name='isa-serial'/>
-      </target>
-    </serial>
-    <console type='pty'>
-      <log file='/var/lib/nova/instances/ff91d2dc-69a1-43ef-abde-c9e4e9a0305b/console.log' append='off'/>
-      <target type='serial' port='0'/>
-    </console>
-    <input type='tablet' bus='usb'>
-      <address type='usb' bus='0' port='1'/>
-    </input>
-    <input type='mouse' bus='ps2'/>
-    <input type='keyboard' bus='ps2'/>
-    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
-      <listen type='address' address='0.0.0.0'/>
-    </graphics>
-    <audio id='1' type='none'/>
-    <video>
-      <model type='virtio' heads='1' primary='yes'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
-    </video>
-    <hostdev mode='subsystem' type='pci' managed='yes'>
-      <source>
-        <address domain='0x0000' bus='0x01' slot='0x00' function='0x6'/>
-      </source>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
-    </hostdev>
-    <memballoon model='virtio'>
-      <stats period='10'/>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
-    </memballoon>
-    <rng model='virtio'>
-      <backend model='random'>/dev/urandom</backend>
-      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
-    </rng>
-  </devices>
-</domain>
-```
-----
-Simon Jones
-
diff --git a/results/classifier/016/virtual/16201167 b/results/classifier/016/virtual/16201167
deleted file mode 100644
index f1cb262a..00000000
--- a/results/classifier/016/virtual/16201167
+++ /dev/null
@@ -1,127 +0,0 @@
-virtual: 0.917
-KVM: 0.915
-hypervisor: 0.764
-debug: 0.653
-kernel: 0.598
-operating system: 0.556
-assembly: 0.306
-performance: 0.135
-TCG: 0.109
-PID: 0.107
-files: 0.093
-x86: 0.074
-VMM: 0.045
-register: 0.033
-device: 0.025
-risc-v: 0.018
-user-level: 0.017
-i386: 0.016
-ppc: 0.015
-arm: 0.010
-semantic: 0.009
-architecture: 0.009
-boot: 0.006
-alpha: 0.006
-network: 0.004
-socket: 0.003
-vnc: 0.003
-graphic: 0.003
-peripherals: 0.002
-permissions: 0.001
-mistranslation: 0.000
-
-[BUG] Qemu abort with error "kvm_mem_ioeventfd_add: error adding ioeventfd: File exists (17)"
-
-Hi list,
-
-When I did some tests in my virtual domain with live-attached virtio deivces, I 
-got a coredump file of Qemu.
-
-The error print from qemu is "kvm_mem_ioeventfd_add: error adding ioeventfd: 
-File exists (17)".
-And the call trace in the coredump file displays as below:
-#0  0x0000ffff89acecc8 in ?? () from /usr/lib64/libc.so.6
-#1  0x0000ffff89a8acbc in raise () from /usr/lib64/libc.so.6
-#2  0x0000ffff89a78d2c in abort () from /usr/lib64/libc.so.6
-#3  0x0000aaaabd7ccf1c in kvm_mem_ioeventfd_add (listener=<optimized out>, 
-section=<optimized out>, match_data=<optimized out>, data=<optimized out>, 
-e=<optimized out>) at ../accel/kvm/kvm-all.c:1607
-#4  0x0000aaaabd6e0304 in address_space_add_del_ioeventfds (fds_old_nb=164, 
-fds_old=0xffff5c80a1d0, fds_new_nb=160, fds_new=0xffff5c565080, 
-as=0xaaaabdfa8810 <address_space_memory>)
-    at ../softmmu/memory.c:795
-#5  address_space_update_ioeventfds (as=0xaaaabdfa8810 <address_space_memory>) 
-at ../softmmu/memory.c:856
-#6  0x0000aaaabd6e24d8 in memory_region_commit () at ../softmmu/memory.c:1113
-#7  0x0000aaaabd6e25c4 in memory_region_transaction_commit () at 
-../softmmu/memory.c:1144
-#8  0x0000aaaabd394eb4 in pci_bridge_update_mappings 
-(br=br@entry=0xaaaae755f7c0) at ../hw/pci/pci_bridge.c:248
-#9  0x0000aaaabd394f4c in pci_bridge_write_config (d=0xaaaae755f7c0, 
-address=44, val=<optimized out>, len=4) at ../hw/pci/pci_bridge.c:272
-#10 0x0000aaaabd39a928 in rp_write_config (d=0xaaaae755f7c0, address=44, 
-val=128, len=4) at ../hw/pci-bridge/pcie_root_port.c:39
-#11 0x0000aaaabd6df328 in memory_region_write_accessor (mr=0xaaaae63898d0, 
-addr=65580, value=<optimized out>, size=4, shift=<optimized out>, 
-mask=<optimized out>, attrs=...) at ../softmmu/memory.c:494
-#12 0x0000aaaabd6dcb6c in access_with_adjusted_size (addr=addr@entry=65580, 
-value=value@entry=0xffff817adc78, size=size@entry=4, access_size_min=<optimized 
-out>, access_size_max=<optimized out>,
-    access_fn=access_fn@entry=0xaaaabd6df284 <memory_region_write_accessor>, 
-mr=mr@entry=0xaaaae63898d0, attrs=attrs@entry=...) at ../softmmu/memory.c:556
-#13 0x0000aaaabd6e0dc8 in memory_region_dispatch_write 
-(mr=mr@entry=0xaaaae63898d0, addr=65580, data=<optimized out>, op=MO_32, 
-attrs=attrs@entry=...) at ../softmmu/memory.c:1534
-#14 0x0000aaaabd6d0574 in flatview_write_continue (fv=fv@entry=0xffff5c02da00, 
-addr=addr@entry=275146407980, attrs=attrs@entry=..., 
-ptr=ptr@entry=0xffff8aa8c028, len=len@entry=4,
-    addr1=<optimized out>, l=<optimized out>, mr=mr@entry=0xaaaae63898d0) at 
-/usr/src/debug/qemu-6.2.0-226.aarch64/include/qemu/host-utils.h:165
-#15 0x0000aaaabd6d4584 in flatview_write (len=4, buf=0xffff8aa8c028, attrs=..., 
-addr=275146407980, fv=0xffff5c02da00) at ../softmmu/physmem.c:3375
-#16 address_space_write (as=<optimized out>, addr=275146407980, attrs=..., 
-buf=buf@entry=0xffff8aa8c028, len=4) at ../softmmu/physmem.c:3467
-#17 0x0000aaaabd6d462c in address_space_rw (as=<optimized out>, addr=<optimized 
-out>, attrs=..., attrs@entry=..., buf=buf@entry=0xffff8aa8c028, len=<optimized 
-out>, is_write=<optimized out>)
-    at ../softmmu/physmem.c:3477
-#18 0x0000aaaabd7cf6e8 in kvm_cpu_exec (cpu=cpu@entry=0xaaaae625dfd0) at 
-../accel/kvm/kvm-all.c:2970
-#19 0x0000aaaabd7d09bc in kvm_vcpu_thread_fn (arg=arg@entry=0xaaaae625dfd0) at 
-../accel/kvm/kvm-accel-ops.c:49
-#20 0x0000aaaabd94ccd8 in qemu_thread_start (args=<optimized out>) at 
-../util/qemu-thread-posix.c:559
-
-
-By printing more info from the coredump file, I found that the addresses of
-fds_old[146] and fds_new[146] are the same, but fds_old[146] belonged to a
-live-attached virtio-scsi device while fds_new[146] was owned by another
-live-attached virtio-net device.
-The reason for the address conflict was then found in the VM's console log. Just
-before qemu aborted, the guest kernel crashed and kdump.service booted the
-dump-capture kernel, which re-allocated addresses for the devices.
-Because those virtio devices were live-attached after the VM was created,
-different addresses may have been assigned to them in the dump-capture kernel:
-
-the initial kernel booting log:
-[    1.663297] pci 0000:00:02.1: BAR 14: assigned [mem 0x11900000-0x11afffff]
-[    1.664560] pci 0000:00:02.1: BAR 15: assigned [mem 
-0x8001800000-0x80019fffff 64bit pref]
-
-the dump-capture kernel booting log:
-[    1.845211] pci 0000:00:02.0: BAR 14: assigned [mem 0x11900000-0x11bfffff]
-[    1.846542] pci 0000:00:02.0: BAR 15: assigned [mem 
-0x8001800000-0x8001afffff 64bit pref]
-
-
-I think directly aborting the qemu process may not be the best choice in this
-case, because it interrupts the work of kdump.service, which then fails to
-generate a memory dump of the crashed guest kernel.
-Perhaps, IMO, the error could simply be ignored in this case, letting kdump
-reboot the system after the memory dump finishes, but I failed to find a
-suitable condition to check for in the code.
-
-Is there any solution for this problem? I hope I can get some help here.
-
-Hao
-
diff --git a/results/classifier/016/virtual/24190340 b/results/classifier/016/virtual/24190340
deleted file mode 100644
index 0fdffdcd..00000000
--- a/results/classifier/016/virtual/24190340
+++ /dev/null
@@ -1,2083 +0,0 @@
-virtual: 0.947
-debug: 0.831
-x86: 0.818
-KVM: 0.485
-kernel: 0.456
-TCG: 0.263
-operating system: 0.254
-hypervisor: 0.202
-register: 0.173
-PID: 0.069
-performance: 0.059
-socket: 0.056
-risc-v: 0.029
-VMM: 0.017
-user-level: 0.016
-device: 0.015
-network: 0.012
-files: 0.011
-assembly: 0.011
-semantic: 0.008
-vnc: 0.008
-ppc: 0.005
-architecture: 0.005
-alpha: 0.003
-graphic: 0.003
-peripherals: 0.003
-permissions: 0.002
-boot: 0.002
-i386: 0.001
-arm: 0.000
-mistranslation: 0.000
-
-[BUG, RFC] Block graph deadlock on job-dismiss
-
-Hi all,
-
-There's a bug in block layer which leads to block graph deadlock.
-Notably, it takes place when blockdev IO is processed within a separate
-iothread.
-
-This was initially caught by our tests, and I was able to reduce it to a
-relatively simple reproducer.  Such deadlocks are probably supposed to
-be covered in iotests/graph-changes-while-io, but this deadlock isn't.
-
-Basically what the reproducer does is launches QEMU with a drive having
-'iothread' option set, creates a chain of 2 snapshots, launches
-block-commit job for a snapshot and then dismisses the job, starting
-from the lower snapshot.  If the guest is issuing IO at the same time,
-there's a race in acquiring block graph lock and a potential deadlock.
-
-Here's how it can be reproduced:
-
-1. Run QEMU:
-> SRCDIR=/path/to/srcdir
->
-> $SRCDIR/build/qemu-system-x86_64 -enable-kvm \
->   -machine q35 -cpu Nehalem \
->   -name guest=alma8-vm,debug-threads=on \
->   -m 2g -smp 2 \
->   -nographic -nodefaults \
->   -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
->   -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
->   -object iothread,id=iothread0 \
->   -blockdev node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2 \
->   -device virtio-blk-pci,drive=disk,iothread=iothread0
-
-2. Launch IO (random reads) from within the guest:
-> nc -U /var/run/alma8-serial.sock
-> ...
-> [root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 --bs=4k
-> --size=1G --numjobs=1 --time_based=1 --runtime=300 --group_reporting
-> --rw=randread --iodepth=1 --filename=/testfile
-
-3. Run snapshots creation & removal of lower snapshot operation in a
-loop (script attached):
-> while /bin/true ; do ./remove_lower_snap.sh ; done
-
-And then it occasionally hangs.
-
-Note: I've tried bisecting this, and looks like deadlock occurs starting
-from the following commit:
-
-(BAD)  5bdbaebcce virtio: Re-enable notifications after drain
-(GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll
-
-On the latest v10.0.0 it does hang as well.
-
-
-Here's backtrace of the main thread:
-
-> #0  0x00007fc547d427ce in __ppoll (fds=0x557eb79657b0, nfds=1,
-> timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:43
-> #1  0x0000557eb47d955c in qemu_poll_ns (fds=0x557eb79657b0, nfds=1,
-> timeout=-1) at ../util/qemu-timer.c:329
-> #2  0x0000557eb47b2204 in fdmon_poll_wait (ctx=0x557eb76c5f20,
-> ready_list=0x7ffd94b4edd8, timeout=-1) at ../util/fdmon-poll.c:79
-> #3  0x0000557eb47b1c45 in aio_poll (ctx=0x557eb76c5f20, blocking=true) at
-> ../util/aio-posix.c:730
-> #4  0x0000557eb4621edd in bdrv_do_drained_begin (bs=0x557eb795e950,
-> parent=0x0, poll=true) at ../block/io.c:378
-> #5  0x0000557eb4621f7b in bdrv_drained_begin (bs=0x557eb795e950) at
-> ../block/io.c:391
-> #6  0x0000557eb45ec125 in bdrv_change_aio_context (bs=0x557eb795e950,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7682
-> #7  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7964250,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7608
-> #8  0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb79575e0,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7668
-> #9  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7e59110,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7608
-> #10 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb7e51960,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7668
-> #11 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb814ed80,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7608
-> #12 0x0000557eb45ee8e4 in child_job_change_aio_ctx (c=0x557eb7c9d3f0,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../blockjob.c:157
-> #13 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb7c9d3f0,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7592
-> #14 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb7d74310,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7661
-> #15 0x0000557eb45dcd7e in bdrv_child_cb_change_aio_ctx
->     (child=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 =
-> {...}, tran=0x557eb7a87160, errp=0x0) at ../block.c:1234
-> #16 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb8565af0,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7592
-> #17 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb79575e0,
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
-> errp=0x0)
->     at ../block.c:7661
-> #18 0x0000557eb45ec1f3 in bdrv_try_change_aio_context (bs=0x557eb79575e0,
-> ctx=0x557eb76c5f20, ignore_child=0x0, errp=0x0) at ../block.c:7715
-> #19 0x0000557eb45e1b15 in bdrv_root_unref_child (child=0x557eb7966f30) at
-> ../block.c:3317
-> #20 0x0000557eb45eeaa8 in block_job_remove_all_bdrv (job=0x557eb7952800) at
-> ../blockjob.c:209
-> #21 0x0000557eb45ee641 in block_job_free (job=0x557eb7952800) at
-> ../blockjob.c:82
-> #22 0x0000557eb45f17af in job_unref_locked (job=0x557eb7952800) at
-> ../job.c:474
-> #23 0x0000557eb45f257d in job_do_dismiss_locked (job=0x557eb7952800) at
-> ../job.c:771
-> #24 0x0000557eb45f25fe in job_dismiss_locked (jobptr=0x7ffd94b4f400,
-> errp=0x7ffd94b4f488) at ../job.c:783
-> #25 0x0000557eb45d8e84 in qmp_job_dismiss (id=0x557eb7aa42b0 "commit-snap1",
-> errp=0x7ffd94b4f488) at ../job-qmp.c:138
-> #26 0x0000557eb472f6a3 in qmp_marshal_job_dismiss (args=0x7fc52c00a3b0,
-> ret=0x7fc53c880da8, errp=0x7fc53c880da0) at qapi/qapi-commands-job.c:221
-> #27 0x0000557eb47a35f3 in do_qmp_dispatch_bh (opaque=0x7fc53c880e40) at
-> ../qapi/qmp-dispatch.c:128
-> #28 0x0000557eb47d1cd2 in aio_bh_call (bh=0x557eb79568f0) at
-> ../util/async.c:172
-> #29 0x0000557eb47d1df5 in aio_bh_poll (ctx=0x557eb76c0200) at
-> ../util/async.c:219
-> #30 0x0000557eb47b12f3 in aio_dispatch (ctx=0x557eb76c0200) at
-> ../util/aio-posix.c:436
-> #31 0x0000557eb47d2266 in aio_ctx_dispatch (source=0x557eb76c0200,
-> callback=0x0, user_data=0x0) at ../util/async.c:361
-> #32 0x00007fc549232f4f in g_main_dispatch (context=0x557eb76c6430) at
-> ../glib/gmain.c:3364
-> #33 g_main_context_dispatch (context=0x557eb76c6430) at ../glib/gmain.c:4079
-> #34 0x0000557eb47d3ab1 in glib_pollfds_poll () at ../util/main-loop.c:287
-> #35 0x0000557eb47d3b38 in os_host_main_loop_wait (timeout=0) at
-> ../util/main-loop.c:310
-> #36 0x0000557eb47d3c58 in main_loop_wait (nonblocking=0) at
-> ../util/main-loop.c:589
-> #37 0x0000557eb4218b01 in qemu_main_loop () at ../system/runstate.c:835
-> #38 0x0000557eb46df166 in qemu_default_main (opaque=0x0) at
-> ../system/main.c:50
-> #39 0x0000557eb46df215 in main (argc=24, argv=0x7ffd94b4f8d8) at
-> ../system/main.c:80
-
-And here's the coroutine trying to acquire the read lock:
-
-> (gdb) qemu coroutine reader_queue->entries.sqh_first
-> #0  0x0000557eb47d7068 in qemu_coroutine_switch (from_=0x557eb7aa48b0,
-> to_=0x7fc537fff508, action=COROUTINE_YIELD) at
-> ../util/coroutine-ucontext.c:321
-> #1  0x0000557eb47d4d4a in qemu_coroutine_yield () at
-> ../util/qemu-coroutine.c:339
-> #2  0x0000557eb47d56c8 in qemu_co_queue_wait_impl (queue=0x557eb59954c0
-> <reader_queue>, lock=0x7fc53c57de50, flags=0) at
-> ../util/qemu-coroutine-lock.c:60
-> #3  0x0000557eb461fea7 in bdrv_graph_co_rdlock () at ../block/graph-lock.c:231
-> #4  0x0000557eb460c81a in graph_lockable_auto_lock (x=0x7fc53c57dee3) at
-> /home/root/src/qemu/master/include/block/graph-lock.h:213
-> #5  0x0000557eb460fa41 in blk_co_do_preadv_part
->     (blk=0x557eb84c0810, offset=6890553344, bytes=4096, qiov=0x7fc530006988,
-> qiov_offset=0, flags=BDRV_REQ_REGISTERED_BUF) at ../block/block-backend.c:1339
-> #6  0x0000557eb46104d7 in blk_aio_read_entry (opaque=0x7fc530003240) at
-> ../block/block-backend.c:1619
-> #7  0x0000557eb47d6c40 in coroutine_trampoline (i0=-1213577040, i1=21886) at
-> ../util/coroutine-ucontext.c:175
-> #8  0x00007fc547c2a360 in __start_context () at
-> ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
-> #9  0x00007ffd94b4ea40 in  ()
-> #10 0x0000000000000000 in  ()
-
-So it looks like main thread is processing job-dismiss request and is
-holding write lock taken in block_job_remove_all_bdrv() (frame #20
-above).  At the same time iothread spawns a coroutine which performs IO
-request.  Before the coroutine is spawned, blk_aio_prwv() increases
-'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
-trying to acquire the read lock.  But main thread isn't releasing the
-lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
-Here's the deadlock.
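
The interleaving described above can be modeled with a small toy state machine. This is only a sketch under simplifying assumptions: `ToyBlockState`, `submit_read()` and `drain_would_hang()` are made-up names for illustration, not QEMU APIs, and no real threads or coroutines are involved. It merely shows why the drain cannot finish once a request has bumped the counter while the writer holds the lock.

```python
from dataclasses import dataclass


@dataclass
class ToyBlockState:
    """Toy stand-in for the state described in the report; not QEMU code."""
    writer_holds_lock: bool = False   # main thread took the graph write lock
    in_flight: int = 0                # plays the role of blk->in_flight
    readers_waiting: int = 0          # coroutines parked in the reader queue

    def graph_wrlock(self) -> None:
        self.writer_holds_lock = True

    def submit_read(self) -> None:
        # The counter is bumped before the coroutine asks for the read lock.
        self.in_flight += 1
        if self.writer_holds_lock:
            # The request cannot proceed, so in_flight never drops again.
            self.readers_waiting += 1
        else:
            self.in_flight -= 1       # request completes right away

    def drained_poll(self) -> bool:
        # True means "still busy, keep polling".
        return self.in_flight > 0

    def drain_would_hang(self) -> bool:
        # The drain waits for in_flight to reach zero, but the only request
        # that could lower it is waiting for the lock held by the drainer.
        return (self.writer_holds_lock
                and self.readers_waiting > 0
                and self.drained_poll())


def racy_interleaving() -> ToyBlockState:
    state = ToyBlockState()
    state.graph_wrlock()   # main thread: write lock taken for job cleanup
    state.submit_read()    # iothread: request arrives while the lock is held
    return state


def benign_interleaving() -> ToyBlockState:
    state = ToyBlockState()
    state.submit_read()    # request finishes before the lock is taken
    state.graph_wrlock()
    return state
```

Calling `racy_interleaving().drain_would_hang()` reports a hang in this model, while `benign_interleaving()` does not, which matches the observation that the reproducer only hangs occasionally.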
-
-Any comments and suggestions on the subject are welcomed.  Thanks!
-
-Andrey
-remove_lower_snap.sh
-Description:
-application/shellscript
-
-On 4/24/25 8:32 PM, Andrey Drobyshev wrote:
-> Hi all,
->
-> There's a bug in block layer which leads to block graph deadlock.
-> Notably, it takes place when blockdev IO is processed within a separate
-> iothread.
->
-> This was initially caught by our tests, and I was able to reduce it to a
-> relatively simple reproducer.  Such deadlocks are probably supposed to
-> be covered in iotests/graph-changes-while-io, but this deadlock isn't.
->
-> Basically what the reproducer does is launches QEMU with a drive having
-> 'iothread' option set, creates a chain of 2 snapshots, launches
-> block-commit job for a snapshot and then dismisses the job, starting
-> from the lower snapshot.  If the guest is issuing IO at the same time,
-> there's a race in acquiring block graph lock and a potential deadlock.
->
-> Here's how it can be reproduced:
->
-> [...]
-
-I took a closer look at iotests/graph-changes-while-io, and have managed
-to reproduce the same deadlock in a much simpler setup, without a guest.
-
-1. Run QSD:
-> ./build/storage-daemon/qemu-storage-daemon --object iothread,id=iothread0 \
->   --blockdev null-co,node-name=node0,read-zeroes=true \
->   --nbd-server addr.type=unix,addr.path=/var/run/qsd_nbd.sock \
->   --export nbd,id=exp0,node-name=node0,iothread=iothread0,fixed-iothread=true,writable=true \
->   --chardev socket,id=qmp-sock,path=/var/run/qsd_qmp.sock,server=on,wait=off \
->   --monitor chardev=qmp-sock
-
-2. Launch IO:
-> qemu-img bench -f raw -c 2000000 'nbd+unix:///node0?socket=/var/run/qsd_nbd.sock'
-
-3. Add 2 snapshots and remove lower one (script attached):
-> while /bin/true ; do ./rls_qsd.sh ; done
-
-And then it hangs.
-
-I'll also send a patch with corresponding test case added directly to
-iotests.
-
-This reproducer seems to hang starting from Fiona's commit
-67446e605dc ("blockjob: drop AioContext lock before calling
-bdrv_graph_wrlock()").  AioContext locks were dropped entirely later on
-in Stefan's commit b49f4755c7 ("block: remove AioContext locking"), but
-the problem remains.
-
-Andrey
-rls_qsd.sh
-Description:
-application/shellscript
-
-From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
-
-This case catches a potential deadlock which takes place when job-dismiss
-is issued while I/O requests are being processed in a separate iothread.
-
-See https://mail.gnu.org/archive/html/qemu-devel/2025-04/msg04421.html
-
-Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
----
- .../qemu-iotests/tests/graph-changes-while-io | 101 ++++++++++++++++--
- .../tests/graph-changes-while-io.out          |   4 +-
- 2 files changed, 96 insertions(+), 9 deletions(-)
-
-diff --git a/tests/qemu-iotests/tests/graph-changes-while-io 
-b/tests/qemu-iotests/tests/graph-changes-while-io
-index 194fda500e..e30f823da4 100755
---- a/tests/qemu-iotests/tests/graph-changes-while-io
-+++ b/tests/qemu-iotests/tests/graph-changes-while-io
-@@ -27,6 +27,8 @@ from iotests import imgfmt, qemu_img, qemu_img_create, 
-qemu_io, \
- 
- 
- top = os.path.join(iotests.test_dir, 'top.img')
-+snap1 = os.path.join(iotests.test_dir, 'snap1.img')
-+snap2 = os.path.join(iotests.test_dir, 'snap2.img')
- nbd_sock = os.path.join(iotests.sock_dir, 'nbd.sock')
- 
- 
-@@ -58,6 +60,15 @@ class TestGraphChangesWhileIO(QMPTestCase):
-     def tearDown(self) -> None:
-         self.qsd.stop()
- 
-+    def _wait_for_blockjob(self, status) -> None:
-+        done = False
-+        while not done:
-+            for event in self.qsd.get_qmp().get_events(wait=10.0):
-+                if event['event'] != 'JOB_STATUS_CHANGE':
-+                    continue
-+                if event['data']['status'] == status:
-+                    done = True
-+
-     def test_blockdev_add_while_io(self) -> None:
-         # Run qemu-img bench in the background
-         bench_thr = Thread(target=do_qemu_img_bench)
-@@ -116,13 +127,89 @@ class TestGraphChangesWhileIO(QMPTestCase):
-                 'device': 'job0',
-             })
- 
--            cancelled = False
--            while not cancelled:
--                for event in self.qsd.get_qmp().get_events(wait=10.0):
--                    if event['event'] != 'JOB_STATUS_CHANGE':
--                        continue
--                    if event['data']['status'] == 'null':
--                        cancelled = True
-+            self._wait_for_blockjob('null')
-+
-+        bench_thr.join()
-+
-+    def test_remove_lower_snapshot_while_io(self) -> None:
-+        # Run qemu-img bench in the background
-+        bench_thr = Thread(target=do_qemu_img_bench, args=(100000, ))
-+        bench_thr.start()
-+
-+        # While I/O is performed on 'node0' node, consequently add 2 snapshots
-+        # on top of it, then remove (commit) them starting from lower one.
-+        while bench_thr.is_alive():
-+            # Recreate snapshot images on every iteration
-+            qemu_img_create('-f', imgfmt, snap1, '1G')
-+            qemu_img_create('-f', imgfmt, snap2, '1G')
-+
-+            self.qsd.cmd('blockdev-add', {
-+                'driver': imgfmt,
-+                'node-name': 'snap1',
-+                'file': {
-+                    'driver': 'file',
-+                    'filename': snap1
-+                }
-+            })
-+
-+            self.qsd.cmd('blockdev-snapshot', {
-+                'node': 'node0',
-+                'overlay': 'snap1',
-+            })
-+
-+            self.qsd.cmd('blockdev-add', {
-+                'driver': imgfmt,
-+                'node-name': 'snap2',
-+                'file': {
-+                    'driver': 'file',
-+                    'filename': snap2
-+                }
-+            })
-+
-+            self.qsd.cmd('blockdev-snapshot', {
-+                'node': 'snap1',
-+                'overlay': 'snap2',
-+            })
-+
-+            self.qsd.cmd('block-commit', {
-+                'job-id': 'commit-snap1',
-+                'device': 'snap2',
-+                'top-node': 'snap1',
-+                'base-node': 'node0',
-+                'auto-finalize': True,
-+                'auto-dismiss': False,
-+            })
-+
-+            self._wait_for_blockjob('concluded')
-+            self.qsd.cmd('job-dismiss', {
-+                'id': 'commit-snap1',
-+            })
-+
-+            self.qsd.cmd('block-commit', {
-+                'job-id': 'commit-snap2',
-+                'device': 'snap2',
-+                'top-node': 'snap2',
-+                'base-node': 'node0',
-+                'auto-finalize': True,
-+                'auto-dismiss': False,
-+            })
-+
-+            self._wait_for_blockjob('ready')
-+            self.qsd.cmd('job-complete', {
-+                'id': 'commit-snap2',
-+            })
-+
-+            self._wait_for_blockjob('concluded')
-+            self.qsd.cmd('job-dismiss', {
-+                'id': 'commit-snap2',
-+            })
-+
-+            self.qsd.cmd('blockdev-del', {
-+                'node-name': 'snap1'
-+            })
-+            self.qsd.cmd('blockdev-del', {
-+                'node-name': 'snap2'
-+            })
- 
-         bench_thr.join()
- 
-diff --git a/tests/qemu-iotests/tests/graph-changes-while-io.out 
-b/tests/qemu-iotests/tests/graph-changes-while-io.out
-index fbc63e62f8..8d7e996700 100644
---- a/tests/qemu-iotests/tests/graph-changes-while-io.out
-+++ b/tests/qemu-iotests/tests/graph-changes-while-io.out
-@@ -1,5 +1,5 @@
--..
-+...
- ----------------------------------------------------------------------
--Ran 2 tests
-+Ran 3 tests
- 
- OK
--- 
-2.43.5
-
-Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
-> So it looks like main thread is processing job-dismiss request and is
-> holding write lock taken in block_job_remove_all_bdrv() (frame #20
-> above).  At the same time iothread spawns a coroutine which performs IO
-> request.  Before the coroutine is spawned, blk_aio_prwv() increases
-> 'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
-> trying to acquire the read lock.  But main thread isn't releasing the
-> lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
-> Here's the deadlock.
-
-And for the IO test you provided, it's client->nb_requests that behaves
-similarly to blk->in_flight here.
-
-The issue also reproduces easily when issuing the following QMP command
-in a loop while doing IO on a device:
-
-> void qmp_block_locked_drain(const char *node_name, Error **errp)
-> {
->     BlockDriverState *bs;
->
->     bs = bdrv_find_node(node_name);
->     if (!bs) {
->         error_setg(errp, "node not found");
->         return;
->     }
->
->     bdrv_graph_wrlock();
->     bdrv_drained_begin(bs);
->     bdrv_drained_end(bs);
->     bdrv_graph_wrunlock();
-> }
-
-It seems like either it would be necessary to require:
-1. not draining inside an exclusively locked section
-or
-2. making sure that variables used by drained_poll routines are only set
-while holding the reader lock
-?
-
-Those seem to require rather involved changes, so a third option might
-be to make draining inside an exclusively locked section possible, by
-embedding such locked sections in a drained section:
-
-> diff --git a/blockjob.c b/blockjob.c
-> index 32007f31a9..9b2f3b3ea9 100644
-> --- a/blockjob.c
-> +++ b/blockjob.c
-> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->       * one to make sure that such a concurrent access does not attempt
->       * to process an already freed BdrvChild.
->       */
-> +    bdrv_drain_all_begin();
->      bdrv_graph_wrlock();
->      while (job->nodes) {
->          GSList *l = job->nodes;
-> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->          g_slist_free_1(l);
->      }
->      bdrv_graph_wrunlock();
-> +    bdrv_drain_all_end();
->  }
->
->  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
-
-This seems to fix the issue at hand. I can send a patch if this is
-considered an acceptable approach.
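
The ordering that the third option relies on can be sketched with two nested context managers. This is an illustrative model only: `drained_section`, `graph_write_locked` and `remove_all_nodes` are hypothetical helper names, and the log entries merely mirror the call names from the diff above. The point is that the drained section fully encloses the write-locked section, so any drain requested inside the lock finds I/O already quiesced.

```python
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def drained_section(log: List[str]) -> Iterator[None]:
    # Outer section: quiesce I/O before anyone takes the write lock.
    log.append("drain_all_begin")
    try:
        yield
    finally:
        log.append("drain_all_end")


@contextmanager
def graph_write_locked(log: List[str]) -> Iterator[None]:
    # Inner section: exclusive access to the (toy) block graph.
    log.append("graph_wrlock")
    try:
        yield
    finally:
        log.append("graph_wrunlock")


def remove_all_nodes(nodes: List[str], log: List[str]) -> None:
    # Same nesting as the proposed change: drain first, then lock,
    # and release in the reverse order.
    with drained_section(log):
        with graph_write_locked(log):
            while nodes:
                log.append(f"unref {nodes.pop(0)}")
```

Running `remove_all_nodes(["snap1", "node0"], log)` with an empty `log` list records the drain begin before the lock and the drain end after the unlock, even if an exception is raised in between.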
-
-Best Regards,
-Fiona
-
-On 4/30/25 11:47 AM, Fiona Ebner wrote:
->
-Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
->
-> So it looks like main thread is processing job-dismiss request and is
->
-> holding write lock taken in block_job_remove_all_bdrv() (frame #20
->
-> above).  At the same time iothread spawns a coroutine which performs IO
->
-> request.  Before the coroutine is spawned, blk_aio_prwv() increases
->
-> 'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
->
-> trying to acquire the read lock.  But main thread isn't releasing the
->
-> lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
->
-> Here's the deadlock.
->
->
-And for the IO test you provided, it's client->nb_requests that behaves
->
-similarly to blk->in_flight here.
->
->
-The issue also reproduces easily when issuing the following QMP command
->
-in a loop while doing IO on a device:
->
->
-> void qmp_block_locked_drain(const char *node_name, Error **errp)
->
-> {
->
->     BlockDriverState *bs;
->
->
->
->     bs = bdrv_find_node(node_name);
->
->     if (!bs) {
->
->         error_setg(errp, "node not found");
->
->         return;
->
->     }
->
->
->
->     bdrv_graph_wrlock();
->
->     bdrv_drained_begin(bs);
->
->     bdrv_drained_end(bs);
->
->     bdrv_graph_wrunlock();
->
-> }
->
->
-It seems like either it would be necessary to require:
->
-1. not draining inside an exclusively locked section
->
-or
->
-2. making sure that variables used by drained_poll routines are only set
->
-while holding the reader lock
->
-?
->
->
-Those seem to require rather involved changes, so a third option might
->
-be to make draining inside an exclusively locked section possible, by
->
-embedding such locked sections in a drained section:
->
->
-> diff --git a/blockjob.c b/blockjob.c
->
-> index 32007f31a9..9b2f3b3ea9 100644
->
-> --- a/blockjob.c
->
-> +++ b/blockjob.c
->
-> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->
->       * one to make sure that such a concurrent access does not attempt
->
->       * to process an already freed BdrvChild.
->
->       */
->
-> +    bdrv_drain_all_begin();
->
->      bdrv_graph_wrlock();
->
->      while (job->nodes) {
->
->          GSList *l = job->nodes;
->
-> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->
->          g_slist_free_1(l);
->
->      }
->
->      bdrv_graph_wrunlock();
->
-> +    bdrv_drain_all_end();
->
->  }
->
->
->
->  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
->
->
-This seems to fix the issue at hand. I can send a patch if this is
->
-considered an acceptable approach.
->
->
-Best Regards,
->
-Fiona
->
-Hello Fiona,
-
-Thanks for looking into it.  I've tried your 3rd option above and can
-confirm it does fix the deadlock; at least I can't reproduce it.  Other
-iotests also don't seem to break.  So I personally am fine with
-that patch.  It would be nice to hear from the maintainers, though, on
-whether there are any caveats with such an approach.
-
-Andrey
-
-On Wed, Apr 30, 2025 at 10:11 AM Andrey Drobyshev
-<andrey.drobyshev@virtuozzo.com> wrote:
->
->
-On 4/30/25 11:47 AM, Fiona Ebner wrote:
->
-> Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
->
->> So it looks like main thread is processing job-dismiss request and is
->
->> holding write lock taken in block_job_remove_all_bdrv() (frame #20
->
->> above).  At the same time iothread spawns a coroutine which performs IO
->
->> request.  Before the coroutine is spawned, blk_aio_prwv() increases
->
->> 'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
->
->> trying to acquire the read lock.  But main thread isn't releasing the
->
->> lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
->
->> Here's the deadlock.
->
->
->
-> And for the IO test you provided, it's client->nb_requests that behaves
->
-> similarly to blk->in_flight here.
->
->
->
-> The issue also reproduces easily when issuing the following QMP command
->
-> in a loop while doing IO on a device:
->
->
->
->> void qmp_block_locked_drain(const char *node_name, Error **errp)
->
->> {
->
->>     BlockDriverState *bs;
->
->>
->
->>     bs = bdrv_find_node(node_name);
->
->>     if (!bs) {
->
->>         error_setg(errp, "node not found");
->
->>         return;
->
->>     }
->
->>
->
->>     bdrv_graph_wrlock();
->
->>     bdrv_drained_begin(bs);
->
->>     bdrv_drained_end(bs);
->
->>     bdrv_graph_wrunlock();
->
->> }
->
->
->
-> It seems like either it would be necessary to require:
->
-> 1. not draining inside an exclusively locked section
->
-> or
->
-> 2. making sure that variables used by drained_poll routines are only set
->
-> while holding the reader lock
->
-> ?
->
->
->
-> Those seem to require rather involved changes, so a third option might
->
-> be to make draining inside an exclusively locked section possible, by
->
-> embedding such locked sections in a drained section:
->
->
->
->> diff --git a/blockjob.c b/blockjob.c
->
->> index 32007f31a9..9b2f3b3ea9 100644
->
->> --- a/blockjob.c
->
->> +++ b/blockjob.c
->
->> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->
->>       * one to make sure that such a concurrent access does not attempt
->
->>       * to process an already freed BdrvChild.
->
->>       */
->
->> +    bdrv_drain_all_begin();
->
->>      bdrv_graph_wrlock();
->
->>      while (job->nodes) {
->
->>          GSList *l = job->nodes;
->
->> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
->
->>          g_slist_free_1(l);
->
->>      }
->
->>      bdrv_graph_wrunlock();
->
->> +    bdrv_drain_all_end();
->
->>  }
->
->>
->
->>  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
->
->
->
-> This seems to fix the issue at hand. I can send a patch if this is
->
-> considered an acceptable approach.
-Kevin is aware of this thread but it's a public holiday tomorrow so it
-may be a little longer.
-
-Stefan
-
-Am 24.04.2025 um 19:32 hat Andrey Drobyshev geschrieben:
->
-Hi all,
->
->
-There's a bug in block layer which leads to block graph deadlock.
->
-Notably, it takes place when blockdev IO is processed within a separate
->
-iothread.
->
->
-This was initially caught by our tests, and I was able to reduce it to a
->
-relatively simple reproducer.  Such deadlocks are probably supposed to
->
-be covered in iotests/graph-changes-while-io, but this deadlock isn't.
->
->
-Basically what the reproducer does is launches QEMU with a drive having
->
-'iothread' option set, creates a chain of 2 snapshots, launches
->
-block-commit job for a snapshot and then dismisses the job, starting
->
-from the lower snapshot.  If the guest is issuing IO at the same time,
->
-there's a race in acquiring block graph lock and a potential deadlock.
->
->
-Here's how it can be reproduced:
->
->
-1. Run QEMU:
->
-> SRCDIR=/path/to/srcdir
->
->
->
->
->
->
->
->
->
-> $SRCDIR/build/qemu-system-x86_64 -enable-kvm \
->
->
->
->   -machine q35 -cpu Nehalem \
->
->
->
->   -name guest=alma8-vm,debug-threads=on \
->
->
->
->   -m 2g -smp 2 \
->
->
->
->   -nographic -nodefaults \
->
->
->
->   -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
->
->
->
->   -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
->
->
->
->   -object iothread,id=iothread0 \
->
->
->
->   -blockdev
->
-> node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2
->
->  \
->
->   -device virtio-blk-pci,drive=disk,iothread=iothread0
->
->
-2. Launch IO (random reads) from within the guest:
->
-> nc -U /var/run/alma8-serial.sock
->
-> ...
->
-> [root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 --bs=4k
->
-> --size=1G --numjobs=1 --time_based=1 --runtime=300 --group_reporting
->
-> --rw=randread --iodepth=1 --filename=/testfile
->
->
-3. Run snapshots creation & removal of lower snapshot operation in a
->
-loop (script attached):
->
-> while /bin/true ; do ./remove_lower_snap.sh ; done
->
->
-And then it occasionally hangs.
->
->
-Note: I've tried bisecting this, and looks like deadlock occurs starting
->
-from the following commit:
->
->
-(BAD)  5bdbaebcce virtio: Re-enable notifications after drain
->
-(GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll
->
->
-On the latest v10.0.0 it does hang as well.
->
->
->
-Here's backtrace of the main thread:
->
->
-> #0  0x00007fc547d427ce in __ppoll (fds=0x557eb79657b0, nfds=1,
->
-> timeout=<optimized out>, sigmask=0x0) at
->
-> ../sysdeps/unix/sysv/linux/ppoll.c:43
->
-> #1  0x0000557eb47d955c in qemu_poll_ns (fds=0x557eb79657b0, nfds=1,
->
-> timeout=-1) at ../util/qemu-timer.c:329
->
-> #2  0x0000557eb47b2204 in fdmon_poll_wait (ctx=0x557eb76c5f20,
->
-> ready_list=0x7ffd94b4edd8, timeout=-1) at ../util/fdmon-poll.c:79
->
-> #3  0x0000557eb47b1c45 in aio_poll (ctx=0x557eb76c5f20, blocking=true) at
->
-> ../util/aio-posix.c:730
->
-> #4  0x0000557eb4621edd in bdrv_do_drained_begin (bs=0x557eb795e950,
->
-> parent=0x0, poll=true) at ../block/io.c:378
->
-> #5  0x0000557eb4621f7b in bdrv_drained_begin (bs=0x557eb795e950) at
->
-> ../block/io.c:391
->
-> #6  0x0000557eb45ec125 in bdrv_change_aio_context (bs=0x557eb795e950,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7682
->
-> #7  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7964250,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7608
->
-> #8  0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb79575e0,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7668
->
-> #9  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7e59110,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7608
->
-> #10 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb7e51960,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7668
->
-> #11 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb814ed80,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7608
->
-> #12 0x0000557eb45ee8e4 in child_job_change_aio_ctx (c=0x557eb7c9d3f0,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../blockjob.c:157
->
-> #13 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb7c9d3f0,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7592
->
-> #14 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb7d74310,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7661
->
-> #15 0x0000557eb45dcd7e in bdrv_child_cb_change_aio_ctx
->
->     (child=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 =
->
-> {...}, tran=0x557eb7a87160, errp=0x0) at ../block.c:1234
->
-> #16 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb8565af0,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7592
->
-> #17 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb79575e0,
->
-> ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160,
->
-> errp=0x0)
->
->     at ../block.c:7661
->
-> #18 0x0000557eb45ec1f3 in bdrv_try_change_aio_context (bs=0x557eb79575e0,
->
-> ctx=0x557eb76c5f20, ignore_child=0x0, errp=0x0) at ../block.c:7715
->
-> #19 0x0000557eb45e1b15 in bdrv_root_unref_child (child=0x557eb7966f30) at
->
-> ../block.c:3317
->
-> #20 0x0000557eb45eeaa8 in block_job_remove_all_bdrv (job=0x557eb7952800) at
->
-> ../blockjob.c:209
->
-> #21 0x0000557eb45ee641 in block_job_free (job=0x557eb7952800) at
->
-> ../blockjob.c:82
->
-> #22 0x0000557eb45f17af in job_unref_locked (job=0x557eb7952800) at
->
-> ../job.c:474
->
-> #23 0x0000557eb45f257d in job_do_dismiss_locked (job=0x557eb7952800) at
->
-> ../job.c:771
->
-> #24 0x0000557eb45f25fe in job_dismiss_locked (jobptr=0x7ffd94b4f400,
->
-> errp=0x7ffd94b4f488) at ../job.c:783
->
-> --Type <RET> for more, q to quit, c to continue without paging--
->
-> #25 0x0000557eb45d8e84 in qmp_job_dismiss (id=0x557eb7aa42b0
->
-> "commit-snap1", errp=0x7ffd94b4f488) at ../job-qmp.c:138
->
-> #26 0x0000557eb472f6a3 in qmp_marshal_job_dismiss (args=0x7fc52c00a3b0,
->
-> ret=0x7fc53c880da8, errp=0x7fc53c880da0) at qapi/qapi-commands-job.c:221
->
-> #27 0x0000557eb47a35f3 in do_qmp_dispatch_bh (opaque=0x7fc53c880e40) at
->
-> ../qapi/qmp-dispatch.c:128
->
-> #28 0x0000557eb47d1cd2 in aio_bh_call (bh=0x557eb79568f0) at
->
-> ../util/async.c:172
->
-> #29 0x0000557eb47d1df5 in aio_bh_poll (ctx=0x557eb76c0200) at
->
-> ../util/async.c:219
->
-> #30 0x0000557eb47b12f3 in aio_dispatch (ctx=0x557eb76c0200) at
->
-> ../util/aio-posix.c:436
->
-> #31 0x0000557eb47d2266 in aio_ctx_dispatch (source=0x557eb76c0200,
->
-> callback=0x0, user_data=0x0) at ../util/async.c:361
->
-> #32 0x00007fc549232f4f in g_main_dispatch (context=0x557eb76c6430) at
->
-> ../glib/gmain.c:3364
->
-> #33 g_main_context_dispatch (context=0x557eb76c6430) at ../glib/gmain.c:4079
->
-> #34 0x0000557eb47d3ab1 in glib_pollfds_poll () at ../util/main-loop.c:287
->
-> #35 0x0000557eb47d3b38 in os_host_main_loop_wait (timeout=0) at
->
-> ../util/main-loop.c:310
->
-> #36 0x0000557eb47d3c58 in main_loop_wait (nonblocking=0) at
->
-> ../util/main-loop.c:589
->
-> #37 0x0000557eb4218b01 in qemu_main_loop () at ../system/runstate.c:835
->
-> #38 0x0000557eb46df166 in qemu_default_main (opaque=0x0) at
->
-> ../system/main.c:50
->
-> #39 0x0000557eb46df215 in main (argc=24, argv=0x7ffd94b4f8d8) at
->
-> ../system/main.c:80
->
->
->
-And here's coroutine trying to acquire read lock:
->
->
-> (gdb) qemu coroutine reader_queue->entries.sqh_first
->
-> #0  0x0000557eb47d7068 in qemu_coroutine_switch (from_=0x557eb7aa48b0,
->
-> to_=0x7fc537fff508, action=COROUTINE_YIELD) at
->
-> ../util/coroutine-ucontext.c:321
->
-> #1  0x0000557eb47d4d4a in qemu_coroutine_yield () at
->
-> ../util/qemu-coroutine.c:339
->
-> #2  0x0000557eb47d56c8 in qemu_co_queue_wait_impl (queue=0x557eb59954c0
->
-> <reader_queue>, lock=0x7fc53c57de50, flags=0) at
->
-> ../util/qemu-coroutine-lock.c:60
->
-> #3  0x0000557eb461fea7 in bdrv_graph_co_rdlock () at
->
-> ../block/graph-lock.c:231
->
-> #4  0x0000557eb460c81a in graph_lockable_auto_lock (x=0x7fc53c57dee3) at
->
-> /home/root/src/qemu/master/include/block/graph-lock.h:213
->
-> #5  0x0000557eb460fa41 in blk_co_do_preadv_part
->
->     (blk=0x557eb84c0810, offset=6890553344, bytes=4096,
->
-> qiov=0x7fc530006988, qiov_offset=0, flags=BDRV_REQ_REGISTERED_BUF) at
->
-> ../block/block-backend.c:1339
->
-> #6  0x0000557eb46104d7 in blk_aio_read_entry (opaque=0x7fc530003240) at
->
-> ../block/block-backend.c:1619
->
-> #7  0x0000557eb47d6c40 in coroutine_trampoline (i0=-1213577040, i1=21886)
->
-> at ../util/coroutine-ucontext.c:175
->
-> #8  0x00007fc547c2a360 in __start_context () at
->
-> ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
->
-> #9  0x00007ffd94b4ea40 in  ()
->
-> #10 0x0000000000000000 in  ()
->
->
->
-So it looks like main thread is processing job-dismiss request and is
->
-holding write lock taken in block_job_remove_all_bdrv() (frame #20
->
-above).  At the same time iothread spawns a coroutine which performs IO
->
-request.  Before the coroutine is spawned, blk_aio_prwv() increases
->
-'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
->
-trying to acquire the read lock.  But main thread isn't releasing the
->
-lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
->
-Here's the deadlock.
->
->
-Any comments and suggestions on the subject are welcomed.  Thanks!
-I think this is what the blk_wait_while_drained() call was supposed to
-address in blk_co_do_preadv_part(). However, with the use of multiple
-I/O threads, this is racy.
-
-Do you think that in your case we hit the small race window between the
-checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
-another reason why blk_wait_while_drained() didn't do its job?
-
-Kevin
-
-On 5/2/25 19:34, Kevin Wolf wrote:
-Am 24.04.2025 um 19:32 hat Andrey Drobyshev geschrieben:
-Hi all,
-
-There's a bug in block layer which leads to block graph deadlock.
-Notably, it takes place when blockdev IO is processed within a separate
-iothread.
-
-This was initially caught by our tests, and I was able to reduce it to a
-relatively simple reproducer.  Such deadlocks are probably supposed to
-be covered in iotests/graph-changes-while-io, but this deadlock isn't.
-
-Basically what the reproducer does is launches QEMU with a drive having
-'iothread' option set, creates a chain of 2 snapshots, launches
-block-commit job for a snapshot and then dismisses the job, starting
-from the lower snapshot.  If the guest is issuing IO at the same time,
-there's a race in acquiring block graph lock and a potential deadlock.
-
-Here's how it can be reproduced:
-
-1. Run QEMU:
-SRCDIR=/path/to/srcdir
-$SRCDIR/build/qemu-system-x86_64 -enable-kvm \
--machine q35 -cpu Nehalem \
-   -name guest=alma8-vm,debug-threads=on \
-   -m 2g -smp 2 \
-   -nographic -nodefaults \
-   -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
-   -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
-   -object iothread,id=iothread0 \
-   -blockdev 
-node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2
- \
-   -device virtio-blk-pci,drive=disk,iothread=iothread0
-2. Launch IO (random reads) from within the guest:
-nc -U /var/run/alma8-serial.sock
-...
-[root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 --bs=4k 
---size=1G --numjobs=1 --time_based=1 --runtime=300 --group_reporting 
---rw=randread --iodepth=1 --filename=/testfile
-3. Run snapshots creation & removal of lower snapshot operation in a
-loop (script attached):
-while /bin/true ; do ./remove_lower_snap.sh ; done
-And then it occasionally hangs.
-
-Note: I've tried bisecting this, and looks like deadlock occurs starting
-from the following commit:
-
-(BAD)  5bdbaebcce virtio: Re-enable notifications after drain
-(GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll
-
-On the latest v10.0.0 it does hang as well.
-
-
-Here's backtrace of the main thread:
-#0  0x00007fc547d427ce in __ppoll (fds=0x557eb79657b0, nfds=1, timeout=<optimized 
-out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:43
-#1  0x0000557eb47d955c in qemu_poll_ns (fds=0x557eb79657b0, nfds=1, timeout=-1) 
-at ../util/qemu-timer.c:329
-#2  0x0000557eb47b2204 in fdmon_poll_wait (ctx=0x557eb76c5f20, 
-ready_list=0x7ffd94b4edd8, timeout=-1) at ../util/fdmon-poll.c:79
-#3  0x0000557eb47b1c45 in aio_poll (ctx=0x557eb76c5f20, blocking=true) at 
-../util/aio-posix.c:730
-#4  0x0000557eb4621edd in bdrv_do_drained_begin (bs=0x557eb795e950, parent=0x0, 
-poll=true) at ../block/io.c:378
-#5  0x0000557eb4621f7b in bdrv_drained_begin (bs=0x557eb795e950) at 
-../block/io.c:391
-#6  0x0000557eb45ec125 in bdrv_change_aio_context (bs=0x557eb795e950, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7682
-#7  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7964250, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7608
-#8  0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb79575e0, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7668
-#9  0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb7e59110, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7608
-#10 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb7e51960, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7668
-#11 0x0000557eb45ebf2b in bdrv_child_change_aio_context (c=0x557eb814ed80, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7608
-#12 0x0000557eb45ee8e4 in child_job_change_aio_ctx (c=0x557eb7c9d3f0, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../blockjob.c:157
-#13 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb7c9d3f0, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7592
-#14 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb7d74310, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7661
-#15 0x0000557eb45dcd7e in bdrv_child_cb_change_aio_ctx
-     (child=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, 
-tran=0x557eb7a87160, errp=0x0) at ../block.c:1234
-#16 0x0000557eb45ebe2d in bdrv_parent_change_aio_context (c=0x557eb8565af0, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7592
-#17 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb79575e0, 
-ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...}, tran=0x557eb7a87160, 
-errp=0x0)
-     at ../block.c:7661
-#18 0x0000557eb45ec1f3 in bdrv_try_change_aio_context (bs=0x557eb79575e0, 
-ctx=0x557eb76c5f20, ignore_child=0x0, errp=0x0) at ../block.c:7715
-#19 0x0000557eb45e1b15 in bdrv_root_unref_child (child=0x557eb7966f30) at 
-../block.c:3317
-#20 0x0000557eb45eeaa8 in block_job_remove_all_bdrv (job=0x557eb7952800) at 
-../blockjob.c:209
-#21 0x0000557eb45ee641 in block_job_free (job=0x557eb7952800) at 
-../blockjob.c:82
-#22 0x0000557eb45f17af in job_unref_locked (job=0x557eb7952800) at ../job.c:474
-#23 0x0000557eb45f257d in job_do_dismiss_locked (job=0x557eb7952800) at 
-../job.c:771
-#24 0x0000557eb45f25fe in job_dismiss_locked (jobptr=0x7ffd94b4f400, 
-errp=0x7ffd94b4f488) at ../job.c:783
---Type <RET> for more, q to quit, c to continue without paging--
-#25 0x0000557eb45d8e84 in qmp_job_dismiss (id=0x557eb7aa42b0 "commit-snap1", 
-errp=0x7ffd94b4f488) at ../job-qmp.c:138
-#26 0x0000557eb472f6a3 in qmp_marshal_job_dismiss (args=0x7fc52c00a3b0, 
-ret=0x7fc53c880da8, errp=0x7fc53c880da0) at qapi/qapi-commands-job.c:221
-#27 0x0000557eb47a35f3 in do_qmp_dispatch_bh (opaque=0x7fc53c880e40) at 
-../qapi/qmp-dispatch.c:128
-#28 0x0000557eb47d1cd2 in aio_bh_call (bh=0x557eb79568f0) at ../util/async.c:172
-#29 0x0000557eb47d1df5 in aio_bh_poll (ctx=0x557eb76c0200) at 
-../util/async.c:219
-#30 0x0000557eb47b12f3 in aio_dispatch (ctx=0x557eb76c0200) at 
-../util/aio-posix.c:436
-#31 0x0000557eb47d2266 in aio_ctx_dispatch (source=0x557eb76c0200, 
-callback=0x0, user_data=0x0) at ../util/async.c:361
-#32 0x00007fc549232f4f in g_main_dispatch (context=0x557eb76c6430) at 
-../glib/gmain.c:3364
-#33 g_main_context_dispatch (context=0x557eb76c6430) at ../glib/gmain.c:4079
-#34 0x0000557eb47d3ab1 in glib_pollfds_poll () at ../util/main-loop.c:287
-#35 0x0000557eb47d3b38 in os_host_main_loop_wait (timeout=0) at 
-../util/main-loop.c:310
-#36 0x0000557eb47d3c58 in main_loop_wait (nonblocking=0) at 
-../util/main-loop.c:589
-#37 0x0000557eb4218b01 in qemu_main_loop () at ../system/runstate.c:835
-#38 0x0000557eb46df166 in qemu_default_main (opaque=0x0) at ../system/main.c:50
-#39 0x0000557eb46df215 in main (argc=24, argv=0x7ffd94b4f8d8) at 
-../system/main.c:80
-And here's coroutine trying to acquire read lock:
-(gdb) qemu coroutine reader_queue->entries.sqh_first
-#0  0x0000557eb47d7068 in qemu_coroutine_switch (from_=0x557eb7aa48b0, 
-to_=0x7fc537fff508, action=COROUTINE_YIELD) at ../util/coroutine-ucontext.c:321
-#1  0x0000557eb47d4d4a in qemu_coroutine_yield () at 
-../util/qemu-coroutine.c:339
-#2  0x0000557eb47d56c8 in qemu_co_queue_wait_impl (queue=0x557eb59954c0 
-<reader_queue>, lock=0x7fc53c57de50, flags=0) at 
-../util/qemu-coroutine-lock.c:60
-#3  0x0000557eb461fea7 in bdrv_graph_co_rdlock () at ../block/graph-lock.c:231
-#4  0x0000557eb460c81a in graph_lockable_auto_lock (x=0x7fc53c57dee3) at 
-/home/root/src/qemu/master/include/block/graph-lock.h:213
-#5  0x0000557eb460fa41 in blk_co_do_preadv_part
-     (blk=0x557eb84c0810, offset=6890553344, bytes=4096, qiov=0x7fc530006988, 
-qiov_offset=0, flags=BDRV_REQ_REGISTERED_BUF) at ../block/block-backend.c:1339
-#6  0x0000557eb46104d7 in blk_aio_read_entry (opaque=0x7fc530003240) at 
-../block/block-backend.c:1619
-#7  0x0000557eb47d6c40 in coroutine_trampoline (i0=-1213577040, i1=21886) at 
-../util/coroutine-ucontext.c:175
-#8  0x00007fc547c2a360 in __start_context () at 
-../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
-#9  0x00007ffd94b4ea40 in  ()
-#10 0x0000000000000000 in  ()
-So it looks like main thread is processing job-dismiss request and is
-holding write lock taken in block_job_remove_all_bdrv() (frame #20
-above).  At the same time iothread spawns a coroutine which performs IO
-request.  Before the coroutine is spawned, blk_aio_prwv() increases
-'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
-trying to acquire the read lock.  But main thread isn't releasing the
-lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
-Here's the deadlock.
-
-Any comments and suggestions on the subject are welcomed.  Thanks!
-I think this is what the blk_wait_while_drained() call was supposed to
-address in blk_co_do_preadv_part(). However, with the use of multiple
-I/O threads, this is racy.
-
-Do you think that in your case we hit the small race window between the
-checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
-another reason why blk_wait_while_drained() didn't do its job?
-
-Kevin
-In my opinion the race window is very big. The main thread has
-taken the graph write lock. After that, another coroutine stalls
-within GRAPH_RDLOCK_GUARD(), as there is no drain at that moment, and only
-after that does the main thread start the drain. That is why Fiona's idea
-looks like it works. Though this would mean that normally we should always
-do that at the moment when we acquire the write lock. Maybe even inside
-this function. Den
-
-Am 02.05.2025 um 19:52 hat Denis V. Lunev geschrieben:
->
-On 5/2/25 19:34, Kevin Wolf wrote:
->
-> Am 24.04.2025 um 19:32 hat Andrey Drobyshev geschrieben:
->
-> > Hi all,
->
-> >
->
-> > There's a bug in block layer which leads to block graph deadlock.
->
-> > Notably, it takes place when blockdev IO is processed within a separate
->
-> > iothread.
->
-> >
->
-> > This was initially caught by our tests, and I was able to reduce it to a
->
-> > relatively simple reproducer.  Such deadlocks are probably supposed to
->
-> > be covered in iotests/graph-changes-while-io, but this deadlock isn't.
->
-> >
->
-> > Basically what the reproducer does is launches QEMU with a drive having
->
-> > 'iothread' option set, creates a chain of 2 snapshots, launches
->
-> > block-commit job for a snapshot and then dismisses the job, starting
->
-> > from the lower snapshot.  If the guest is issuing IO at the same time,
->
-> > there's a race in acquiring block graph lock and a potential deadlock.
->
-> >
->
-> > Here's how it can be reproduced:
->
-> >
->
-> > 1. Run QEMU:
->
-> > > SRCDIR=/path/to/srcdir
->
-> > > $SRCDIR/build/qemu-system-x86_64 -enable-kvm \
->
-> > >    -machine q35 -cpu Nehalem \
->
-> > >    -name guest=alma8-vm,debug-threads=on \
->
-> > >    -m 2g -smp 2 \
->
-> > >    -nographic -nodefaults \
->
-> > >    -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
->
-> > >    -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
->
-> > >    -object iothread,id=iothread0 \
->
-> > >    -blockdev
->
-> > > node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2
->
-> > >  \
->
-> > >    -device virtio-blk-pci,drive=disk,iothread=iothread0
->
-> > 2. Launch IO (random reads) from within the guest:
->
-> > > nc -U /var/run/alma8-serial.sock
->
-> > > ...
->
-> > > [root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1
->
-> > > --bs=4k --size=1G --numjobs=1 --time_based=1 --runtime=300
->
-> > > --group_reporting --rw=randread --iodepth=1 --filename=/testfile
->
-> > 3. Run snapshots creation & removal of lower snapshot operation in a
->
-> > loop (script attached):
->
-> > > while /bin/true ; do ./remove_lower_snap.sh ; done
->
-> > And then it occasionally hangs.
->
-> >
->
-> > Note: I've tried bisecting this, and looks like deadlock occurs starting
->
-> > from the following commit:
->
-> >
->
-> > (BAD)  5bdbaebcce virtio: Re-enable notifications after drain
->
-> > (GOOD) c42c3833e0 virtio-scsi: Attach event vq notifier with no_poll
->
-> >
->
-> > On the latest v10.0.0 it does hang as well.
->
-> >
->
-> >
->
-> > Here's backtrace of the main thread:
->
-> >
->
-> > > #0  0x00007fc547d427ce in __ppoll (fds=0x557eb79657b0, nfds=1,
->
-> > > timeout=<optimized out>, sigmask=0x0) at
->
-> > > ../sysdeps/unix/sysv/linux/ppoll.c:43
->
-> > > #1  0x0000557eb47d955c in qemu_poll_ns (fds=0x557eb79657b0, nfds=1,
->
-> > > timeout=-1) at ../util/qemu-timer.c:329
->
-> > > #2  0x0000557eb47b2204 in fdmon_poll_wait (ctx=0x557eb76c5f20,
->
-> > > ready_list=0x7ffd94b4edd8, timeout=-1) at ../util/fdmon-poll.c:79
->
-> > > #3  0x0000557eb47b1c45 in aio_poll (ctx=0x557eb76c5f20, blocking=true)
->
-> > > at ../util/aio-posix.c:730
->
-> > > #4  0x0000557eb4621edd in bdrv_do_drained_begin (bs=0x557eb795e950,
->
-> > > parent=0x0, poll=true) at ../block/io.c:378
->
-> > > #5  0x0000557eb4621f7b in bdrv_drained_begin (bs=0x557eb795e950) at
->
-> > > ../block/io.c:391
->
-> > > #6  0x0000557eb45ec125 in bdrv_change_aio_context (bs=0x557eb795e950,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7682
->
-> > > #7  0x0000557eb45ebf2b in bdrv_child_change_aio_context
->
-> > > (c=0x557eb7964250, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7608
->
-> > > #8  0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb79575e0,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7668
->
-> > > #9  0x0000557eb45ebf2b in bdrv_child_change_aio_context
->
-> > > (c=0x557eb7e59110, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7608
->
-> > > #10 0x0000557eb45ec0c4 in bdrv_change_aio_context (bs=0x557eb7e51960,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7668
->
-> > > #11 0x0000557eb45ebf2b in bdrv_child_change_aio_context
->
-> > > (c=0x557eb814ed80, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7608
->
-> > > #12 0x0000557eb45ee8e4 in child_job_change_aio_ctx (c=0x557eb7c9d3f0,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../blockjob.c:157
->
-> > > #13 0x0000557eb45ebe2d in bdrv_parent_change_aio_context
->
-> > > (c=0x557eb7c9d3f0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7592
->
-> > > #14 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb7d74310,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7661
->
-> > > #15 0x0000557eb45dcd7e in bdrv_child_cb_change_aio_ctx
->
-> > >      (child=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60
->
-> > > = {...}, tran=0x557eb7a87160, errp=0x0) at ../block.c:1234
->
-> > > #16 0x0000557eb45ebe2d in bdrv_parent_change_aio_context
->
-> > > (c=0x557eb8565af0, ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7592
->
-> > > #17 0x0000557eb45ec06b in bdrv_change_aio_context (bs=0x557eb79575e0,
->
-> > > ctx=0x557eb76c5f20, visited=0x557eb7e06b60 = {...},
->
-> > > tran=0x557eb7a87160, errp=0x0)
->
-> > >      at ../block.c:7661
->
-> > > #18 0x0000557eb45ec1f3 in bdrv_try_change_aio_context
->
-> > > (bs=0x557eb79575e0, ctx=0x557eb76c5f20, ignore_child=0x0, errp=0x0) at
->
-> > > ../block.c:7715
->
-> > > #19 0x0000557eb45e1b15 in bdrv_root_unref_child (child=0x557eb7966f30)
->
-> > > at ../block.c:3317
->
-> > > #20 0x0000557eb45eeaa8 in block_job_remove_all_bdrv
->
-> > > (job=0x557eb7952800) at ../blockjob.c:209
->
-> > > #21 0x0000557eb45ee641 in block_job_free (job=0x557eb7952800) at
->
-> > > ../blockjob.c:82
->
-> > > #22 0x0000557eb45f17af in job_unref_locked (job=0x557eb7952800) at
->
-> > > ../job.c:474
->
-> > > #23 0x0000557eb45f257d in job_do_dismiss_locked (job=0x557eb7952800) at
->
-> > > ../job.c:771
->
-> > > #24 0x0000557eb45f25fe in job_dismiss_locked (jobptr=0x7ffd94b4f400,
->
-> > > errp=0x7ffd94b4f488) at ../job.c:783
->
-> > > --Type <RET> for more, q to quit, c to continue without paging--
->
-> > > #25 0x0000557eb45d8e84 in qmp_job_dismiss (id=0x557eb7aa42b0
->
-> > > "commit-snap1", errp=0x7ffd94b4f488) at ../job-qmp.c:138
->
-> > > #26 0x0000557eb472f6a3 in qmp_marshal_job_dismiss (args=0x7fc52c00a3b0,
->
-> > > ret=0x7fc53c880da8, errp=0x7fc53c880da0) at qapi/qapi-commands-job.c:221
->
-> > > #27 0x0000557eb47a35f3 in do_qmp_dispatch_bh (opaque=0x7fc53c880e40) at
->
-> > > ../qapi/qmp-dispatch.c:128
->
-> > > #28 0x0000557eb47d1cd2 in aio_bh_call (bh=0x557eb79568f0) at
->
-> > > ../util/async.c:172
->
-> > > #29 0x0000557eb47d1df5 in aio_bh_poll (ctx=0x557eb76c0200) at
->
-> > > ../util/async.c:219
->
-> > > #30 0x0000557eb47b12f3 in aio_dispatch (ctx=0x557eb76c0200) at
->
-> > > ../util/aio-posix.c:436
->
-> > > #31 0x0000557eb47d2266 in aio_ctx_dispatch (source=0x557eb76c0200,
->
-> > > callback=0x0, user_data=0x0) at ../util/async.c:361
->
-> > > #32 0x00007fc549232f4f in g_main_dispatch (context=0x557eb76c6430) at
->
-> > > ../glib/gmain.c:3364
->
-> > > #33 g_main_context_dispatch (context=0x557eb76c6430) at
->
-> > > ../glib/gmain.c:4079
->
-> > > #34 0x0000557eb47d3ab1 in glib_pollfds_poll () at
->
-> > > ../util/main-loop.c:287
->
-> > > #35 0x0000557eb47d3b38 in os_host_main_loop_wait (timeout=0) at
->
-> > > ../util/main-loop.c:310
->
-> > > #36 0x0000557eb47d3c58 in main_loop_wait (nonblocking=0) at
->
-> > > ../util/main-loop.c:589
->
-> > > #37 0x0000557eb4218b01 in qemu_main_loop () at ../system/runstate.c:835
->
-> > > #38 0x0000557eb46df166 in qemu_default_main (opaque=0x0) at
->
-> > > ../system/main.c:50
->
-> > > #39 0x0000557eb46df215 in main (argc=24, argv=0x7ffd94b4f8d8) at
->
-> > > ../system/main.c:80
->
-> >
->
-> > And here's coroutine trying to acquire read lock:
->
-> >
->
-> > > (gdb) qemu coroutine reader_queue->entries.sqh_first
->
-> > > #0  0x0000557eb47d7068 in qemu_coroutine_switch (from_=0x557eb7aa48b0,
->
-> > > to_=0x7fc537fff508, action=COROUTINE_YIELD) at
->
-> > > ../util/coroutine-ucontext.c:321
->
-> > > #1  0x0000557eb47d4d4a in qemu_coroutine_yield () at
->
-> > > ../util/qemu-coroutine.c:339
->
-> > > #2  0x0000557eb47d56c8 in qemu_co_queue_wait_impl (queue=0x557eb59954c0
->
-> > > <reader_queue>, lock=0x7fc53c57de50, flags=0) at
->
-> > > ../util/qemu-coroutine-lock.c:60
->
-> > > #3  0x0000557eb461fea7 in bdrv_graph_co_rdlock () at
->
-> > > ../block/graph-lock.c:231
->
-> > > #4  0x0000557eb460c81a in graph_lockable_auto_lock (x=0x7fc53c57dee3)
->
-> > > at /home/root/src/qemu/master/include/block/graph-lock.h:213
->
-> > > #5  0x0000557eb460fa41 in blk_co_do_preadv_part
->
-> > >      (blk=0x557eb84c0810, offset=6890553344, bytes=4096,
->
-> > > qiov=0x7fc530006988, qiov_offset=0, flags=BDRV_REQ_REGISTERED_BUF) at
->
-> > > ../block/block-backend.c:1339
->
-> > > #6  0x0000557eb46104d7 in blk_aio_read_entry (opaque=0x7fc530003240) at
->
-> > > ../block/block-backend.c:1619
->
-> > > #7  0x0000557eb47d6c40 in coroutine_trampoline (i0=-1213577040,
->
-> > > i1=21886) at ../util/coroutine-ucontext.c:175
->
-> > > #8  0x00007fc547c2a360 in __start_context () at
->
-> > > ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
->
-> > > #9  0x00007ffd94b4ea40 in  ()
->
-> > > #10 0x0000000000000000 in  ()
->
-> >
->
-> > So it looks like main thread is processing job-dismiss request and is
->
-> > holding write lock taken in block_job_remove_all_bdrv() (frame #20
->
-> > above).  At the same time iothread spawns a coroutine which performs IO
->
-> > request.  Before the coroutine is spawned, blk_aio_prwv() increases
->
-> > 'in_flight' counter for Blk.  Then blk_co_do_preadv_part() (frame #5) is
->
-> > trying to acquire the read lock.  But main thread isn't releasing the
->
-> > lock as blk_root_drained_poll() returns true since blk->in_flight > 0.
->
-> > Here's the deadlock.
->
-> >
->
-> > Any comments and suggestions on the subject are welcomed.  Thanks!
->
-> I think this is what the blk_wait_while_drained() call was supposed to
->
-> address in blk_co_do_preadv_part(). However, with the use of multiple
->
-> I/O threads, this is racy.
->
->
->
-> Do you think that in your case we hit the small race window between the
->
-> checks in blk_wait_while_drained() and GRAPH_RDLOCK_GUARD()? Or is there
->
-> another reason why blk_wait_while_drained() didn't do its job?
->
->
->
-In my opinion the race window is very big. The main thread has
->
-taken the graph write lock. After that, another coroutine stalls
->
-within GRAPH_RDLOCK_GUARD(), as there is no drain at that moment, and only
->
-after that does the main thread start the drain.
-You're right, I confused taking the write lock with draining there.
-
->
-That is why Fiona's idea looks like it works. Though this would mean
->
-that normally we should always do that at the moment when we acquire
->
-the write lock. Maybe even inside this function.
-I actually see now that not all of my graph locking patches were merged.
-At least I did have the thought that bdrv_drained_begin() must be marked
-GRAPH_UNLOCKED because it polls. That means that calling it from inside
-bdrv_try_change_aio_context() is actually forbidden (and that's the part
-I didn't see back then because it doesn't have TSA annotations).
-
-If you refactor the code to move the drain out to before the lock is
-taken, I think you end up with Fiona's patch, except you'll remove the
-forbidden inner drain and add more annotations for some functions and
-clarify the rules around them. I don't know, but I wouldn't be surprised
-if along the process we find other bugs, too.
-
-So Fiona's drain looks right to me, but we should probably approach it
-more systematically.
-
-Kevin
-
diff --git a/results/classifier/016/virtual/24930826 b/results/classifier/016/virtual/24930826
deleted file mode 100644
index 23479fd3..00000000
--- a/results/classifier/016/virtual/24930826
+++ /dev/null
@@ -1,60 +0,0 @@
-virtual: 0.989
-hypervisor: 0.884
-debug: 0.787
-user-level: 0.571
-device: 0.253
-operating system: 0.212
-x86: 0.071
-TCG: 0.045
-network: 0.044
-files: 0.037
-peripherals: 0.036
-register: 0.020
-KVM: 0.018
-PID: 0.016
-socket: 0.012
-i386: 0.007
-VMM: 0.006
-kernel: 0.004
-semantic: 0.004
-performance: 0.003
-architecture: 0.003
-assembly: 0.002
-alpha: 0.002
-permissions: 0.002
-vnc: 0.002
-boot: 0.001
-graphic: 0.001
-risc-v: 0.001
-ppc: 0.001
-arm: 0.001
-mistranslation: 0.000
-
-[Qemu-devel] [BUG] vhost-user: hot-unplug vhost-user nic for windows guest OS will fail with 100% reproduce rate
-
-Hi, guys
-
-I ran into a problem when hot-unplugging a vhost-user NIC from a Windows 2008
-rc2 sp1 64 guest OS.
-
-The XML of the NIC is as follows:
-<interface type='vhostuser'>
-  <mac address='52:54:00:3b:83:aa'/>
-  <source type='unix' path='/var/run/vhost-user/port1' mode='client'/>
-  <target dev='port1'/>
-  <model type='virtio'/>
-  <driver queues='4'/>
-  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
-</interface>
-
-First, I use virsh attach-device win2008 vif.xml to hot-plug a NIC into the guest
-OS. This operation succeeds.
-After the guest OS discovers the NIC successfully, I use virsh detach-device win2008
-vif.xml to hot-unplug it. This operation fails with a 100% reproduction rate.
-
-However, if I hot-plug and hot-unplug a virtio-net NIC, it does not fail.
-
-I have analyzed the qmp_device_del path and found that QEMU injects an
-interrupt into ACPI so that it notifies the guest OS to remove the NIC.
-I guess something goes wrong in Windows when it handles the interrupt.
-
diff --git a/results/classifier/016/virtual/25892827 b/results/classifier/016/virtual/25892827
deleted file mode 100644
index 79d1a51a..00000000
--- a/results/classifier/016/virtual/25892827
+++ /dev/null
@@ -1,1104 +0,0 @@
-virtual: 0.955
-KVM: 0.881
-x86: 0.715
-debug: 0.617
-hypervisor: 0.546
-PID: 0.132
-register: 0.102
-performance: 0.058
-files: 0.056
-operating system: 0.051
-kernel: 0.049
-boot: 0.025
-assembly: 0.017
-device: 0.014
-socket: 0.013
-semantic: 0.009
-user-level: 0.008
-TCG: 0.008
-risc-v: 0.007
-architecture: 0.006
-ppc: 0.005
-VMM: 0.005
-permissions: 0.004
-vnc: 0.004
-network: 0.004
-peripherals: 0.003
-arm: 0.003
-i386: 0.003
-alpha: 0.002
-graphic: 0.002
-mistranslation: 0.000
-
-[Qemu-devel] [BUG/RFC] Two cpus are not brought up normally in SLES11 sp3 VM after reboot
-
-Hi,
-
-Recently we encountered a problem in our project: 2 CPUs in a VM are not brought
-up normally after reboot.
-
-Our host is using KVM kmod 3.6 and QEMU 2.1.
-The SLES 11 sp3 VM is configured with 8 vcpus,
-and the cpu model is configured as 'host-passthrough'.
-
-After the VM started up for the first time, everything seemed to be OK.
-Then the VM panicked and rebooted.
-After the reboot, only 6 cpus were brought up in the VM; cpu1 and cpu7 were not online.
-
-This is the only message we can get from the VM.
-The VM dmesg shows:
-[    0.069867] Booting Node   0, Processors  #1
-[    5.060042] CPU1: Stuck ??
-[    5.060499]  #2
-[    5.088322] kvm-clock: cpu 2, msr 6:3fc90901, secondary cpu clock
-[    5.088335] KVM setup async PF for cpu 2
-[    5.092967] NMI watchdog enabled, takes one hw-pmu counter.
-[    5.094405]  #3
-[    5.108324] kvm-clock: cpu 3, msr 6:3fcd0901, secondary cpu clock
-[    5.108333] KVM setup async PF for cpu 3
-[    5.113553] NMI watchdog enabled, takes one hw-pmu counter.
-[    5.114970]  #4
-[    5.128325] kvm-clock: cpu 4, msr 6:3fd10901, secondary cpu clock
-[    5.128336] KVM setup async PF for cpu 4
-[    5.134576] NMI watchdog enabled, takes one hw-pmu counter.
-[    5.135998]  #5
-[    5.152324] kvm-clock: cpu 5, msr 6:3fd50901, secondary cpu clock
-[    5.152334] KVM setup async PF for cpu 5
-[    5.154764] NMI watchdog enabled, takes one hw-pmu counter.
-[    5.156467]  #6
-[    5.172327] kvm-clock: cpu 6, msr 6:3fd90901, secondary cpu clock
-[    5.172341] KVM setup async PF for cpu 6
-[    5.180738] NMI watchdog enabled, takes one hw-pmu counter.
-[    5.182173]  #7 Ok.
-[   10.170815] CPU7: Stuck ??
-[   10.171648] Brought up 6 CPUs
-[   10.172394] Total of 6 processors activated (28799.97 BogoMIPS).
-
-On the host, we found that the QEMU vcpu1 and vcpu7 threads were not consuming 
-any cpu (they should be in an idle state).
-All of the VCPUs' stacks on the host look like the one below:
-
-[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
-[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
-[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
-[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
-[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
-[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
-[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
-[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
-[<ffffffffffffffff>] 0xffffffffffffffff
-
-We looked into the kernel code that could lead to the above 'Stuck' warning,
-and found that the only possibility is that the emulation of the 'cpuid' instruction in 
-kvm/qemu has something wrong.
-But since we can’t reproduce this problem, we are not quite sure.
-Is it possible that the cpuid emulation in kvm/qemu has a bug?
-
-Has anyone come across this problem before? Or any ideas?
-
-Thanks,
-zhanghailiang
-
-On 06/07/2015 09:54, zhanghailiang wrote:
->
->
-From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
->
-consuming any cpu (Should be in idle state),
->
-All of VCPUs' stacks in host is like bellow:
->
->
-[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
->
-[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
->
-[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
->
-[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
->
-[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
->
-[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
->
-[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
->
-[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
->
-[<ffffffffffffffff>] 0xffffffffffffffff
->
->
-We looked into the kernel codes that could leading to the above 'Stuck'
->
-warning,
->
-and found that the only possible is the emulation of 'cpuid' instruct in
->
-kvm/qemu has something wrong.
->
-But since we can’t reproduce this problem, we are not quite sure.
->
-Is there any possible that the cupid emulation in kvm/qemu has some bug ?
-Can you explain the relationship to the cpuid emulation?  What do the
-traces say about vcpus 1 and 7?
-
-Paolo
-
-On 2015/7/6 16:45, Paolo Bonzini wrote:
-On 06/07/2015 09:54, zhanghailiang wrote:
-From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
-consuming any cpu (Should be in idle state),
-All of VCPUs' stacks in host is like bellow:
-
-[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
-[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
-[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
-[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
-[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
-[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
-[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
-[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
-[<ffffffffffffffff>] 0xffffffffffffffff
-
-We looked into the kernel codes that could leading to the above 'Stuck'
-warning,
-and found that the only possible is the emulation of 'cpuid' instruct in
-kvm/qemu has something wrong.
-But since we can’t reproduce this problem, we are not quite sure.
-Is there any possible that the cupid emulation in kvm/qemu has some bug ?
-Can you explain the relationship to the cpuid emulation?  What do the
-traces say about vcpus 1 and 7?
-OK, we searched the VM's kernel code for the 'Stuck' message, and it is 
-located in
-do_boot_cpu(). It runs in BSP context; the call path is:
-BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() 
--> wakeup_secondary_cpu_via_init() to trigger the APs.
-It waits 5s for the APs to start up; if an AP does not start up normally, it 
-prints 'CPU%d Stuck' or 'CPU%d: Not responding'.
-
-If it prints 'Stuck', it means the AP has received the SIPI interrupt and 
-begun to execute the code at
-'ENTRY(trampoline_data)' (trampoline_64.S), but got stuck somewhere before 
-smp_callin() (smpboot.c).
-The following is the startup process of the BSP and the APs.
-BSP:
-start_kernel()
-  ->smp_init()
-     ->smp_boot_cpus()
-       ->do_boot_cpu()
-           ->start_ip = trampoline_address(); //set the address that AP will go 
-to execute
-           ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
-           ->for (timeout = 0; timeout < 50000; timeout++)
-               if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if AP 
-startup or not
-
-APs:
-ENTRY(trampoline_data) (trampoline_64.S)
-      ->ENTRY(secondary_startup_64) (head_64.S)
-         ->start_secondary() (smpboot.c)
-            ->cpu_init();
-            ->smp_callin();
-                ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if the AP gets 
-here, the BSP will not print the error message.
-
-From the above call path, we can be sure that the AP got stuck between 
-trampoline_data and the cpumask_set_cpu() call in
-smp_callin(). We looked through this code path carefully, and found only one 
-'hlt' instruction that could block the process.
-It is located in trampoline_data():
-
-ENTRY(trampoline_data)
-        ...
-
-        call    verify_cpu              # Verify the cpu supports long mode
-        testl   %eax, %eax              # Check for return code
-        jnz     no_longmode
-
-        ...
-
-no_longmode:
-        hlt
-        jmp no_longmode
-
-In verify_cpu(),
-the only sensitive instruction we can find that could cause a VM exit from 
-non-root mode is 'cpuid'.
-This is why we suspect that wrong cpuid emulation in KVM/QEMU leads to 
-the failure in verify_cpu.
-
-From the messages in the VM, we know something is wrong with vcpu1 and vcpu7.
-[    5.060042] CPU1: Stuck ??
-[   10.170815] CPU7: Stuck ??
-[   10.171648] Brought up 6 CPUs
-
-Besides, the following is the cpus info obtained from the host.
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command 
-instance-0000000
-* CPU #0: pc=0x00007f64160c683d thread_id=68570
-  CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
-  CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
-  CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
-  CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
-  CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
-  CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
-  CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
-
-Oh, I also forgot to mention in the above message that we have bound each vCPU 
-to a different physical CPU on the
-host.
-
-Thanks,
-zhanghailiang
-
-On 06/07/2015 11:59, zhanghailiang wrote:
->
->
->
-Besides, the follow is the cpus message got from host.
->
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh
->
-qemu-monitor-command instance-0000000
->
-* CPU #0: pc=0x00007f64160c683d thread_id=68570
->
-CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
->
-CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
->
-CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
->
-CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
->
-CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
->
-CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
->
-CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
->
->
-Oh, i also forgot to mention in the above message that, we have bond
->
-each vCPU to different physical CPU in
->
-host.
-Can you capture a trace on the host (trace-cmd record -e kvm) and send
-it privately?  Please note which CPUs get stuck, since I guess it's not
-always 1 and 7.
-
-Paolo
-
-On Mon, 6 Jul 2015 17:59:10 +0800
-zhanghailiang <address@hidden> wrote:
-
->
-On 2015/7/6 16:45, Paolo Bonzini wrote:
->
->
->
->
->
-> On 06/07/2015 09:54, zhanghailiang wrote:
->
->>
->
->>  From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
->
->> consuming any cpu (Should be in idle state),
->
->> All of VCPUs' stacks in host is like bellow:
->
->>
->
->> [<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
->
->> [<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
->
->> [<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
->
->> [<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
->
->> [<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
->
->> [<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
->
->> [<ffffffff81468092>] system_call_fastpath+0x16/0x1b
->
->> [<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
->
->> [<ffffffffffffffff>] 0xffffffffffffffff
->
->>
->
->> We looked into the kernel codes that could leading to the above 'Stuck'
->
->> warning,
-In current upstream there isn't any printk(...Stuck...) left since that code 
-path
-has been reworked.
-I've often seen this on an over-committed host during guest CPU up/down torture 
-tests.
-Could you update the guest kernel to upstream and see if the issue reproduces?
-
->
->> and found that the only possible is the emulation of 'cpuid' instruct in
->
->> kvm/qemu has something wrong.
->
->> But since we can’t reproduce this problem, we are not quite sure.
->
->> Is there any possible that the cupid emulation in kvm/qemu has some bug ?
->
->
->
-> Can you explain the relationship to the cpuid emulation?  What do the
->
-> traces say about vcpus 1 and 7?
->
->
-OK, we searched the VM's kernel codes with the 'Stuck' message, and  it is
->
-located in
->
-do_boot_cpu(). It's in BSP context, the call process is:
->
-BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu()
->
--> wakeup_secondary_via_INIT() to trigger APs.
->
-It will wait 5s for APs to startup, if some AP not startup normally, it will
->
-print 'CPU%d Stuck' or 'CPU%d: Not responding'.
->
->
-If it prints 'Stuck', it means the AP has received the SIPI interrupt and
->
-begins to execute the code
->
-'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places
->
-before smp_callin()(smpboot.c).
->
-The follow is the starup process of BSP and AP.
->
-BSP:
->
-start_kernel()
->
-->smp_init()
->
-->smp_boot_cpus()
->
-->do_boot_cpu()
->
-->start_ip = trampoline_address(); //set the address that AP will
->
-go to execute
->
-->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
->
-->for (timeout = 0; timeout < 50000; timeout++)
->
-if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if
->
-AP startup or not
->
->
-APs:
->
-ENTRY(trampoline_data) (trampoline_64.S)
->
-->ENTRY(secondary_startup_64) (head_64.S)
->
-->start_secondary() (smpboot.c)
->
-->cpu_init();
->
-->smp_callin();
->
-->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP
->
-comes here, the BSP will not prints the error message.
->
->
-From above call process, we can be sure that, the AP has been stuck between
->
-trampoline_data and the cpumask_set_cpu() in
->
-smp_callin(), we look through these codes path carefully, and only found a
->
-'hlt' instruct that could block the process.
->
-It is located in trampoline_data():
->
->
-ENTRY(trampoline_data)
->
-...
->
->
-call    verify_cpu              # Verify the cpu supports long mode
->
-testl   %eax, %eax              # Check for return code
->
-jnz     no_longmode
->
->
-...
->
->
-no_longmode:
->
-hlt
->
-jmp no_longmode
->
->
-For the process verify_cpu(),
->
-we can only find the 'cpuid' sensitive instruct that could lead VM exit from
->
-No-root mode.
->
-This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to
->
-the fail in verify_cpu.
->
->
-From the message in VM, we know vcpu1 and vcpu7 is something wrong.
->
-[    5.060042] CPU1: Stuck ??
->
-[   10.170815] CPU7: Stuck ??
->
-[   10.171648] Brought up 6 CPUs
->
->
-Besides, the follow is the cpus message got from host.
->
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh
->
-qemu-monitor-command instance-0000000
->
-* CPU #0: pc=0x00007f64160c683d thread_id=68570
->
-CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
->
-CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
->
-CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
->
-CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
->
-CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
->
-CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
->
-CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
->
->
-Oh, i also forgot to mention in the above message that, we have bond each
->
-vCPU to different physical CPU in
->
-host.
->
->
-Thanks,
->
-zhanghailiang
->
->
->
->
->
---
->
-To unsubscribe from this list: send the line "unsubscribe kvm" in
->
-the body of a message to address@hidden
->
-More majordomo info at
-http://vger.kernel.org/majordomo-info.html
-
-On 2015/7/7 19:23, Igor Mammedov wrote:
-On Mon, 6 Jul 2015 17:59:10 +0800
-zhanghailiang <address@hidden> wrote:
-On 2015/7/6 16:45, Paolo Bonzini wrote:
-On 06/07/2015 09:54, zhanghailiang wrote:
-From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
-consuming any cpu (Should be in idle state),
-All of VCPUs' stacks in host is like bellow:
-
-[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
-[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
-[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
-[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
-[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
-[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
-[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
-[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
-[<ffffffffffffffff>] 0xffffffffffffffff
-
-We looked into the kernel codes that could leading to the above 'Stuck'
-warning,
-in current upstream there isn't any printk(...Stuck...) left since that code 
-path
-has been reworked.
-I've often seen this on over-committed host during guest CPUs up/down torture 
-test.
-Could you update guest kernel to upstream and see if issue reproduces?
-Hmm, unfortunately, it is very hard to reproduce, and we are still trying to 
-reproduce it.
-
-For your test case, is it a kernel bug?
-Or has any related patch that could solve your test problem been merged
-upstream?
-
-Thanks,
-zhanghailiang
-and found that the only possible is the emulation of 'cpuid' instruct in
-kvm/qemu has something wrong.
-But since we can’t reproduce this problem, we are not quite sure.
-Is there any possible that the cupid emulation in kvm/qemu has some bug ?
-Can you explain the relationship to the cpuid emulation?  What do the
-traces say about vcpus 1 and 7?
-OK, we searched the VM's kernel codes with the 'Stuck' message, and  it is 
-located in
-do_boot_cpu(). It's in BSP context, the call process is:
-BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() 
--> wakeup_secondary_via_INIT() to trigger APs.
-It will wait 5s for APs to startup, if some AP not startup normally, it will 
-print 'CPU%d Stuck' or 'CPU%d: Not responding'.
-
-If it prints 'Stuck', it means the AP has received the SIPI interrupt and 
-begins to execute the code
-'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places before 
-smp_callin()(smpboot.c).
-The follow is the starup process of BSP and AP.
-BSP:
-start_kernel()
-    ->smp_init()
-       ->smp_boot_cpus()
-         ->do_boot_cpu()
-             ->start_ip = trampoline_address(); //set the address that AP will 
-go to execute
-             ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
-             ->for (timeout = 0; timeout < 50000; timeout++)
-                 if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if 
-AP startup or not
-
-APs:
-ENTRY(trampoline_data) (trampoline_64.S)
-        ->ENTRY(secondary_startup_64) (head_64.S)
-           ->start_secondary() (smpboot.c)
-              ->cpu_init();
-              ->smp_callin();
-                  ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP 
-comes here, the BSP will not prints the error message.
-
-  From above call process, we can be sure that, the AP has been stuck between 
-trampoline_data and the cpumask_set_cpu() in
-smp_callin(), we look through these codes path carefully, and only found a 
-'hlt' instruct that could block the process.
-It is located in trampoline_data():
-
-ENTRY(trampoline_data)
-          ...
-
-        call    verify_cpu              # Verify the cpu supports long mode
-        testl   %eax, %eax              # Check for return code
-        jnz     no_longmode
-
-          ...
-
-no_longmode:
-        hlt
-        jmp no_longmode
-
-For the process verify_cpu(),
-we can only find the 'cpuid' sensitive instruct that could lead VM exit from 
-No-root mode.
-This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to 
-the fail in verify_cpu.
-
-  From the message in VM, we know vcpu1 and vcpu7 is something wrong.
-[    5.060042] CPU1: Stuck ??
-[   10.170815] CPU7: Stuck ??
-[   10.171648] Brought up 6 CPUs
-
-Besides, the follow is the cpus message got from host.
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command 
-instance-0000000
-* CPU #0: pc=0x00007f64160c683d thread_id=68570
-    CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
-    CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
-    CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
-    CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
-    CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
-    CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
-    CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
-
-Oh, i also forgot to mention in the above message that, we have bond each vCPU 
-to different physical CPU in
-host.
-
-Thanks,
-zhanghailiang
-
-
-
-
---
-To unsubscribe from this list: send the line "unsubscribe kvm" in
-the body of a message to address@hidden
-More majordomo info at
-http://vger.kernel.org/majordomo-info.html
-.
-
-On Tue, 7 Jul 2015 19:43:35 +0800
-zhanghailiang <address@hidden> wrote:
-
->
-On 2015/7/7 19:23, Igor Mammedov wrote:
->
-> On Mon, 6 Jul 2015 17:59:10 +0800
->
-> zhanghailiang <address@hidden> wrote:
->
->
->
->> On 2015/7/6 16:45, Paolo Bonzini wrote:
->
->>>
->
->>>
->
->>> On 06/07/2015 09:54, zhanghailiang wrote:
->
->>>>
->
->>>>   From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
->
->>>> consuming any cpu (Should be in idle state),
->
->>>> All of VCPUs' stacks in host is like bellow:
->
->>>>
->
->>>> [<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
->
->>>> [<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
->
->>>> [<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
->
->>>> [<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
->
->>>> [<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
->
->>>> [<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
->
->>>> [<ffffffff81468092>] system_call_fastpath+0x16/0x1b
->
->>>> [<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
->
->>>> [<ffffffffffffffff>] 0xffffffffffffffff
->
->>>>
->
->>>> We looked into the kernel codes that could leading to the above 'Stuck'
->
->>>> warning,
->
-> in current upstream there isn't any printk(...Stuck...) left since that
->
-> code path
->
-> has been reworked.
->
-> I've often seen this on over-committed host during guest CPUs up/down
->
-> torture test.
->
-> Could you update guest kernel to upstream and see if issue reproduces?
->
->
->
->
-Hmm, Unfortunately, it is very hard to reproduce, and we are still trying to
->
-reproduce it.
->
->
-For your test case, is it a kernel bug?
->
-Or is there any related patch could solve your test problem been merged into
->
-upstream ?
-I don't remember all prerequisite patches but you should be able to find
-http://marc.info/?l=linux-kernel&m=140326703108009&w=2
-"x86/smpboot: Initialize secondary CPU only if master CPU will wait for it"
-and then look for dependencies.
-
-
->
->
-Thanks,
->
-zhanghailiang
->
->
->>>> and found that the only possible is the emulation of 'cpuid' instruct in
->
->>>> kvm/qemu has something wrong.
->
->>>> But since we can’t reproduce this problem, we are not quite sure.
->
->>>> Is there any possible that the cupid emulation in kvm/qemu has some bug ?
->
->>>
->
->>> Can you explain the relationship to the cpuid emulation?  What do the
->
->>> traces say about vcpus 1 and 7?
->
->>
->
->> OK, we searched the VM's kernel codes with the 'Stuck' message, and  it is
->
->> located in
->
->> do_boot_cpu(). It's in BSP context, the call process is:
->
->> BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() ->
->
->> do_boot_cpu() -> wakeup_secondary_via_INIT() to trigger APs.
->
->> It will wait 5s for APs to startup, if some AP not startup normally, it
->
->> will print 'CPU%d Stuck' or 'CPU%d: Not responding'.
->
->>
->
->> If it prints 'Stuck', it means the AP has received the SIPI interrupt and
->
->> begins to execute the code
->
->> 'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places
->
->> before smp_callin()(smpboot.c).
->
->> The follow is the starup process of BSP and AP.
->
->> BSP:
->
->> start_kernel()
->
->>     ->smp_init()
->
->>        ->smp_boot_cpus()
->
->>          ->do_boot_cpu()
->
->>              ->start_ip = trampoline_address(); //set the address that AP
->
->> will go to execute
->
->>              ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
->
->>              ->for (timeout = 0; timeout < 50000; timeout++)
->
->>                  if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;//
->
->> check if AP startup or not
->
->>
->
->> APs:
->
->> ENTRY(trampoline_data) (trampoline_64.S)
->
->>         ->ENTRY(secondary_startup_64) (head_64.S)
->
->>            ->start_secondary() (smpboot.c)
->
->>               ->cpu_init();
->
->>               ->smp_callin();
->
->>                   ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP
->
->> comes here, the BSP will not prints the error message.
->
->>
->
->>   From above call process, we can be sure that, the AP has been stuck
->
->> between trampoline_data and the cpumask_set_cpu() in
->
->> smp_callin(), we look through these codes path carefully, and only found a
->
->> 'hlt' instruct that could block the process.
->
->> It is located in trampoline_data():
->
->>
->
->> ENTRY(trampoline_data)
->
->>           ...
->
->>
->
->>    call    verify_cpu              # Verify the cpu supports long mode
->
->>    testl   %eax, %eax              # Check for return code
->
->>    jnz     no_longmode
->
->>
->
->>           ...
->
->>
->
->> no_longmode:
->
->>    hlt
->
->>    jmp no_longmode
->
->>
->
->> For the process verify_cpu(),
->
->> we can only find the 'cpuid' sensitive instruct that could lead VM exit
->
->> from No-root mode.
->
->> This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading
->
->> to the fail in verify_cpu.
->
->>
->
->>   From the message in VM, we know vcpu1 and vcpu7 is something wrong.
->
->> [    5.060042] CPU1: Stuck ??
->
->> [   10.170815] CPU7: Stuck ??
->
->> [   10.171648] Brought up 6 CPUs
->
->>
->
->> Besides, the follow is the cpus message got from host.
->
->> 80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh
->
->> qemu-monitor-command instance-0000000
->
->> * CPU #0: pc=0x00007f64160c683d thread_id=68570
->
->>     CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
->
->>     CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
->
->>     CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
->
->>     CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
->
->>     CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
->
->>     CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
->
->>     CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
->
->>
->
->> Oh, i also forgot to mention in the above message that, we have bond each
->
->> vCPU to different physical CPU in
->
->> host.
->
->>
->
->> Thanks,
->
->> zhanghailiang
->
->>
->
->>
->
->>
->
->>
->
->> --
->
->> To unsubscribe from this list: send the line "unsubscribe kvm" in
->
->> the body of a message to address@hidden
->
->> More majordomo info at
-http://vger.kernel.org/majordomo-info.html
->
->
->
->
->
-> .
->
->
->
->
->
-
-On 2015/7/7 20:21, Igor Mammedov wrote:
-On Tue, 7 Jul 2015 19:43:35 +0800
-zhanghailiang <address@hidden> wrote:
-On 2015/7/7 19:23, Igor Mammedov wrote:
-On Mon, 6 Jul 2015 17:59:10 +0800
-zhanghailiang <address@hidden> wrote:
-On 2015/7/6 16:45, Paolo Bonzini wrote:
-On 06/07/2015 09:54, zhanghailiang wrote:
-From host, we found that QEMU vcpu1 thread and vcpu7 thread were not
-consuming any cpu (Should be in idle state),
-All of VCPUs' stacks in host is like bellow:
-
-[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
-[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
-[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
-[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
-[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
-[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
-[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
-[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
-[<ffffffffffffffff>] 0xffffffffffffffff
-
-We looked into the kernel codes that could leading to the above 'Stuck'
-warning,
-in current upstream there isn't any printk(...Stuck...) left since that code 
-path
-has been reworked.
-I've often seen this on over-committed host during guest CPUs up/down torture 
-test.
-Could you update guest kernel to upstream and see if issue reproduces?
-Hmm, Unfortunately, it is very hard to reproduce, and we are still trying to 
-reproduce it.
-
-For your test case, is it a kernel bug?
-Or is there any related patch could solve your test problem been merged into
-upstream ?
-I don't remember all prerequisite patches but you should be able to find
-http://marc.info/?l=linux-kernel&m=140326703108009&w=2
-"x86/smpboot: Initialize secondary CPU only if master CPU will wait for it"
-and then look for dependencies.
-Er, we have investigated this patch, and it is not related to our problem, :)
-
-Thanks.
-Thanks,
-zhanghailiang
-and found that the only possible is the emulation of 'cpuid' instruct in
-kvm/qemu has something wrong.
-But since we can’t reproduce this problem, we are not quite sure.
-Is there any possible that the cupid emulation in kvm/qemu has some bug ?
-Can you explain the relationship to the cpuid emulation?  What do the
-traces say about vcpus 1 and 7?
-OK, we searched the VM's kernel codes with the 'Stuck' message, and  it is 
-located in
-do_boot_cpu(). It's in BSP context, the call process is:
-BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() 
--> wakeup_secondary_via_INIT() to trigger APs.
-It will wait 5s for APs to startup, if some AP not startup normally, it will 
-print 'CPU%d Stuck' or 'CPU%d: Not responding'.
-
-If it prints 'Stuck', it means the AP has received the SIPI interrupt and 
-begins to execute the code
-'ENTRY(trampoline_data)' (trampoline_64.S) , but be stuck in some places before 
-smp_callin()(smpboot.c).
-The follow is the starup process of BSP and AP.
-BSP:
-start_kernel()
-     ->smp_init()
-        ->smp_boot_cpus()
-          ->do_boot_cpu()
-              ->start_ip = trampoline_address(); //set the address that AP will 
-go to execute
-              ->wakeup_secondary_cpu_via_init(); // kick the secondary CPU
-              ->for (timeout = 0; timeout < 50000; timeout++)
-                  if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;// check if 
-AP startup or not
-
-APs:
-ENTRY(trampoline_data) (trampoline_64.S)
-         ->ENTRY(secondary_startup_64) (head_64.S)
-            ->start_secondary() (smpboot.c)
-               ->cpu_init();
-               ->smp_callin();
-                   ->cpumask_set_cpu(cpuid, cpu_callin_mask); ->Note: if AP 
-comes here, the BSP will not prints the error message.
-
-   From above call process, we can be sure that, the AP has been stuck between 
-trampoline_data and the cpumask_set_cpu() in
-smp_callin(), we look through these codes path carefully, and only found a 
-'hlt' instruct that could block the process.
-It is located in trampoline_data():
-
-ENTRY(trampoline_data)
-           ...
-
-        call    verify_cpu              # Verify the cpu supports long mode
-        testl   %eax, %eax              # Check for return code
-        jnz     no_longmode
-
-           ...
-
-no_longmode:
-        hlt
-        jmp no_longmode
-
-For the process verify_cpu(),
-we can only find the 'cpuid' sensitive instruct that could lead VM exit from 
-No-root mode.
-This is why we doubt if cpuid emulation is wrong in KVM/QEMU that leading to 
-the fail in verify_cpu.
-
-   From the message in VM, we know vcpu1 and vcpu7 is something wrong.
-[    5.060042] CPU1: Stuck ??
-[   10.170815] CPU7: Stuck ??
-[   10.171648] Brought up 6 CPUs
-
-Besides, the follow is the cpus message got from host.
-80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command 
-instance-0000000
-* CPU #0: pc=0x00007f64160c683d thread_id=68570
-     CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
-     CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
-     CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
-     CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
-     CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
-     CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
-     CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
-
-Oh, i also forgot to mention in the above message that, we have bond each vCPU 
-to different physical CPU in
-host.
-
-Thanks,
-zhanghailiang
-
-
-
-
---
-To unsubscribe from this list: send the line "unsubscribe kvm" in
-the body of a message to address@hidden
-More majordomo info at
-http://vger.kernel.org/majordomo-info.html
-.
-.
-
diff --git a/results/classifier/016/virtual/35170175 b/results/classifier/016/virtual/35170175
deleted file mode 100644
index 7ea10dab..00000000
--- a/results/classifier/016/virtual/35170175
+++ /dev/null
@@ -1,548 +0,0 @@
-virtual: 0.801
-debug: 0.796
-x86: 0.144
-files: 0.086
-operating system: 0.076
-PID: 0.072
-TCG: 0.033
-register: 0.023
-kernel: 0.020
-assembly: 0.019
-i386: 0.018
-ppc: 0.013
-hypervisor: 0.013
-user-level: 0.008
-performance: 0.008
-semantic: 0.007
-device: 0.005
-architecture: 0.003
-arm: 0.003
-network: 0.003
-boot: 0.003
-VMM: 0.002
-graphic: 0.002
-permissions: 0.002
-peripherals: 0.002
-KVM: 0.002
-alpha: 0.001
-socket: 0.001
-risc-v: 0.001
-vnc: 0.001
-mistranslation: 0.001
-
-[Qemu-devel] [BUG] QEMU crashes with dpdk virtio pmd
-
-QEMU crashes under the following precondition:
-the VM XML config enables multiqueue, and the VM's virtio-net driver supports 
-multiqueue.
-
-Steps to reproduce:
-i. start dpdk testpmd in the VM with the virtio nic
-ii. stop testpmd
-iii. reboot the VM
-
-This happens with commit "f9d6dbf0  remove virtio queues if the guest doesn't support 
-multiqueue" applied.
-
-Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a)
-VM DPDK version:  DPDK-1.6.1
-
-Call Trace:
-#0  0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6
-#1  0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6
-#2  0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6
-#3  0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6
-#4  0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0
-#5  0x00007f6088fea32c in iter_remove_or_steal () from 
-/usr/lib64/libglib-2.0.so.0
-#6  0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at 
-qom/object.c:410
-#7  object_finalize (data=0x7f6091e74800) at qom/object.c:467
-#8  object_unref (address@hidden) at qom/object.c:903
-#9  0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at 
-git/qemu/exec.c:1154
-#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163
-#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514
-#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6
-
-Call Trace:
-#0  0x00007fdccaeb9790 in ?? ()
-#1  0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at 
-qom/object.c:405
-#2  object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467
-#3  object_unref (address@hidden) at qom/object.c:903
-#4  0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at 
-git/qemu/exec.c:1154
-#5  phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163
-#6  address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514
-#7  0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#8  0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#9  0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6
-
-The q->tx_bh will free in virtio_net_del_queue() function, when remove virtio 
-queues 
-if the guest doesn't support multiqueue. But it might be still referenced by 
-others (eg . virtio_net_set_status()),
-which need so set NULL.
-
-diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
-index 7d091c9..98bd683 100644
---- a/hw/net/virtio-net.c
-+++ b/hw/net/virtio-net.c
-@@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n, int index)
-     if (q->tx_timer) {
-         timer_del(q->tx_timer);
-         timer_free(q->tx_timer);
-+        q->tx_timer = NULL;
-     } else {
-         qemu_bh_delete(q->tx_bh);
-+        q->tx_bh = NULL;
-     }
-+    q->tx_waiting = 0;
-     virtio_del_queue(vdev, index * 2 + 1);
- }
-
-From: wangyunjian 
-Sent: Monday, April 24, 2017 6:10 PM
-To: address@hidden; Michael S. Tsirkin <address@hidden>; 'Jason Wang' 
-<address@hidden>
-Cc: wangyunjian <address@hidden>; caihe <address@hidden>
-Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd 
-
-Qemu crashes, with pre-condition:
-vm xml config with multiqueue, and the vm's driver virtio-net support 
-multi-queue
-
-reproduce steps:
-i. start dpdk testpmd in VM with the virtio nic
-ii. stop testpmd
-iii. reboot the VM
-
-This commit "f9d6dbf0  remove virtio queues if the guest doesn't support 
-multiqueue" is introduced.
-
-Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a)
-VM DPDK version:  DPDK-1.6.1
-
-Call Trace:
-#0  0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6
-#1  0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6
-#2  0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6
-#3  0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6
-#4  0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0
-#5  0x00007f6088fea32c in iter_remove_or_steal () from 
-/usr/lib64/libglib-2.0.so.0
-#6  0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at 
-qom/object.c:410
-#7  object_finalize (data=0x7f6091e74800) at qom/object.c:467
-#8  object_unref (address@hidden) at qom/object.c:903
-#9  0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at 
-git/qemu/exec.c:1154
-#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163
-#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514
-#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6
-
-Call Trace:
-#0  0x00007fdccaeb9790 in ?? ()
-#1  0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at 
-qom/object.c:405
-#2  object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467
-#3  object_unref (address@hidden) at qom/object.c:903
-#4  0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at 
-git/qemu/exec.c:1154
-#5  phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163
-#6  address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514
-#7  0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#8  0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#9  0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6
-
-On 2017年04月25日 19:37, wangyunjian wrote:
-The q->tx_bh will free in virtio_net_del_queue() function, when remove virtio 
-queues
-if the guest doesn't support multiqueue. But it might be still referenced by 
-others (eg . virtio_net_set_status()),
-which need so set NULL.
-
-diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
-index 7d091c9..98bd683 100644
---- a/hw/net/virtio-net.c
-+++ b/hw/net/virtio-net.c
-@@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n, int index)
-      if (q->tx_timer) {
-          timer_del(q->tx_timer);
-          timer_free(q->tx_timer);
-+        q->tx_timer = NULL;
-      } else {
-          qemu_bh_delete(q->tx_bh);
-+        q->tx_bh = NULL;
-      }
-+    q->tx_waiting = 0;
-      virtio_del_queue(vdev, index * 2 + 1);
-  }
-Thanks a lot for the fix.
-
-Two questions:
-- If virtio_net_set_status() is the only function that may access tx_bh,
-it looks like setting tx_waiting to zero is sufficient?
-- Can you post a formal patch for this?
-
-Thanks
-From: wangyunjian
-Sent: Monday, April 24, 2017 6:10 PM
-To: address@hidden; Michael S. Tsirkin <address@hidden>; 'Jason Wang' 
-<address@hidden>
-Cc: wangyunjian <address@hidden>; caihe <address@hidden>
-Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd
-
-Qemu crashes, with pre-condition:
-vm xml config with multiqueue, and the vm's driver virtio-net support 
-multi-queue
-
-reproduce steps:
-i. start dpdk testpmd in VM with the virtio nic
-ii. stop testpmd
-iii. reboot the VM
-
-This commit "f9d6dbf0  remove virtio queues if the guest doesn't support 
-multiqueue" is introduced.
-
-Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a)
-VM DPDK version:  DPDK-1.6.1
-
-Call Trace:
-#0  0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6
-#1  0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6
-#2  0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6
-#3  0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6
-#4  0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0
-#5  0x00007f6088fea32c in iter_remove_or_steal () from 
-/usr/lib64/libglib-2.0.so.0
-#6  0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800) at 
-qom/object.c:410
-#7  object_finalize (data=0x7f6091e74800) at qom/object.c:467
-#8  object_unref (address@hidden) at qom/object.c:903
-#9  0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at 
-git/qemu/exec.c:1154
-#10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163
-#11 address_space_dispatch_free (d=0x7f6090b72b90) at git/qemu/exec.c:2514
-#12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6
-
-Call Trace:
-#0  0x00007fdccaeb9790 in ?? ()
-#1  0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at 
-qom/object.c:405
-#2  object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467
-#3  object_unref (address@hidden) at qom/object.c:903
-#4  0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at 
-git/qemu/exec.c:1154
-#5  phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163
-#6  address_space_dispatch_free (d=0x7fdcdc86a9e0) at git/qemu/exec.c:2514
-#7  0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at 
-util/rcu.c:272
-#8  0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0
-#9  0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6
-
-CCing Paolo and Stefan, since this is related to bh handling in QEMU.
-
->
------Original Message-----
->
-From: Jason Wang [
-mailto:address@hidden
->
->
->
-On 2017年04月25日 19:37, wangyunjian wrote:
->
-> The q->tx_bh will free in virtio_net_del_queue() function, when remove
->
-> virtio
->
-queues
->
-> if the guest doesn't support multiqueue. But it might be still referenced by
->
-others (eg . virtio_net_set_status()),
->
-> which need so set NULL.
->
->
->
-> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
->
-> index 7d091c9..98bd683 100644
->
-> --- a/hw/net/virtio-net.c
->
-> +++ b/hw/net/virtio-net.c
->
-> @@ -1522,9 +1522,12 @@ static void virtio_net_del_queue(VirtIONet *n,
->
-int index)
->
->       if (q->tx_timer) {
->
->           timer_del(q->tx_timer);
->
->           timer_free(q->tx_timer);
->
-> +        q->tx_timer = NULL;
->
->       } else {
->
->           qemu_bh_delete(q->tx_bh);
->
-> +        q->tx_bh = NULL;
->
->       }
->
-> +    q->tx_waiting = 0;
->
->       virtio_del_queue(vdev, index * 2 + 1);
->
->   }
->
->
-Thanks a lot for the fix.
->
->
-Two questions:
->
->
-- If virtio_net_set_status() is the only function that may access tx_bh,
->
-it looks like setting tx_waiting to zero is sufficient?
-Currently yes, but we can't guarantee that it works in all scenarios, so
-we set tx_bh and tx_timer to NULL to avoid possibly accessing a dangling pointer,
-which is the common practice for bh usage in QEMU.
-
-I have another question about the root cause of this issue.
-
-The trace below shows the path that sets tx_waiting to one in 
-virtio_net_handle_tx_bh():
-
-Breakpoint 1, virtio_net_handle_tx_bh (vdev=0x0, vq=0x7f335ad13900) at 
-/data/wyj/git/qemu/hw/net/virtio-net.c:1398
-1398    {
-(gdb) bt
-#0  virtio_net_handle_tx_bh (vdev=0x0, vq=0x7f335ad13900) at 
-/data/wyj/git/qemu/hw/net/virtio-net.c:1398
-#1  0x00007f3357bddf9c in virtio_bus_set_host_notifier (bus=<optimized out>, 
-address@hidden, address@hidden) at hw/virtio/virtio-bus.c:297
-#2  0x00007f3357a0055d in vhost_dev_disable_notifiers (address@hidden, 
-address@hidden) at /data/wyj/git/qemu/hw/virtio/vhost.c:1422
-#3  0x00007f33579e3373 in vhost_net_stop_one (net=0x7f335ad84dc0, 
-dev=0x7f335c6f5f90) at /data/wyj/git/qemu/hw/net/vhost_net.c:289
-#4  0x00007f33579e385b in vhost_net_stop (address@hidden, ncs=<optimized out>, 
-address@hidden) at /data/wyj/git/qemu/hw/net/vhost_net.c:367
-#5  0x00007f33579e15de in virtio_net_vhost_status (status=<optimized out>, 
-n=0x7f335c6f5f90) at /data/wyj/git/qemu/hw/net/virtio-net.c:176
-#6  virtio_net_set_status (vdev=0x7f335c6f5f90, status=0 '\000') at 
-/data/wyj/git/qemu/hw/net/virtio-net.c:250
-#7  0x00007f33579f8dc6 in virtio_set_status (address@hidden, address@hidden 
-'\000') at /data/wyj/git/qemu/hw/virtio/virtio.c:1146
-#8  0x00007f3357bdd3cc in virtio_ioport_write (val=0, addr=18, 
-opaque=0x7f335c6eda80) at hw/virtio/virtio-pci.c:387
-#9  virtio_pci_config_write (opaque=0x7f335c6eda80, addr=18, val=0, 
-size=<optimized out>) at hw/virtio/virtio-pci.c:511
-#10 0x00007f33579b2155 in memory_region_write_accessor (mr=0x7f335c6ee470, 
-addr=18, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized 
-out>, attrs=...) at /data/wyj/git/qemu/memory.c:526
-#11 0x00007f33579af2e9 in access_with_adjusted_size (address@hidden, 
-address@hidden, address@hidden, access_size_min=<optimized out>, 
-access_size_max=<optimized out>, address@hidden
-    0x7f33579b20f0 <memory_region_write_accessor>, address@hidden, 
-address@hidden) at /data/wyj/git/qemu/memory.c:592
-#12 0x00007f33579b2e15 in memory_region_dispatch_write (address@hidden, 
-address@hidden, data=0, address@hidden, address@hidden) at 
-/data/wyj/git/qemu/memory.c:1319
-#13 0x00007f335796cd93 in address_space_write_continue (mr=0x7f335c6ee470, l=1, 
-addr1=18, len=1, buf=0x7f335773d000 "", attrs=..., addr=49170, 
-as=0x7f3358317060 <address_space_io>) at /data/wyj/git/qemu/exec.c:2834
-#14 address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., 
-buf=<optimized out>, len=<optimized out>) at /data/wyj/git/qemu/exec.c:2879
-#15 0x00007f335796d3ad in address_space_rw (as=<optimized out>, address@hidden, 
-attrs=..., address@hidden, buf=<optimized out>, address@hidden, address@hidden) 
-at /data/wyj/git/qemu/exec.c:2981
-#16 0x00007f33579ae226 in kvm_handle_io (count=1, size=1, direction=<optimized 
-out>, data=<optimized out>, attrs=..., port=49170) at 
-/data/wyj/git/qemu/kvm-all.c:1803
-#17 kvm_cpu_exec (address@hidden) at /data/wyj/git/qemu/kvm-all.c:2032
-#18 0x00007f335799b632 in qemu_kvm_cpu_thread_fn (arg=0x7f335ae82070) at 
-/data/wyj/git/qemu/cpus.c:1118
-#19 0x00007f3352983dc5 in start_thread () from /usr/lib64/libpthread.so.0
-#20 0x00007f335113571d in clone () from /usr/lib64/libc.so.6
-
-It calls qemu_bh_schedule(q->tx_bh) at the bottom of virtio_net_handle_tx_bh(),
-but I don't know why virtio_net_tx_bh() is not invoked afterwards, which leaves 
-q->tx_waiting non-zero.
-[PS: we added logs in virtio_net_tx_bh() to verify that.]
-
-Some other information: 
-
-It won't crash if we don't use vhost-net.
-
-
-Thanks,
--Gonglei
-
->
-- Can you post a formal patch for this?
->
->
-Thanks
->
->
-> From: wangyunjian
->
-> Sent: Monday, April 24, 2017 6:10 PM
->
-> To: address@hidden; Michael S. Tsirkin <address@hidden>; 'Jason
->
-Wang' <address@hidden>
->
-> Cc: wangyunjian <address@hidden>; caihe <address@hidden>
->
-> Subject: [Qemu-devel][BUG] QEMU crashes with dpdk virtio pmd
->
->
->
-> Qemu crashes, with pre-condition:
->
-> vm xml config with multiqueue, and the vm's driver virtio-net support
->
-multi-queue
->
->
->
-> reproduce steps:
->
-> i. start dpdk testpmd in VM with the virtio nic
->
-> ii. stop testpmd
->
-> iii. reboot the VM
->
->
->
-> This commit "f9d6dbf0  remove virtio queues if the guest doesn't support
->
-multiqueue" is introduced.
->
->
->
-> Qemu version: QEMU emulator version 2.9.50 (v2.9.0-137-g32c7e0a)
->
-> VM DPDK version:  DPDK-1.6.1
->
->
->
-> Call Trace:
->
-> #0  0x00007f60881fe5d7 in raise () from /usr/lib64/libc.so.6
->
-> #1  0x00007f60881ffcc8 in abort () from /usr/lib64/libc.so.6
->
-> #2  0x00007f608823e2f7 in __libc_message () from /usr/lib64/libc.so.6
->
-> #3  0x00007f60882456d3 in _int_free () from /usr/lib64/libc.so.6
->
-> #4  0x00007f608900158f in g_free () from /usr/lib64/libglib-2.0.so.0
->
-> #5  0x00007f6088fea32c in iter_remove_or_steal () from
->
-/usr/lib64/libglib-2.0.so.0
->
-> #6  0x00007f608edc0986 in object_property_del_all (obj=0x7f6091e74800)
->
-at qom/object.c:410
->
-> #7  object_finalize (data=0x7f6091e74800) at qom/object.c:467
->
-> #8  object_unref (address@hidden) at qom/object.c:903
->
-> #9  0x00007f608eaf1fd3 in phys_section_destroy (mr=0x7f6091e74800) at
->
-git/qemu/exec.c:1154
->
-> #10 phys_sections_free (map=0x7f6090b72bb0) at git/qemu/exec.c:1163
->
-> #11 address_space_dispatch_free (d=0x7f6090b72b90) at
->
-git/qemu/exec.c:2514
->
-> #12 0x00007f608ee91ace in call_rcu_thread (opaque=<optimized out>) at
->
-util/rcu.c:272
->
-> #13 0x00007f6089b0ddc5 in start_thread () from /usr/lib64/libpthread.so.0
->
-> #14 0x00007f60882bf71d in clone () from /usr/lib64/libc.so.6
->
->
->
-> Call Trace:
->
-> #0  0x00007fdccaeb9790 in ?? ()
->
-> #1  0x00007fdcd82d09fc in object_property_del_all (obj=0x7fdcdb8acf60) at
->
-qom/object.c:405
->
-> #2  object_finalize (data=0x7fdcdb8acf60) at qom/object.c:467
->
-> #3  object_unref (address@hidden) at qom/object.c:903
->
-> #4  0x00007fdcd8001fd3 in phys_section_destroy (mr=0x7fdcdb8acf60) at
->
-git/qemu/exec.c:1154
->
-> #5  phys_sections_free (map=0x7fdcdc86aa00) at git/qemu/exec.c:1163
->
-> #6  address_space_dispatch_free (d=0x7fdcdc86a9e0) at
->
-git/qemu/exec.c:2514
->
-> #7  0x00007fdcd83a1ace in call_rcu_thread (opaque=<optimized out>) at
->
-util/rcu.c:272
->
-> #8  0x00007fdcd301ddc5 in start_thread () from /usr/lib64/libpthread.so.0
->
-> #9  0x00007fdcd17cf71d in clone () from /usr/lib64/libc.so.6
->
->
->
->
-
-On 25/04/2017 14:02, Jason Wang wrote:
->
->
-Thanks a lot for the fix.
->
->
-Two questions:
->
->
-- If virtio_net_set_status() is the only function that may access tx_bh,
->
-it looks like setting tx_waiting to zero is sufficient?
-I think clearing tx_bh is better anyway, as leaving a dangling pointer
-is not very hygienic.
-
-Paolo
-
->
-- Can you post a formal patch for this?
-
diff --git a/results/classifier/016/virtual/36568044 b/results/classifier/016/virtual/36568044
deleted file mode 100644
index 4507a105..00000000
--- a/results/classifier/016/virtual/36568044
+++ /dev/null
@@ -1,4608 +0,0 @@
-virtual: 0.897
-hypervisor: 0.757
-KVM: 0.717
-debug: 0.694
-socket: 0.541
-kernel: 0.452
-TCG: 0.366
-x86: 0.260
-network: 0.208
-register: 0.159
-operating system: 0.097
-device: 0.063
-PID: 0.049
-files: 0.034
-VMM: 0.031
-risc-v: 0.023
-assembly: 0.017
-ppc: 0.015
-alpha: 0.011
-peripherals: 0.010
-user-level: 0.009
-i386: 0.008
-semantic: 0.008
-performance: 0.007
-architecture: 0.007
-graphic: 0.004
-permissions: 0.004
-arm: 0.003
-vnc: 0.003
-boot: 0.003
-mistranslation: 0.001
-
-[BUG, RFC] cpr-transfer: qxl guest driver crashes after migration
-
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
->
-EMULATOR=/path/to/emulator
->
-ROOTFS=/path/to/image
->
-QMPSOCK=/var/run/alma8qmp-src.sock
->
->
-$EMULATOR -enable-kvm \
->
--machine q35 \
->
--cpu host -smp 2 -m 2G \
->
--object
->
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
->
--machine memory-backend=ram0 \
->
--machine aux-ram-share=on \
->
--drive file=$ROOTFS,media=disk,if=virtio \
->
--qmp unix:$QMPSOCK,server=on,wait=off \
->
--nographic \
->
--device qxl-vga
-Run migration target:
->
-EMULATOR=/path/to/emulator
->
-ROOTFS=/path/to/image
->
-QMPSOCK=/var/run/alma8qmp-dst.sock
->
->
->
->
-$EMULATOR -enable-kvm \
->
--machine q35 \
->
--cpu host -smp 2 -m 2G \
->
--object
->
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
->
--machine memory-backend=ram0 \
->
--machine aux-ram-share=on \
->
--drive file=$ROOTFS,media=disk,if=virtio \
->
--qmp unix:$QMPSOCK,server=on,wait=off \
->
--nographic \
->
--device qxl-vga \
->
--incoming tcp:0:44444 \
->
--incoming '{"channel-type": "cpr", "addr": { "transport": "socket",
->
-"type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
->
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
->
-QMPSOCK=/var/run/alma8qmp-src.sock
->
->
-$QMPSHELL -p $QMPSOCK <<EOF
->
-migrate-set-parameters mode=cpr-transfer
->
-migrate
->
-channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
->
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
->
-[   73.962002] [TTM] Buffer eviction failed
->
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
->
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate
->
-VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which speeds up
-the crash in the guest):
->
-#!/bin/bash
->
->
-chvt 3
->
->
-for j in $(seq 80); do
->
-echo "$(date) starting round $j"
->
-if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != ""
->
-]; then
->
-echo "bug was reproduced after $j tries"
->
-exit 1
->
-fi
->
-for i in $(seq 100); do
->
-dmesg > /dev/tty3
->
-done
->
-done
->
->
-echo "bug could not be reproduced"
->
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this?  Any
-suggestions would be appreciated.  Thanks!
-
-Andrey
-
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-     -machine q35 \
-     -cpu host -smp 2 -m 2G \
-     -object 
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-     -machine memory-backend=ram0 \
-     -machine aux-ram-share=on \
-     -drive file=$ROOTFS,media=disk,if=virtio \
-     -qmp unix:$QMPSOCK,server=on,wait=off \
-     -nographic \
-     -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
--machine q35 \
-     -cpu host -smp 2 -m 2G \
-     -object 
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-     -machine memory-backend=ram0 \
-     -machine aux-ram-share=on \
-     -drive file=$ROOTFS,media=disk,if=virtio \
-     -qmp unix:$QMPSOCK,server=on,wait=off \
-     -nographic \
-     -device qxl-vga \
-     -incoming tcp:0:44444 \
-     -incoming '{"channel-type": "cpr", "addr": { "transport": "socket", "type": "unix", 
-"path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-     migrate-set-parameters mode=cpr-transfer
-     migrate 
-channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate 
-VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-         echo "$(date) starting round $j"
-         if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" 
-]; then
-                 echo "bug was reproduced after $j tries"
-                 exit 1
-         fi
-         for i in $(seq 100); do
-                 dmesg > /dev/tty3
-         done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-
-- Steve
-
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-     -machine q35 \
-     -cpu host -smp 2 -m 2G \
-     -object 
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-     -machine memory-backend=ram0 \
-     -machine aux-ram-share=on \
-     -drive file=$ROOTFS,media=disk,if=virtio \
-     -qmp unix:$QMPSOCK,server=on,wait=off \
-     -nographic \
-     -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-     -machine q35 \
-     -cpu host -smp 2 -m 2G \
-     -object 
-memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-     -machine memory-backend=ram0 \
-     -machine aux-ram-share=on \
-     -drive file=$ROOTFS,media=disk,if=virtio \
-     -qmp unix:$QMPSOCK,server=on,wait=off \
-     -nographic \
-     -device qxl-vga \
-     -incoming tcp:0:44444 \
-     -incoming '{"channel-type": "cpr", "addr": { "transport": "socket", "type": "unix", 
-"path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-     migrate-set-parameters mode=cpr-transfer
-     migrate 
-channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate 
-VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-         echo "$(date) starting round $j"
-         if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" 
-]; then
-                 echo "bug was reproduced after $j tries"
-                 exit 1
-         fi
-         for i in $(seq 100); do
-                 dmesg > /dev/tty3
-         done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
-- Steve
-
-On 2/28/25 8:20 PM, Steven Sistare wrote:
->
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
->
-> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
->
->> Hi all,
->
->>
->
->> We've been experimenting with cpr-transfer migration mode recently and
->
->> have discovered the following issue with the guest QXL driver:
->
->>
->
->> Run migration source:
->
->>> EMULATOR=/path/to/emulator
->
->>> ROOTFS=/path/to/image
->
->>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>
->
->>> $EMULATOR -enable-kvm \
->
->>>      -machine q35 \
->
->>>      -cpu host -smp 2 -m 2G \
->
->>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>> ram0,share=on\
->
->>>      -machine memory-backend=ram0 \
->
->>>      -machine aux-ram-share=on \
->
->>>      -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>      -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>      -nographic \
->
->>>      -device qxl-vga
->
->>
->
->> Run migration target:
->
->>> EMULATOR=/path/to/emulator
->
->>> ROOTFS=/path/to/image
->
->>> QMPSOCK=/var/run/alma8qmp-dst.sock
->
->>> $EMULATOR -enable-kvm \
->
->>>      -machine q35 \
->
->>>      -cpu host -smp 2 -m 2G \
->
->>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>> ram0,share=on\
->
->>>      -machine memory-backend=ram0 \
->
->>>      -machine aux-ram-share=on \
->
->>>      -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>      -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>      -nographic \
->
->>>      -device qxl-vga \
->
->>>      -incoming tcp:0:44444 \
->
->>>      -incoming '{"channel-type": "cpr", "addr": { "transport":
->
->>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
->
->>
->
->>
->
->> Launch the migration:
->
->>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
->
->>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>
->
->>> $QMPSHELL -p $QMPSOCK <<EOF
->
->>>      migrate-set-parameters mode=cpr-transfer
->
->>>      migrate channels=[{"channel-type":"main","addr":
->
->>> {"transport":"socket","type":"inet","host":"0","port":"44444"}},
->
->>> {"channel-type":"cpr","addr":
->
->>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
->
->>> dst.sock"}}]
->
->>> EOF
->
->>
->
->> Then, after a while, QXL guest driver on target crashes spewing the
->
->> following messages:
->
->>> [   73.962002] [TTM] Buffer eviction failed
->
->>> [   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
->
->>> 0x00000001)
->
->>> [   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
->
->>> allocate VRAM BO
->
->>
->
->> That seems to be a known kernel QXL driver bug:
->
->>
->
->>
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
->
->>
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
->
->>
->
->> (the latter discussion contains that reproduce script which speeds up
->
->> the crash in the guest):
->
->>> #!/bin/bash
->
->>>
->
->>> chvt 3
->
->>>
->
->>> for j in $(seq 80); do
->
->>>          echo "$(date) starting round $j"
->
->>>          if [ "$(journalctl --boot | grep "failed to allocate VRAM
->
->>> BO")" != "" ]; then
->
->>>                  echo "bug was reproduced after $j tries"
->
->>>                  exit 1
->
->>>          fi
->
->>>          for i in $(seq 100); do
->
->>>                  dmesg > /dev/tty3
->
->>>          done
->
->>> done
->
->>>
->
->>> echo "bug could not be reproduced"
->
->>> exit 0
->
->>
->
->> The bug itself seems to remain unfixed, as I was able to reproduce that
->
->> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
->
->> cpr-transfer code also seems to be buggy as it triggers the crash -
->
->> without the cpr-transfer migration the above reproduce doesn't lead to
->
->> crash on the source VM.
->
->>
->
->> I suspect that, as cpr-transfer doesn't migrate the guest memory, but
->
->> rather passes it through the memory backend object, our code might
->
->> somehow corrupt the VRAM.  However, I wasn't able to trace the
->
->> corruption so far.
->
->>
->
->> Could somebody help the investigation and take a look into this?  Any
->
->> suggestions would be appreciated.  Thanks!
->
->
->
-> Possibly some memory region created by qxl is not being preserved.
->
-> Try adding these traces to see what is preserved:
->
->
->
-> -trace enable='*cpr*'
->
-> -trace enable='*ram_alloc*'
->
->
-Also try adding this patch to see if it flags any ram blocks as not
->
-compatible with cpr.  A message is printed at migration start time.
->

-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
->
->
-- Steve
->
-With the traces enabled + the "migration: ram block cpr blockers" patch
-applied:
-
-Source:
->
-cpr_find_fd pc.bios, id 0 returns -1
->
-cpr_save_fd pc.bios, id 0, fd 22
->
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
->
-0x7fec18e00000
->
-cpr_find_fd pc.rom, id 0 returns -1
->
-cpr_save_fd pc.rom, id 0, fd 23
->
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
->
-0x7fec18c00000
->
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
->
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
->
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd
->
-24 host 0x7fec18a00000
->
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
->
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
->
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864
->
-fd 25 host 0x7feb77e00000
->
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
->
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27
->
-host 0x7fec18800000
->
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
->
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864
->
-fd 28 host 0x7feb73c00000
->
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
->
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34
->
-host 0x7fec18600000
->
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
->
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
->
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 35
->
-host 0x7fec18200000
->
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
->
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
->
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36
->
-host 0x7feb8b600000
->
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
->
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
->
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host
->
-0x7feb8b400000
->
->
-cpr_state_save cpr-transfer mode
->
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
->
-cpr_transfer_input /var/run/alma8cpr-dst.sock
->
-cpr_state_load cpr-transfer mode
->
-cpr_find_fd pc.bios, id 0 returns 20
->
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
->
-0x7fcdc9800000
->
-cpr_find_fd pc.rom, id 0 returns 19
->
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
->
-0x7fcdc9600000
->
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
->
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd
->
-18 host 0x7fcdc9400000
->
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
->
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864
->
-fd 17 host 0x7fcd27e00000
->
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16
->
-host 0x7fcdc9200000
->
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864
->
-fd 15 host 0x7fcd23c00000
->
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
->
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14
->
-host 0x7fcdc8800000
->
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
->
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 13
->
-host 0x7fcdc8400000
->
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
->
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11
->
-host 0x7fcdc8200000
->
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
->
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host
->
-0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-
-Andrey
-
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
->
-On 2/28/25 8:20 PM, Steven Sistare wrote:
->
-> On 2/28/2025 1:13 PM, Steven Sistare wrote:
->
->> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
->
->>> Hi all,
->
->>>
->
->>> We've been experimenting with cpr-transfer migration mode recently and
->
->>> have discovered the following issue with the guest QXL driver:
->
->>>
->
->>> Run migration source:
->
->>>> EMULATOR=/path/to/emulator
->
->>>> ROOTFS=/path/to/image
->
->>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>
->
->>>> $EMULATOR -enable-kvm \
->
->>>>      -machine q35 \
->
->>>>      -cpu host -smp 2 -m 2G \
->
->>>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>>> ram0,share=on\
->
->>>>      -machine memory-backend=ram0 \
->
->>>>      -machine aux-ram-share=on \
->
->>>>      -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>      -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>      -nographic \
->
->>>>      -device qxl-vga
->
->>>
->
->>> Run migration target:
->
->>>> EMULATOR=/path/to/emulator
->
->>>> ROOTFS=/path/to/image
->
->>>> QMPSOCK=/var/run/alma8qmp-dst.sock
->
->>>> $EMULATOR -enable-kvm \
->
->>>>      -machine q35 \
->
->>>>      -cpu host -smp 2 -m 2G \
->
->>>>      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>>> ram0,share=on\
->
->>>>      -machine memory-backend=ram0 \
->
->>>>      -machine aux-ram-share=on \
->
->>>>      -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>      -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>      -nographic \
->
->>>>      -device qxl-vga \
->
->>>>      -incoming tcp:0:44444 \
->
->>>>      -incoming '{"channel-type": "cpr", "addr": { "transport":
->
->>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
->
->>>
->
->>>
->
->>> Launch the migration:
->
->>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
->
->>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>
->
->>>> $QMPSHELL -p $QMPSOCK <<EOF
->
->>>>      migrate-set-parameters mode=cpr-transfer
->
->>>>      migrate channels=[{"channel-type":"main","addr":
->
->>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}},
->
->>>> {"channel-type":"cpr","addr":
->
->>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
->
->>>> EOF
->
->>>
->
->>> Then, after a while, QXL guest driver on target crashes spewing the
->
->>> following messages:
->
->>>> [   73.962002] [TTM] Buffer eviction failed
->
->>>> [   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
->
->>>> 0x00000001)
->
->>>> [   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
->
->>>> allocate VRAM BO
->
->>>
->
->>> That seems to be a known kernel QXL driver bug:
->
->>>
->
->>>
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
->
->>>
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
->
->>>
->
->>> (the latter discussion contains that reproduce script which speeds up
->
->>> the crash in the guest):
->
->>>> #!/bin/bash
->
->>>>
->
->>>> chvt 3
->
->>>>
->
->>>> for j in $(seq 80); do
->
->>>>          echo "$(date) starting round $j"
->
->>>>          if [ "$(journalctl --boot | grep "failed to allocate VRAM
->
->>>> BO")" != "" ]; then
->
->>>>                  echo "bug was reproduced after $j tries"
->
->>>>                  exit 1
->
->>>>          fi
->
->>>>          for i in $(seq 100); do
->
->>>>                  dmesg > /dev/tty3
->
->>>>          done
->
->>>> done
->
->>>>
->
->>>> echo "bug could not be reproduced"
->
->>>> exit 0
->
->>>
->
->>> The bug itself seems to remain unfixed, as I was able to reproduce that
->
->>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
->
->>> cpr-transfer code also seems to be buggy as it triggers the crash -
->
->>> without the cpr-transfer migration the above reproduce doesn't lead to
->
->>> crash on the source VM.
->
->>>
->
->>> I suspect that, as cpr-transfer doesn't migrate the guest memory, but
->
->>> rather passes it through the memory backend object, our code might
->
->>> somehow corrupt the VRAM.  However, I wasn't able to trace the
->
->>> corruption so far.
->
->>>
->
->>> Could somebody help the investigation and take a look into this?  Any
->
->>> suggestions would be appreciated.  Thanks!
->
->>
->
->> Possibly some memory region created by qxl is not being preserved.
->
->> Try adding these traces to see what is preserved:
->
->>
->
->> -trace enable='*cpr*'
->
->> -trace enable='*ram_alloc*'
->
->
->
-> Also try adding this patch to see if it flags any ram blocks as not
->
-> compatible with cpr.  A message is printed at migration start time.
->
-> https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
->
->
->
-> - Steve
->
->
->
->
-With the traces enabled + the "migration: ram block cpr blockers" patch
->
-applied:
->
->
-Source:
->
-> cpr_find_fd pc.bios, id 0 returns -1
->
-> cpr_save_fd pc.bios, id 0, fd 22
->
-> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
->
-> 0x7fec18e00000
->
-> cpr_find_fd pc.rom, id 0 returns -1
->
-> cpr_save_fd pc.rom, id 0, fd 23
->
-> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
->
-> 0x7fec18c00000
->
-> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
->
-> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
->
-> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd
->
-> 24 host 0x7fec18a00000
->
-> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
->
-> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
->
-> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864
->
-> fd 25 host 0x7feb77e00000
->
-> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
->
-> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27
->
-> host 0x7fec18800000
->
-> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
->
-> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864
->
-> fd 28 host 0x7feb73c00000
->
-> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
->
-> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34
->
-> host 0x7fec18600000
->
-> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
->
-> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
->
-> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd
->
-> 35 host 0x7fec18200000
->
-> cpr_find_fd /rom@etc/table-loader, id 0 returns -1
->
-> cpr_save_fd /rom@etc/table-loader, id 0, fd 36
->
-> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36
->
-> host 0x7feb8b600000
->
-> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
->
-> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
->
-> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host
->
-> 0x7feb8b400000
->
->
->
-> cpr_state_save cpr-transfer mode
->
-> cpr_transfer_output /var/run/alma8cpr-dst.sock
->
->
-Target:
->
-> cpr_transfer_input /var/run/alma8cpr-dst.sock
->
-> cpr_state_load cpr-transfer mode
->
-> cpr_find_fd pc.bios, id 0 returns 20
->
-> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
->
-> 0x7fcdc9800000
->
-> cpr_find_fd pc.rom, id 0 returns 19
->
-> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
->
-> 0x7fcdc9600000
->
-> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
->
-> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd
->
-> 18 host 0x7fcdc9400000
->
-> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
->
-> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864
->
-> fd 17 host 0x7fcd27e00000
->
-> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16
->
-> host 0x7fcdc9200000
->
-> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864
->
-> fd 15 host 0x7fcd23c00000
->
-> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
->
-> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14
->
-> host 0x7fcdc8800000
->
-> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
->
-> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd
->
-> 13 host 0x7fcdc8400000
->
-> cpr_find_fd /rom@etc/table-loader, id 0 returns 11
->
-> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11
->
-> host 0x7fcdc8200000
->
-> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
->
-> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host
->
-> 0x7fcd3be00000
->
->
-Looks like both vga.vram and qxl.vram are being preserved (with the same
->
-addresses), and no incompatible ram blocks are found during migration.
->
-Sorry, addresses are not the same, of course.  However corresponding ram
-blocks do seem to be preserved and initialized.
-
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-      -machine q35 \
-      -cpu host -smp 2 -m 2G \
-      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on \
-      -machine memory-backend=ram0 \
-      -machine aux-ram-share=on \
-      -drive file=$ROOTFS,media=disk,if=virtio \
-      -qmp unix:$QMPSOCK,server=on,wait=off \
-      -nographic \
-      -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-      -machine q35 \
-      -cpu host -smp 2 -m 2G \
-      -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on \
-      -machine memory-backend=ram0 \
-      -machine aux-ram-share=on \
-      -drive file=$ROOTFS,media=disk,if=virtio \
-      -qmp unix:$QMPSOCK,server=on,wait=off \
-      -nographic \
-      -device qxl-vga \
-      -incoming tcp:0:44444 \
-      -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-      migrate-set-parameters mode=cpr-transfer
-      migrate channels=[{"channel-type":"main","addr":
-{"transport":"socket","type":"inet","host":"0","port":"44444"}},
-{"channel-type":"cpr","addr":
-{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
-0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
-allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-          echo "$(date) starting round $j"
-          if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
-                  echo "bug was reproduced after $j tries"
-                  exit 1
-          fi
-          for i in $(seq 100); do
-                  dmesg > /dev/tty3
-          done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers" patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host 
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host 
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 24 
-host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd 
-25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27 host 
-0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd 
-28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34 host 
-0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 35 
-host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36 host 
-0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host 
-0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host 
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host 
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 18 
-host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd 
-17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16 host 
-0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd 
-15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14 host 
-0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 13 
-host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11 host 
-0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host 
-0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course.  However corresponding ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-  qemu_ram_alloc_internal()
-    if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-        ram_flags |= RAM_READONLY;
-    new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-0001-hw-qxl-cpr-support-preliminary.patch
-Description:
-Text document
-
-On 3/4/25 9:05 PM, Steven Sistare wrote:
->
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
->
-> On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
->
->> On 2/28/25 8:20 PM, Steven Sistare wrote:
->
->>> On 2/28/2025 1:13 PM, Steven Sistare wrote:
->
->>>> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
->
->>>>> Hi all,
->
->>>>>
->
->>>>> We've been experimenting with cpr-transfer migration mode recently
->
->>>>> and
->
->>>>> have discovered the following issue with the guest QXL driver:
->
->>>>>
->
->>>>> Run migration source:
->
->>>>>> EMULATOR=/path/to/emulator
->
->>>>>> ROOTFS=/path/to/image
->
->>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>>>
->
->>>>>> $EMULATOR -enable-kvm \
->
->>>>>>       -machine q35 \
->
->>>>>>       -cpu host -smp 2 -m 2G \
->
->>>>>>       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>>>>> ram0,share=on\
->
->>>>>>       -machine memory-backend=ram0 \
->
->>>>>>       -machine aux-ram-share=on \
->
->>>>>>       -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>>>       -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>>>       -nographic \
->
->>>>>>       -device qxl-vga
->
->>>>>
->
->>>>> Run migration target:
->
->>>>>> EMULATOR=/path/to/emulator
->
->>>>>> ROOTFS=/path/to/image
->
->>>>>> QMPSOCK=/var/run/alma8qmp-dst.sock
->
->>>>>> $EMULATOR -enable-kvm \
->
->>>>>>       -machine q35 \
->
->>>>>>       -cpu host -smp 2 -m 2G \
->
->>>>>>       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/
->
->>>>>> ram0,share=on\
->
->>>>>>       -machine memory-backend=ram0 \
->
->>>>>>       -machine aux-ram-share=on \
->
->>>>>>       -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>>>       -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>>>       -nographic \
->
->>>>>>       -device qxl-vga \
->
->>>>>>       -incoming tcp:0:44444 \
->
->>>>>>       -incoming '{"channel-type": "cpr", "addr": { "transport":
->
->>>>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
->
->>>>>
->
->>>>>
->
->>>>> Launch the migration:
->
->>>>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
->
->>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>>>
->
->>>>>> $QMPSHELL -p $QMPSOCK <<EOF
->
->>>>>>       migrate-set-parameters mode=cpr-transfer
->
->>>>>>       migrate channels=[{"channel-type":"main","addr":
->
->>>>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}},
->
->>>>>> {"channel-type":"cpr","addr":
->
->>>>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
->
->>>>>> EOF
->
->>>>>
->
->>>>> Then, after a while, QXL guest driver on target crashes spewing the
->
->>>>> following messages:
->
->>>>>> [   73.962002] [TTM] Buffer eviction failed
->
->>>>>> [   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
->
->>>>>> 0x00000001)
->
->>>>>> [   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
->
->>>>>> allocate VRAM BO
->
->>>>>
->
->>>>> That seems to be a known kernel QXL driver bug:
->
->>>>>
->
->>>>>
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
->
->>>>>
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
->
->>>>>
->
->>>>> (the latter discussion contains that reproduce script which speeds up
->
->>>>> the crash in the guest):
->
->>>>>> #!/bin/bash
->
->>>>>>
->
->>>>>> chvt 3
->
->>>>>>
->
->>>>>> for j in $(seq 80); do
->
->>>>>>           echo "$(date) starting round $j"
->
->>>>>>           if [ "$(journalctl --boot | grep "failed to allocate VRAM
->
->>>>>> BO")" != "" ]; then
->
->>>>>>                   echo "bug was reproduced after $j tries"
->
->>>>>>                   exit 1
->
->>>>>>           fi
->
->>>>>>           for i in $(seq 100); do
->
->>>>>>                   dmesg > /dev/tty3
->
->>>>>>           done
->
->>>>>> done
->
->>>>>>
->
->>>>>> echo "bug could not be reproduced"
->
->>>>>> exit 0
->
->>>>>
->
->>>>> The bug itself seems to remain unfixed, as I was able to reproduce
->
->>>>> that
->
->>>>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
->
->>>>> cpr-transfer code also seems to be buggy as it triggers the crash -
->
->>>>> without the cpr-transfer migration the above reproduce doesn't
->
->>>>> lead to
->
->>>>> crash on the source VM.
->
->>>>>
->
->>>>> I suspect that, as cpr-transfer doesn't migrate the guest memory, but
->
->>>>> rather passes it through the memory backend object, our code might
->
->>>>> somehow corrupt the VRAM.  However, I wasn't able to trace the
->
->>>>> corruption so far.
->
->>>>>
->
->>>>> Could somebody help the investigation and take a look into this?  Any
->
->>>>> suggestions would be appreciated.  Thanks!
->
->>>>
->
->>>> Possibly some memory region created by qxl is not being preserved.
->
->>>> Try adding these traces to see what is preserved:
->
->>>>
->
->>>> -trace enable='*cpr*'
->
->>>> -trace enable='*ram_alloc*'
->
->>>
->
->>> Also try adding this patch to see if it flags any ram blocks as not
->
->>> compatible with cpr.  A message is printed at migration start time.
->
->>> https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
->
->>>
->
->>> - Steve
->
->>>
->
->>
->
->> With the traces enabled + the "migration: ram block cpr blockers" patch
->
->> applied:
->
->>
->
->> Source:
->
->>> cpr_find_fd pc.bios, id 0 returns -1
->
->>> cpr_save_fd pc.bios, id 0, fd 22
->
->>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
->
->>> 0x7fec18e00000
->
->>> cpr_find_fd pc.rom, id 0 returns -1
->
->>> cpr_save_fd pc.rom, id 0, fd 23
->
->>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
->
->>> 0x7fec18c00000
->
->>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
->
->>> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
->
->>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
->
->>> 262144 fd 24 host 0x7fec18a00000
->
->>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
->
->>> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
->
->>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
->
->>> 67108864 fd 25 host 0x7feb77e00000
->
->>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
->
->>> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
->
->>> fd 27 host 0x7fec18800000
->
->>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
->
->>> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
->
->>> 67108864 fd 28 host 0x7feb73c00000
->
->>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
->
->>> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
->
->>> fd 34 host 0x7fec18600000
->
->>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
->
->>> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
->
->>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
->
->>> 2097152 fd 35 host 0x7fec18200000
->
->>> cpr_find_fd /rom@etc/table-loader, id 0 returns -1
->
->>> cpr_save_fd /rom@etc/table-loader, id 0, fd 36
->
->>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
->
->>> fd 36 host 0x7feb8b600000
->
->>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
->
->>> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
->
->>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
->
->>> 37 host 0x7feb8b400000
->
->>>
->
->>> cpr_state_save cpr-transfer mode
->
->>> cpr_transfer_output /var/run/alma8cpr-dst.sock
->
->>
->
->> Target:
->
->>> cpr_transfer_input /var/run/alma8cpr-dst.sock
->
->>> cpr_state_load cpr-transfer mode
->
->>> cpr_find_fd pc.bios, id 0 returns 20
->
->>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
->
->>> 0x7fcdc9800000
->
->>> cpr_find_fd pc.rom, id 0 returns 19
->
->>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
->
->>> 0x7fcdc9600000
->
->>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
->
->>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
->
->>> 262144 fd 18 host 0x7fcdc9400000
->
->>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
->
->>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
->
->>> 67108864 fd 17 host 0x7fcd27e00000
->
->>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
->
->>> fd 16 host 0x7fcdc9200000
->
->>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
->
->>> 67108864 fd 15 host 0x7fcd23c00000
->
->>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
->
->>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
->
->>> fd 14 host 0x7fcdc8800000
->
->>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
->
->>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
->
->>> 2097152 fd 13 host 0x7fcdc8400000
->
->>> cpr_find_fd /rom@etc/table-loader, id 0 returns 11
->
->>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
->
->>> fd 11 host 0x7fcdc8200000
->
->>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
->
->>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
->
->>> 10 host 0x7fcd3be00000
->
->>
->
->> Looks like both vga.vram and qxl.vram are being preserved (with the same
->
->> addresses), and no incompatible ram blocks are found during migration.
->
->
->
-> Sorry, addresses are not the same, of course.  However corresponding ram
->
-> blocks do seem to be preserved and initialized.
->
->
-So far, I have not reproduced the guest driver failure.
->
->
-However, I have isolated places where new QEMU improperly writes to
->
-the qxl memory regions prior to starting the guest, by mmap'ing them
->
-readonly after cpr:
->
->
-  qemu_ram_alloc_internal()
->
-    if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
->
-        ram_flags |= RAM_READONLY;
->
-    new_block = qemu_ram_alloc_from_fd(...)
->
->
-I have attached a draft fix; try it and let me know.
->
-My console window looks fine before and after cpr, using
->
--vnc $hostip:0 -vga qxl
->
->
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?  Could you try launching VM with
-"-nographic -device qxl-vga"?  That way VM's serial console is given you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-
->
-Program terminated with signal SIGSEGV, Segmentation fault.
->
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
->
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
->
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
->
-(gdb) bt
->
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
->
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
->
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
->
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
->
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
->
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
->
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
->
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
->
-value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
->
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
->
-v=0x5638996f3770, name=0x56389759b141 "realized", opaque=0x5638987893d0,
->
-errp=0x7ffd3c2b84e0)
->
-at ../qom/object.c:2374
->
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
->
-name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
->
-at ../qom/object.c:1449
->
-#7  0x00005638970f8586 in object_property_set_qobject (obj=0x5638996e0e70,
->
-name=0x56389759b141 "realized", value=0x5638996df900, errp=0x7ffd3c2b84e0)
->
-at ../qom/qom-qobject.c:28
->
-#8  0x00005638970f3d8d in object_property_set_bool (obj=0x5638996e0e70,
->
-name=0x56389759b141 "realized", value=true, errp=0x7ffd3c2b84e0)
->
-at ../qom/object.c:1519
->
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
->
-bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
->
-#10 0x0000563896dba675 in qdev_device_add_from_qdict (opts=0x5638996dfe50,
->
-from_json=false, errp=0x7ffd3c2b84e0) at ../system/qdev-monitor.c:714
->
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
->
-errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
->
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, opts=0x563898786150,
->
-errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:1207
->
-#13 0x000056389737a6cc in qemu_opts_foreach
->
-(list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
->
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
->
-at ../util/qemu-option.c:1135
->
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/vl.c:2745
->
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
->
-<error_fatal>) at ../system/vl.c:2806
->
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) at
->
-../system/vl.c:3838
->
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at
->
-../system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-
-Andrey
-0001-hw-qxl-cpr-support-preliminary.patch
-Description:
-Text Data
-
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
-On 3/4/25 9:05 PM, Steven Sistare wrote:
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently
-and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-       -machine q35 \
-       -cpu host -smp 2 -m 2G \
-       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-       -machine memory-backend=ram0 \
-       -machine aux-ram-share=on \
-       -drive file=$ROOTFS,media=disk,if=virtio \
-       -qmp unix:$QMPSOCK,server=on,wait=off \
-       -nographic \
-       -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-       -machine q35 \
-       -cpu host -smp 2 -m 2G \
-       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-       -machine memory-backend=ram0 \
-       -machine aux-ram-share=on \
-       -drive file=$ROOTFS,media=disk,if=virtio \
-       -qmp unix:$QMPSOCK,server=on,wait=off \
-       -nographic \
-       -device qxl-vga \
-       -incoming tcp:0:44444 \
-       -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-       migrate-set-parameters mode=cpr-transfer
-       migrate channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-           echo "$(date) starting round $j"
-           if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
-                   echo "bug was reproduced after $j tries"
-                   exit 1
-           fi
-           for i in $(seq 100); do
-                   dmesg > /dev/tty3
-           done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce
-that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't
-lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers" patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 24 host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 27 host 0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 34 host 0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 35 host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 36 host 0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-37 host 0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 18 host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 16 host 0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 14 host 0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 13 host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 11 host 0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-10 host 0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course.  However corresponding ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-   qemu_ram_alloc_internal()
-     if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-         ram_flags |= RAM_READONLY;
-     new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session.  To test, I specify
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
-dormant the dest vnc becomes active.
-Could you try launching VM with
-"-nographic -device qxl-vga"?  That way VM's serial console is given you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver crash,
-and I suspect my guest image+kernel is too old.  However, once I realized the
-issue was post-cpr modification of qxl memory, I switched my attention to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
-(gdb) bt
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, 
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, 
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, 
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, value=true, 
-errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, v=0x5638996f3770, 
-name=0x56389759b141 "realized", opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:2374
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, name=0x56389759b141 
-"realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:1449
-#7  0x00005638970f8586 in object_property_set_qobject (obj=0x5638996e0e70, 
-name=0x56389759b141 "realized", value=0x5638996df900, errp=0x7ffd3c2b84e0)
-     at ../qom/qom-qobject.c:28
-#8  0x00005638970f3d8d in object_property_set_bool (obj=0x5638996e0e70, 
-name=0x56389759b141 "realized", value=true, errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:1519
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, bus=0x563898cf3c20, 
-errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
-#10 0x0000563896dba675 in qdev_device_add_from_qdict (opts=0x5638996dfe50, 
-from_json=false, errp=0x7ffd3c2b84e0) at ../system/qdev-monitor.c:714
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, errp=0x56389855dc40 
-<error_fatal>) at ../system/qdev-monitor.c:733
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0, opts=0x563898786150, 
-errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:1207
-#13 0x000056389737a6cc in qemu_opts_foreach
-     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca <device_init_func>, 
-opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-     at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) at 
-../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at 
-../system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram are
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large memory 
-region
-can be too expensive for production due to TLB shootdowns.
-
-Also, there are cases where writes are performed but the value is guaranteed to
-be the same:
-  qxl_post_load()
-    qxl_set_mode()
-      d->rom->mode = cpu_to_le32(modenr);
-The value is the same because mode and shadow_rom.mode were passed in vmstate
-from old qemu.
-
-- Steve
-0001-hw-qxl-cpr-support-preliminary-V2.patch
-Description:
-Text document
-
-On 3/5/25 22:19, Steven Sistare wrote:
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
-On 3/4/25 9:05 PM, Steven Sistare wrote:
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently
-and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-       -machine q35 \
-       -cpu host -smp 2 -m 2G \
-       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-       -machine memory-backend=ram0 \
-       -machine aux-ram-share=on \
-       -drive file=$ROOTFS,media=disk,if=virtio \
-       -qmp unix:$QMPSOCK,server=on,wait=off \
-       -nographic \
-       -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-       -machine q35 \
-       -cpu host -smp 2 -m 2G \
-       -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on\
-       -machine memory-backend=ram0 \
-       -machine aux-ram-share=on \
-       -drive file=$ROOTFS,media=disk,if=virtio \
-       -qmp unix:$QMPSOCK,server=on,wait=off \
-       -nographic \
-       -device qxl-vga \
-       -incoming tcp:0:44444 \
-       -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-       migrate-set-parameters mode=cpr-transfer
-       migrate channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing
-the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which
-speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-           echo "$(date) starting round $j"
-           if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
-                   echo "bug was reproduced after $j tries"
-                   exit 1
-           fi
-           for i in $(seq 100); do
-                   dmesg > /dev/tty3
-           done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce
-that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the
-crash -
-without the cpr-transfer migration the above reproduce doesn't
-lead to
-crash on the source VM.
-I suspect that, as cpr-transfer doesn't migrate the guest
-memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-Could somebody help the investigation and take a look into
-this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers"
-patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 24 host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 27 host 0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 34 host 0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 35 host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 36 host 0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-37 host 0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 18 host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 16 host 0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 14 host 0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 13 host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 11 host 0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-10 host 0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with
-the same
-addresses), and no incompatible ram blocks are found during
-migration.
-Sorry, addresses are not the same, of course.  However
-corresponding ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-   qemu_ram_alloc_internal()
-     if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-         ram_flags |= RAM_READONLY;
-     new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session.  To test, I specify
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
-dormant the dest vnc becomes active.
-Could you try launching VM with
-"-nographic -device qxl-vga"?  That way VM's serial console is given you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver
-crash,
-and I suspect my guest image+kernel is too old.  However, once I
-realized the
-issue was post-cpr modification of qxl memory, I switched my attention
-to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
-(gdb) bt
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
-value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
-v=0x5638996f3770, name=0x56389759b141 "realized",
-opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:2374
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
-name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:1449
-#7  0x00005638970f8586 in object_property_set_qobject
-(obj=0x5638996e0e70, name=0x56389759b141 "realized",
-value=0x5638996df900, errp=0x7ffd3c2b84e0)
-     at ../qom/qom-qobject.c:28
-#8  0x00005638970f3d8d in object_property_set_bool
-(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true,
-errp=0x7ffd3c2b84e0)
-     at ../qom/object.c:1519
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
-bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
-#10 0x0000563896dba675 in qdev_device_add_from_qdict
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at
-../system/qdev-monitor.c:714
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
-errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0,
-opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at
-../system/vl.c:1207
-#13 0x000056389737a6cc in qemu_opts_foreach
-     (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-     at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at
-../system/vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
-at ../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at
-../system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in
-init_qxl_ram are
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large
-memory region
-can be too expensive for production due to TLB shootdowns.
-Good point. Though we could move this code under non-default option to
-avoid re-writing.
-
-Den
-
-On 3/5/25 11:19 PM, Steven Sistare wrote:
->
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
->
-> On 3/4/25 9:05 PM, Steven Sistare wrote:
->
->> On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
->
->>> On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
->
->>>> On 2/28/25 8:20 PM, Steven Sistare wrote:
->
->>>>> On 2/28/2025 1:13 PM, Steven Sistare wrote:
->
->>>>>> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
->
->>>>>>> Hi all,
->
->>>>>>>
->
->>>>>>> We've been experimenting with cpr-transfer migration mode recently
->
->>>>>>> and
->
->>>>>>> have discovered the following issue with the guest QXL driver:
->
->>>>>>>
->
->>>>>>> Run migration source:
->
->>>>>>>> EMULATOR=/path/to/emulator
->
->>>>>>>> ROOTFS=/path/to/image
->
->>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>>>>>
->
->>>>>>>> $EMULATOR -enable-kvm \
->
->>>>>>>>        -machine q35 \
->
->>>>>>>>        -cpu host -smp 2 -m 2G \
->
->>>>>>>>        -object memory-backend-file,id=ram0,size=2G,mem-path=/
->
->>>>>>>> dev/shm/
->
->>>>>>>> ram0,share=on\
->
->>>>>>>>        -machine memory-backend=ram0 \
->
->>>>>>>>        -machine aux-ram-share=on \
->
->>>>>>>>        -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>>>>>        -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>>>>>        -nographic \
->
->>>>>>>>        -device qxl-vga
->
->>>>>>>
->
->>>>>>> Run migration target:
->
->>>>>>>> EMULATOR=/path/to/emulator
->
->>>>>>>> ROOTFS=/path/to/image
->
->>>>>>>> QMPSOCK=/var/run/alma8qmp-dst.sock
->
->>>>>>>> $EMULATOR -enable-kvm \
->
->>>>>>>>        -machine q35 \
->
->>>>>>>>        -cpu host -smp 2 -m 2G \
->
->>>>>>>>        -object memory-backend-file,id=ram0,size=2G,mem-path=/
->
->>>>>>>> dev/shm/
->
->>>>>>>> ram0,share=on\
->
->>>>>>>>        -machine memory-backend=ram0 \
->
->>>>>>>>        -machine aux-ram-share=on \
->
->>>>>>>>        -drive file=$ROOTFS,media=disk,if=virtio \
->
->>>>>>>>        -qmp unix:$QMPSOCK,server=on,wait=off \
->
->>>>>>>>        -nographic \
->
->>>>>>>>        -device qxl-vga \
->
->>>>>>>>        -incoming tcp:0:44444 \
->
->>>>>>>>        -incoming '{"channel-type": "cpr", "addr": { "transport":
->
->>>>>>>> "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
->
->>>>>>>
->
->>>>>>>
->
->>>>>>> Launch the migration:
->
->>>>>>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
->
->>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
->
->>>>>>>>
->
->>>>>>>> $QMPSHELL -p $QMPSOCK <<EOF
->
->>>>>>>>        migrate-set-parameters mode=cpr-transfer
->
->>>>>>>>        migrate channels=[{"channel-type":"main","addr":
->
->>>>>>>> {"transport":"socket","type":"inet","host":"0","port":"44444"}},
->
->>>>>>>> {"channel-type":"cpr","addr":
->
->>>>>>>> {"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
->
->>>>>>>> dst.sock"}}]
->
->>>>>>>> EOF
->
->>>>>>>
->
->>>>>>> Then, after a while, QXL guest driver on target crashes spewing the
->
->>>>>>> following messages:
->
->>>>>>>> [   73.962002] [TTM] Buffer eviction failed
->
->>>>>>>> [   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
->
->>>>>>>> 0x00000001)
->
->>>>>>>> [   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
->
->>>>>>>> allocate VRAM BO
->
->>>>>>>
->
->>>>>>> That seems to be a known kernel QXL driver bug:
->
->>>>>>>
->
->>>>>>>
-https://lore.kernel.org/all/20220907094423.93581-1-
->
->>>>>>> min_halo@163.com/T/
->
->>>>>>>
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
->
->>>>>>>
->
->>>>>>> (the latter discussion contains that reproduce script which
->
->>>>>>> speeds up
->
->>>>>>> the crash in the guest):
->
->>>>>>>> #!/bin/bash
->
->>>>>>>>
->
->>>>>>>> chvt 3
->
->>>>>>>>
->
->>>>>>>> for j in $(seq 80); do
->
->>>>>>>>            echo "$(date) starting round $j"
->
->>>>>>>>            if [ "$(journalctl --boot | grep "failed to allocate
->
->>>>>>>> VRAM
->
->>>>>>>> BO")" != "" ]; then
->
->>>>>>>>                    echo "bug was reproduced after $j tries"
->
->>>>>>>>                    exit 1
->
->>>>>>>>            fi
->
->>>>>>>>            for i in $(seq 100); do
->
->>>>>>>>                    dmesg > /dev/tty3
->
->>>>>>>>            done
->
->>>>>>>> done
->
->>>>>>>>
->
->>>>>>>> echo "bug could not be reproduced"
->
->>>>>>>> exit 0
->
->>>>>>>
->
->>>>>>> The bug itself seems to remain unfixed, as I was able to reproduce
->
->>>>>>> that
->
->>>>>>> with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
->
->>>>>>> cpr-transfer code also seems to be buggy as it triggers the crash -
->
->>>>>>> without the cpr-transfer migration the above reproduce doesn't
->
->>>>>>> lead to
->
->>>>>>> crash on the source VM.
->
->>>>>>>
->
->>>>>>> I suspect that, as cpr-transfer doesn't migrate the guest
->
->>>>>>> memory, but
->
->>>>>>> rather passes it through the memory backend object, our code might
->
->>>>>>> somehow corrupt the VRAM.  However, I wasn't able to trace the
->
->>>>>>> corruption so far.
->
->>>>>>>
->
->>>>>>> Could somebody help the investigation and take a look into
->
->>>>>>> this?  Any
->
->>>>>>> suggestions would be appreciated.  Thanks!
->
->>>>>>
->
->>>>>> Possibly some memory region created by qxl is not being preserved.
->
->>>>>> Try adding these traces to see what is preserved:
->
->>>>>>
->
->>>>>> -trace enable='*cpr*'
->
->>>>>> -trace enable='*ram_alloc*'
->
->>>>>
->
->>>>> Also try adding this patch to see if it flags any ram blocks as not
->
->>>>> compatible with cpr.  A message is printed at migration start time.
->
->>>>>
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-
->
->>>>> email-
->
->>>>> steven.sistare@oracle.com/
->
->>>>>
->
->>>>> - Steve
->
->>>>>
->
->>>>
->
->>>> With the traces enabled + the "migration: ram block cpr blockers"
->
->>>> patch
->
->>>> applied:
->
->>>>
->
->>>> Source:
->
->>>>> cpr_find_fd pc.bios, id 0 returns -1
->
->>>>> cpr_save_fd pc.bios, id 0, fd 22
->
->>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
->
->>>>> 0x7fec18e00000
->
->>>>> cpr_find_fd pc.rom, id 0 returns -1
->
->>>>> cpr_save_fd pc.rom, id 0, fd 23
->
->>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
->
->>>>> 0x7fec18c00000
->
->>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
->
->>>>> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
->
->>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
->
->>>>> 262144 fd 24 host 0x7fec18a00000
->
->>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
->
->>>>> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
->
->>>>> 67108864 fd 25 host 0x7feb77e00000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
->
->>>>> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
->
->>>>> fd 27 host 0x7fec18800000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
->
->>>>> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
->
->>>>> 67108864 fd 28 host 0x7feb73c00000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
->
->>>>> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
->
->>>>> fd 34 host 0x7fec18600000
->
->>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
->
->>>>> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
->
->>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
->
->>>>> 2097152 fd 35 host 0x7fec18200000
->
->>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns -1
->
->>>>> cpr_save_fd /rom@etc/table-loader, id 0, fd 36
->
->>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
->
->>>>> fd 36 host 0x7feb8b600000
->
->>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
->
->>>>> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
->
->>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
->
->>>>> 37 host 0x7feb8b400000
->
->>>>>
->
->>>>> cpr_state_save cpr-transfer mode
->
->>>>> cpr_transfer_output /var/run/alma8cpr-dst.sock
->
->>>>
->
->>>> Target:
->
->>>>> cpr_transfer_input /var/run/alma8cpr-dst.sock
->
->>>>> cpr_state_load cpr-transfer mode
->
->>>>> cpr_find_fd pc.bios, id 0 returns 20
->
->>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
->
->>>>> 0x7fcdc9800000
->
->>>>> cpr_find_fd pc.rom, id 0 returns 19
->
->>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
->
->>>>> 0x7fcdc9600000
->
->>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
->
->>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
->
->>>>> 262144 fd 18 host 0x7fcdc9400000
->
->>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
->
->>>>> 67108864 fd 17 host 0x7fcd27e00000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
->
->>>>> fd 16 host 0x7fcdc9200000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
->
->>>>> 67108864 fd 15 host 0x7fcd23c00000
->
->>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
->
->>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
->
->>>>> fd 14 host 0x7fcdc8800000
->
->>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
->
->>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
->
->>>>> 2097152 fd 13 host 0x7fcdc8400000
->
->>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns 11
->
->>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
->
->>>>> fd 11 host 0x7fcdc8200000
->
->>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
->
->>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
->
->>>>> 10 host 0x7fcd3be00000
->
->>>>
->
->>>> Looks like both vga.vram and qxl.vram are being preserved (with the
->
->>>> same
->
->>>> addresses), and no incompatible ram blocks are found during migration.
->
->>>
->
->>> Sorry, addresses are not the same, of course.  However corresponding
->
->>> ram
->
->>> blocks do seem to be preserved and initialized.
->
->>
->
->> So far, I have not reproduced the guest driver failure.
->
->>
->
->> However, I have isolated places where new QEMU improperly writes to
->
->> the qxl memory regions prior to starting the guest, by mmap'ing them
->
->> readonly after cpr:
->
->>
->
->>    qemu_ram_alloc_internal()
->
->>      if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
->
->>          ram_flags |= RAM_READONLY;
->
->>      new_block = qemu_ram_alloc_from_fd(...)
->
->>
->
->> I have attached a draft fix; try it and let me know.
->
->> My console window looks fine before and after cpr, using
->
->> -vnc $hostip:0 -vga qxl
->
->>
->
->> - Steve
->
->
->
-> Regarding the reproduce: when I launch the buggy version with the same
->
-> options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
->
-> my VNC client silently hangs on the target after a while.  Could it
->
-> happen on your stand as well?
->
->
-cpr does not preserve the vnc connection and session.  To test, I specify
->
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
->
-dormant the dest vnc becomes active.
->
-Sure, I meant that VNC on the dest (on the port 1) works for a while
-after the migration and then hangs, apparently after the guest QXL crash.
-
->
-> Could you try launching VM with
->
-> "-nographic -device qxl-vga"?  That way the VM's serial console is given to you
->
-> directly in the shell, so when qxl driver crashes you're still able to
->
-> inspect the kernel messages.
->
->
-I have been running like that, but have not reproduced the qxl driver
->
-crash,
->
-and I suspect my guest image+kernel is too old.
-Yes, that's probably the case.  But the crash occurs on my Fedora 41
-guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to
-be buggy.
-
-
->
-However, once I realized the
->
-issue was post-cpr modification of qxl memory, I switched my attention
->
-to the
->
-fix.
->
->
-> As for your patch, I can report that it doesn't resolve the issue as it
->
-> is.  But I was able to track down another possible memory corruption
->
-> using your approach with readonly mmap'ing:
->
->
->
->> Program terminated with signal SIGSEGV, Segmentation fault.
->
->> #0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
->
->> 412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
->
->> [Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
->
->> (gdb) bt
->
->> #0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
->
->> #1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
->
->> errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
->
->> #2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
->
->> errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
->
->> #3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
->
->> errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
->
->> #4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
->
->> value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
->
->> #5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
->
->> v=0x5638996f3770, name=0x56389759b141 "realized",
->
->> opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
->
->>      at ../qom/object.c:2374
->
->> #6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
->
->> name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
->
->>      at ../qom/object.c:1449
->
->> #7  0x00005638970f8586 in object_property_set_qobject
->
->> (obj=0x5638996e0e70, name=0x56389759b141 "realized",
->
->> value=0x5638996df900, errp=0x7ffd3c2b84e0)
->
->>      at ../qom/qom-qobject.c:28
->
->> #8  0x00005638970f3d8d in object_property_set_bool
->
->> (obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true,
->
->> errp=0x7ffd3c2b84e0)
->
->>      at ../qom/object.c:1519
->
->> #9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
->
->> bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
->
->> #10 0x0000563896dba675 in qdev_device_add_from_qdict
->
->> (opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../
->
->> system/qdev-monitor.c:714
->
->> #11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
->
->> errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
->
->> #12 0x0000563896dc48f1 in device_init_func (opaque=0x0,
->
->> opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/
->
->> vl.c:1207
->
->> #13 0x000056389737a6cc in qemu_opts_foreach
->
->>      (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
->
->> <device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
->
->>      at ../util/qemu-option.c:1135
->
->> #14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/
->
->> vl.c:2745
->
->> #15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
->
->> <error_fatal>) at ../system/vl.c:2806
->
->> #16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
->
->> at ../system/vl.c:3838
->
->> #17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../
->
->> system/main.c:72
->
->
->
-> So the attached adjusted version of your patch does seem to help.  At
->
-> least I can't reproduce the crash on my stand.
->
->
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram
->
-are
->
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
->
-of init_qxl_ram that modify guest memory.
->
-Thanks, your v2 patch does seem to prevent the crash.  Would you re-send
-it to the list as a proper fix?
-
->
-> I'm wondering, could it be useful to explicitly mark all the reused
->
-> memory regions readonly upon cpr-transfer, and then make them writable
->
-> back again after the migration is done?  That way we will be segfaulting
->
-> early on instead of debugging tricky memory corruptions.
->
->
-It's a useful debugging technique, but changing protection on a large
->
-memory region
->
-can be too expensive for production due to TLB shootdowns.
->
->
-Also, there are cases where writes are performed but the value is
->
-guaranteed to
->
-be the same:
->
-  qxl_post_load()
->
-    qxl_set_mode()
->
-      d->rom->mode = cpu_to_le32(modenr);
->
-The value is the same because mode and shadow_rom.mode were passed in
->
-vmstate
->
-from old qemu.
->
-There're also cases where devices' ROM might be re-initialized.  E.g.
-this segfault occurs upon further exploration of RO mapped RAM blocks:
-
->
-Program terminated with signal SIGSEGV, Segmentation fault.
->
-#0  __memmove_avx_unaligned_erms () at
->
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
->
-664             rep     movsb
->
-[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))]
->
-(gdb) bt
->
-#0  __memmove_avx_unaligned_erms () at
->
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
->
-#1  0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380,
->
-owner=0x55aa2019ac10, name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true)
->
-at ../hw/core/loader.c:1032
->
-#2  0x000055aa1d031577 in rom_add_blob
->
-(name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072,
->
-max_len=2097152, addr=18446744073709551615, fw_file_name=0x55aa1da51f13
->
-"etc/acpi/tables", fw_callback=0x55aa1d441f59 <acpi_build_update>,
->
-callback_opaque=0x55aa20ff0010, as=0x0, read_only=true) at
->
-../hw/core/loader.c:1147
->
-#3  0x000055aa1cfd788d in acpi_add_rom_blob
->
-(update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010,
->
-blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at
->
-../hw/acpi/utils.c:46
->
-#4  0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720
->
-#5  0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0)
->
-at ../hw/i386/pc.c:638
->
-#6  0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10
->
-<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39
->
-#7  0x000055aa1d039ee5 in qdev_machine_creation_done () at
->
-../hw/core/machine.c:1749
->
-#8  0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40
->
-<error_fatal>) at ../system/vl.c:2779
->
-#9  0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40
->
-<error_fatal>) at ../system/vl.c:2807
->
-#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at
->
-../system/vl.c:3838
->
-#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at
->
-../system/main.c:72
-I'm not sure whether ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where ROM can be read from file
-system upon initialization.  That is undesirable as guest kernel
-certainly won't be too happy about sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make ram blocks RO upon cpr
-(and remap them as RW later after migration is done, if needed), and
-catch any unwanted memory violations.  As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or
-"migrate" command flag specific to cpr-transfer), which would only be
-used in the testing.
-
-Andrey
-
-On 3/6/25 16:16, Andrey Drobyshev wrote:
-On 3/5/25 11:19 PM, Steven Sistare wrote:
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
-On 3/4/25 9:05 PM, Steven Sistare wrote:
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently
-and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga \
-        -incoming tcp:0:44444 \
-        -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-        migrate-set-parameters mode=cpr-transfer
-        migrate channels=[{"channel-type":"main","addr":
-{"transport":"socket","type":"inet","host":"0","port":"44444"}},
-{"channel-type":"cpr","addr":
-{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
-0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
-allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-
-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which
-speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-            echo "$(date) starting round $j"
-            if [ "$(journalctl --boot | grep "failed to allocate
-VRAM
-BO")" != "" ]; then
-                    echo "bug was reproduced after $j tries"
-                    exit 1
-            fi
-            for i in $(seq 100); do
-                    dmesg > /dev/tty3
-            done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce
-that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't
-lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest
-memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into
-this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-
-email-
-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers"
-patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 24 host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 27 host 0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 34 host 0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 35 host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 36 host 0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-37 host 0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 18 host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 16 host 0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 14 host 0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 13 host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 11 host 0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-10 host 0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the
-same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course.  However corresponding
-ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-    qemu_ram_alloc_internal()
-      if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-          ram_flags |= RAM_READONLY;
-      new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session.  To test, I specify
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
-dormant the dest vnc becomes active.
-Sure, I meant that VNC on the dest (on the port 1) works for a while
-after the migration and then hangs, apparently after the guest QXL crash.
-Could you try launching VM with
-"-nographic -device qxl-vga"?  That way the VM's serial console is given to you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver
-crash,
-and I suspect my guest image+kernel is too old.
-Yes, that's probably the case.  But the crash occurs on my Fedora 41
-guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to
-be buggy.
-However, once I realized the
-issue was post-cpr modification of qxl memory, I switched my attention
-to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
-(gdb) bt
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
-value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
-v=0x5638996f3770, name=0x56389759b141 "realized",
-opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:2374
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
-name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1449
-#7  0x00005638970f8586 in object_property_set_qobject
-(obj=0x5638996e0e70, name=0x56389759b141 "realized",
-value=0x5638996df900, errp=0x7ffd3c2b84e0)
-      at ../qom/qom-qobject.c:28
-#8  0x00005638970f3d8d in object_property_set_bool
-(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true,
-errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1519
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
-bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
-#10 0x0000563896dba675 in qdev_device_add_from_qdict
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../
-system/qdev-monitor.c:714
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
-errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0,
-opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/
-vl.c:1207
-#13 0x000056389737a6cc in qemu_opts_foreach
-      (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-      at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/
-vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
-at ../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../
-system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram
-are
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-Thanks, your v2 patch does seem to prevent the crash.  Would you re-send
-it to the list as a proper fix?
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large
-memory region
-can be too expensive for production due to TLB shootdowns.
-
-Also, there are cases where writes are performed but the value is
-guaranteed to
-be the same:
-   qxl_post_load()
-     qxl_set_mode()
-       d->rom->mode = cpu_to_le32(modenr);
-The value is the same because mode and shadow_rom.mode were passed in
-vmstate
-from old qemu.
-There're also cases where devices' ROM might be re-initialized.  E.g.
-this segfault occurs upon further exploration of RO mapped RAM blocks:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-664             rep     movsb
-[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))]
-(gdb) bt
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-#1  0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, 
-name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true)
-     at ../hw/core/loader.c:1032
-#2  0x000055aa1d031577 in rom_add_blob
-     (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152, 
-addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables", 
-fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0, 
-read_only=true) at ../hw/core/loader.c:1147
-#3  0x000055aa1cfd788d in acpi_add_rom_blob
-     (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, 
-blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46
-#4  0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720
-#5  0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) 
-at ../hw/i386/pc.c:638
-#6  0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 
-<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39
-#7  0x000055aa1d039ee5 in qdev_machine_creation_done () at 
-../hw/core/machine.c:1749
-#8  0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2779
-#9  0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2807
-#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at 
-../system/vl.c:3838
-#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at 
-../system/main.c:72
-I'm not sure whether ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where ROM can be read from file
-system upon initialization.  That is undesirable as guest kernel
-certainly won't be too happy about sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make ram blocks RO upon cpr
-(and remap them as RW later after migration is done, if needed), and
-catch any unwanted memory violations.  As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or
-"migrate" command flag specific to cpr-transfer), which would only be
-used in the testing.
-
-Andrey
-No way. ACPI with the source must be used in the same way as BIOSes
-and optional ROMs.
-
-Den
-
-On 3/6/2025 10:52 AM, Denis V. Lunev wrote:
-On 3/6/25 16:16, Andrey Drobyshev wrote:
-On 3/5/25 11:19 PM, Steven Sistare wrote:
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
-On 3/4/25 9:05 PM, Steven Sistare wrote:
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently
-and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga \
-        -incoming tcp:0:44444 \
-        -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-        migrate-set-parameters mode=cpr-transfer
-        migrate channels=[{"channel-type":"main","addr":
-{"transport":"socket","type":"inet","host":"0","port":"44444"}},
-{"channel-type":"cpr","addr":
-{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
-0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
-allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-
-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which
-speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-            echo "$(date) starting round $j"
-            if [ "$(journalctl --boot | grep "failed to allocate
-VRAM
-BO")" != "" ]; then
-                    echo "bug was reproduced after $j tries"
-                    exit 1
-            fi
-            for i in $(seq 100); do
-                    dmesg > /dev/tty3
-            done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce
-that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't
-lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest
-memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into
-this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-
-email-
-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers"
-patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 24 host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 27 host 0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 34 host 0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 35 host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 36 host 0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-37 host 0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 18 host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 16 host 0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 14 host 0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 13 host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 11 host 0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-10 host 0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the
-same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course.  However corresponding
-ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-    qemu_ram_alloc_internal()
-      if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-          ram_flags |= RAM_READONLY;
-      new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session.  To test, I specify
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
-dormant the dest vnc becomes active.
-Sure, I meant that VNC on the dest (on the port 1) works for a while
-after the migration and then hangs, apparently after the guest QXL crash.
-Could you try launching VM with
-"-nographic -device qxl-vga"?  That way VM's serial console is given you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver
-crash,
-and I suspect my guest image+kernel is too old.
-Yes, that's probably the case.  But the crash occurs on my Fedora 41
-guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to
-be buggy.
-However, once I realized the
-issue was post-cpr modification of qxl memory, I switched my attention
-to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
-(gdb) bt
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
-value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
-v=0x5638996f3770, name=0x56389759b141 "realized",
-opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:2374
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
-name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1449
-#7  0x00005638970f8586 in object_property_set_qobject
-(obj=0x5638996e0e70, name=0x56389759b141 "realized",
-value=0x5638996df900, errp=0x7ffd3c2b84e0)
-      at ../qom/qom-qobject.c:28
-#8  0x00005638970f3d8d in object_property_set_bool
-(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true,
-errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1519
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
-bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
-#10 0x0000563896dba675 in qdev_device_add_from_qdict
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../
-system/qdev-monitor.c:714
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
-errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0,
-opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/
-vl.c:1207
-#13 0x000056389737a6cc in qemu_opts_foreach
-      (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-      at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/
-vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
-at ../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../
-system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram
-are
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-Thanks, your v2 patch does seem to prevent the crash.  Would you re-send
-it to the list as a proper fix?
-Yes.  Was waiting for your confirmation.
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large
-memory region
-can be too expensive for production due to TLB shootdowns.
-
-Also, there are cases where writes are performed but the value is
-guaranteed to
-be the same:
-   qxl_post_load()
-     qxl_set_mode()
-       d->rom->mode = cpu_to_le32(modenr);
-The value is the same because mode and shadow_rom.mode were passed in
-vmstate
-from old qemu.
-There're also cases where devices' ROM might be re-initialized.  E.g.
-this segfault occurs upon further exploration of RO mapped RAM blocks:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-664             rep     movsb
-[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))]
-(gdb) bt
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-#1  0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, 
-name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true)
-     at ../hw/core/loader.c:1032
-#2  0x000055aa1d031577 in rom_add_blob
-     (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152, 
-addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables", 
-fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0, 
-read_only=true) at ../hw/core/loader.c:1147
-#3  0x000055aa1cfd788d in acpi_add_rom_blob
-     (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, 
-blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46
-#4  0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720
-#5  0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) 
-at ../hw/i386/pc.c:638
-#6  0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 
-<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39
-#7  0x000055aa1d039ee5 in qdev_machine_creation_done () at 
-../hw/core/machine.c:1749
-#8  0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2779
-#9  0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2807
-#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at 
-../system/vl.c:3838
-#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at 
-../system/main.c:72
-I'm not sure whether ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where ROM can be read from file
-system upon initialization.  That is undesirable as guest kernel
-certainly won't be too happy about sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make ram blocks RO upon cpr
-(and remap them as RW later after migration is done, if needed), and
-catch any unwanted memory violations.  As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or
-"migrate" command flag specific to cpr-transfer), which would only be
-used in the testing.
-I'll look into adding an option, but there may be too many false positives,
-such as the qxl_set_mode case above.  And the maintainers may object to me
-eliminating the false positives by adding more CPR_IN tests, due to gratuitous
-(from their POV) ugliness.
-
-But I will use the technique to look for more write violations.
-Andrey
-No way. ACPI with the source must be used in the same way as BIOSes
-and optional ROMs.
-Yup, it's a bug.  Will fix.
-
-- Steve
-
-see
-https://lore.kernel.org/qemu-devel/1741380954-341079-1-git-send-email-steven.sistare@oracle.com/
-- Steve
-
-On 3/6/2025 11:13 AM, Steven Sistare wrote:
-On 3/6/2025 10:52 AM, Denis V. Lunev wrote:
-On 3/6/25 16:16, Andrey Drobyshev wrote:
-On 3/5/25 11:19 PM, Steven Sistare wrote:
-On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
-On 3/4/25 9:05 PM, Steven Sistare wrote:
-On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
-On 2/28/25 8:20 PM, Steven Sistare wrote:
-On 2/28/2025 1:13 PM, Steven Sistare wrote:
-On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
-Hi all,
-
-We've been experimenting with cpr-transfer migration mode recently
-and
-have discovered the following issue with the guest QXL driver:
-
-Run migration source:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga
-Run migration target:
-EMULATOR=/path/to/emulator
-ROOTFS=/path/to/image
-QMPSOCK=/var/run/alma8qmp-dst.sock
-$EMULATOR -enable-kvm \
-        -machine q35 \
-        -cpu host -smp 2 -m 2G \
-        -object memory-backend-file,id=ram0,size=2G,mem-path=/
-dev/shm/
-ram0,share=on\
-        -machine memory-backend=ram0 \
-        -machine aux-ram-share=on \
-        -drive file=$ROOTFS,media=disk,if=virtio \
-        -qmp unix:$QMPSOCK,server=on,wait=off \
-        -nographic \
-        -device qxl-vga \
-        -incoming tcp:0:44444 \
-        -incoming '{"channel-type": "cpr", "addr": { "transport":
-"socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
-Launch the migration:
-QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
-QMPSOCK=/var/run/alma8qmp-src.sock
-
-$QMPSHELL -p $QMPSOCK <<EOF
-        migrate-set-parameters mode=cpr-transfer
-        migrate channels=[{"channel-type":"main","addr":
-{"transport":"socket","type":"inet","host":"0","port":"44444"}},
-{"channel-type":"cpr","addr":
-{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-
-dst.sock"}}]
-EOF
-Then, after a while, QXL guest driver on target crashes spewing the
-following messages:
-[   73.962002] [TTM] Buffer eviction failed
-[   73.962072] qxl 0000:00:02.0: object_init failed for (3149824,
-0x00000001)
-[   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to
-allocate VRAM BO
-That seems to be a known kernel QXL driver bug:
-https://lore.kernel.org/all/20220907094423.93581-1-
-min_halo@163.com/T/
-https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
-(the latter discussion contains that reproduce script which
-speeds up
-the crash in the guest):
-#!/bin/bash
-
-chvt 3
-
-for j in $(seq 80); do
-            echo "$(date) starting round $j"
-            if [ "$(journalctl --boot | grep "failed to allocate
-VRAM
-BO")" != "" ]; then
-                    echo "bug was reproduced after $j tries"
-                    exit 1
-            fi
-            for i in $(seq 100); do
-                    dmesg > /dev/tty3
-            done
-done
-
-echo "bug could not be reproduced"
-exit 0
-The bug itself seems to remain unfixed, as I was able to reproduce
-that
-with Fedora 41 guest, as well as AlmaLinux 8 guest. However our
-cpr-transfer code also seems to be buggy as it triggers the crash -
-without the cpr-transfer migration the above reproduce doesn't
-lead to
-crash on the source VM.
-
-I suspect that, as cpr-transfer doesn't migrate the guest
-memory, but
-rather passes it through the memory backend object, our code might
-somehow corrupt the VRAM.  However, I wasn't able to trace the
-corruption so far.
-
-Could somebody help the investigation and take a look into
-this?  Any
-suggestions would be appreciated.  Thanks!
-Possibly some memory region created by qxl is not being preserved.
-Try adding these traces to see what is preserved:
-
--trace enable='*cpr*'
--trace enable='*ram_alloc*'
-Also try adding this patch to see if it flags any ram blocks as not
-compatible with cpr.  A message is printed at migration start time.
-
-https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-
-email-
-steven.sistare@oracle.com/
-
-- Steve
-With the traces enabled + the "migration: ram block cpr blockers"
-patch
-applied:
-
-Source:
-cpr_find_fd pc.bios, id 0 returns -1
-cpr_save_fd pc.bios, id 0, fd 22
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host
-0x7fec18e00000
-cpr_find_fd pc.rom, id 0 returns -1
-cpr_save_fd pc.rom, id 0, fd 23
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host
-0x7fec18c00000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
-cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 24 host 0x7fec18a00000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 25 host 0x7feb77e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 27 host 0x7fec18800000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 28 host 0x7feb73c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
-cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 34 host 0x7fec18600000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 35 host 0x7fec18200000
-cpr_find_fd /rom@etc/table-loader, id 0 returns -1
-cpr_save_fd /rom@etc/table-loader, id 0, fd 36
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 36 host 0x7feb8b600000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
-cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-37 host 0x7feb8b400000
-
-cpr_state_save cpr-transfer mode
-cpr_transfer_output /var/run/alma8cpr-dst.sock
-Target:
-cpr_transfer_input /var/run/alma8cpr-dst.sock
-cpr_state_load cpr-transfer mode
-cpr_find_fd pc.bios, id 0 returns 20
-qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host
-0x7fcdc9800000
-cpr_find_fd pc.rom, id 0 returns 19
-qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host
-0x7fcdc9600000
-cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
-qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size
-262144 fd 18 host 0x7fcdc9400000
-cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
-qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size
-67108864 fd 17 host 0x7fcd27e00000
-cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192
-fd 16 host 0x7fcdc9200000
-cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
-qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size
-67108864 fd 15 host 0x7fcd23c00000
-cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
-qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536
-fd 14 host 0x7fcdc8800000
-cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
-qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size
-2097152 fd 13 host 0x7fcdc8400000
-cpr_find_fd /rom@etc/table-loader, id 0 returns 11
-qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536
-fd 11 host 0x7fcdc8200000
-cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
-qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd
-10 host 0x7fcd3be00000
-Looks like both vga.vram and qxl.vram are being preserved (with the
-same
-addresses), and no incompatible ram blocks are found during migration.
-Sorry, addresses are not the same, of course.  However corresponding
-ram
-blocks do seem to be preserved and initialized.
-So far, I have not reproduced the guest driver failure.
-
-However, I have isolated places where new QEMU improperly writes to
-the qxl memory regions prior to starting the guest, by mmap'ing them
-readonly after cpr:
-
-    qemu_ram_alloc_internal()
-      if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
-          ram_flags |= RAM_READONLY;
-      new_block = qemu_ram_alloc_from_fd(...)
-
-I have attached a draft fix; try it and let me know.
-My console window looks fine before and after cpr, using
--vnc $hostip:0 -vga qxl
-
-- Steve
-Regarding the reproduce: when I launch the buggy version with the same
-options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do cpr-transfer,
-my VNC client silently hangs on the target after a while.  Could it
-happen on your stand as well?
-cpr does not preserve the vnc connection and session.  To test, I specify
-port 0 for the source VM and port 1 for the dest.  When the src vnc goes
-dormant the dest vnc becomes active.
-Sure, I meant that VNC on the dest (on the port 1) works for a while
-after the migration and then hangs, apparently after the guest QXL crash.
-Could you try launching VM with
-"-nographic -device qxl-vga"?  That way VM's serial console is given you
-directly in the shell, so when qxl driver crashes you're still able to
-inspect the kernel messages.
-I have been running like that, but have not reproduced the qxl driver
-crash,
-and I suspect my guest image+kernel is too old.
-Yes, that's probably the case.  But the crash occurs on my Fedora 41
-guest with the 6.11.5-300.fc41.x86_64 kernel, so newer kernels seem to
-be buggy.
-However, once I realized the
-issue was post-cpr modification of qxl memory, I switched my attention
-to the
-fix.
-As for your patch, I can report that it doesn't resolve the issue as it
-is.  But I was able to track down another possible memory corruption
-using your approach with readonly mmap'ing:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
-[Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
-(gdb) bt
-#0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
-#1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70,
-errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
-#2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70,
-errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
-#3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70,
-errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
-#4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70,
-value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
-#5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70,
-v=0x5638996f3770, name=0x56389759b141 "realized",
-opaque=0x5638987893d0, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:2374
-#6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70,
-name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1449
-#7  0x00005638970f8586 in object_property_set_qobject
-(obj=0x5638996e0e70, name=0x56389759b141 "realized",
-value=0x5638996df900, errp=0x7ffd3c2b84e0)
-      at ../qom/qom-qobject.c:28
-#8  0x00005638970f3d8d in object_property_set_bool
-(obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true,
-errp=0x7ffd3c2b84e0)
-      at ../qom/object.c:1519
-#9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70,
-bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
-#10 0x0000563896dba675 in qdev_device_add_from_qdict
-(opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../
-system/qdev-monitor.c:714
-#11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150,
-errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
-#12 0x0000563896dc48f1 in device_init_func (opaque=0x0,
-opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/
-vl.c:1207
-#13 0x000056389737a6cc in qemu_opts_foreach
-      (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca
-<device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>)
-      at ../util/qemu-option.c:1135
-#14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/
-vl.c:2745
-#15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40
-<error_fatal>) at ../system/vl.c:2806
-#16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948)
-at ../system/vl.c:3838
-#17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../
-system/main.c:72
-So the attached adjusted version of your patch does seem to help.  At
-least I can't reproduce the crash on my stand.
-Thanks for the stack trace; the calls to SPICE_RING_INIT in init_qxl_ram
-are
-definitely harmful.  Try V2 of the patch, attached, which skips the lines
-of init_qxl_ram that modify guest memory.
-Thanks, your v2 patch does seem to prevent the crash.  Would you re-send
-it to the list as a proper fix?
-Yes.  Was waiting for your confirmation.
-I'm wondering, could it be useful to explicitly mark all the reused
-memory regions readonly upon cpr-transfer, and then make them writable
-back again after the migration is done?  That way we will be segfaulting
-early on instead of debugging tricky memory corruptions.
-It's a useful debugging technique, but changing protection on a large
-memory region
-can be too expensive for production due to TLB shootdowns.
-
-Also, there are cases where writes are performed but the value is
-guaranteed to
-be the same:
-   qxl_post_load()
-     qxl_set_mode()
-       d->rom->mode = cpu_to_le32(modenr);
-The value is the same because mode and shadow_rom.mode were passed in
-vmstate
-from old qemu.
-There're also cases where devices' ROM might be re-initialized.  E.g.
-this segfault occurs upon further exploration of RO mapped RAM blocks:
-Program terminated with signal SIGSEGV, Segmentation fault.
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-664             rep     movsb
-[Current thread is 1 (Thread 0x7f6e7d08b480 (LWP 310379))]
-(gdb) bt
-#0  __memmove_avx_unaligned_erms () at 
-../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:664
-#1  0x000055aa1d030ecd in rom_set_mr (rom=0x55aa200ba380, owner=0x55aa2019ac10, 
-name=0x7fffb8272bc0 "/rom@etc/acpi/tables", ro=true)
-     at ../hw/core/loader.c:1032
-#2  0x000055aa1d031577 in rom_add_blob
-     (name=0x55aa1da51f13 "etc/acpi/tables", blob=0x55aa208a1070, len=131072, max_len=2097152, 
-addr=18446744073709551615, fw_file_name=0x55aa1da51f13 "etc/acpi/tables", 
-fw_callback=0x55aa1d441f59 <acpi_build_update>, callback_opaque=0x55aa20ff0010, as=0x0, 
-read_only=true) at ../hw/core/loader.c:1147
-#3  0x000055aa1cfd788d in acpi_add_rom_blob
-     (update=0x55aa1d441f59 <acpi_build_update>, opaque=0x55aa20ff0010, 
-blob=0x55aa1fc9aa00, name=0x55aa1da51f13 "etc/acpi/tables") at ../hw/acpi/utils.c:46
-#4  0x000055aa1d44213f in acpi_setup () at ../hw/i386/acpi-build.c:2720
-#5  0x000055aa1d434199 in pc_machine_done (notifier=0x55aa1ff15050, data=0x0) 
-at ../hw/i386/pc.c:638
-#6  0x000055aa1d876845 in notifier_list_notify (list=0x55aa1ea25c10 
-<machine_init_done_notifiers>, data=0x0) at ../util/notify.c:39
-#7  0x000055aa1d039ee5 in qdev_machine_creation_done () at 
-../hw/core/machine.c:1749
-#8  0x000055aa1d2c7b3e in qemu_machine_creation_done (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2779
-#9  0x000055aa1d2c7c7d in qmp_x_exit_preconfig (errp=0x55aa1ea5cc40 
-<error_fatal>) at ../system/vl.c:2807
-#10 0x000055aa1d2ca64f in qemu_init (argc=35, argv=0x7fffb82730e8) at 
-../system/vl.c:3838
-#11 0x000055aa1d79638c in main (argc=35, argv=0x7fffb82730e8) at 
-../system/main.c:72
-I'm not sure whether ACPI tables ROM in particular is rewritten with the
-same content, but there might be cases where ROM can be read from file
-system upon initialization.  That is undesirable as guest kernel
-certainly won't be too happy about sudden change of the device's ROM
-content.
-
-So the issue we're dealing with here is any unwanted memory related
-device initialization upon cpr.
-
-For now the only thing that comes to my mind is to make a test where we
-put as many devices as we can into a VM, make ram blocks RO upon cpr
-(and remap them as RW later after migration is done, if needed), and
-catch any unwanted memory violations.  As Den suggested, we might
-consider adding that behaviour as a separate non-default option (or
-"migrate" command flag specific to cpr-transfer), which would only be
-used in the testing.
-I'll look into adding an option, but there may be too many false positives,
-such as the qxl_set_mode case above.  And the maintainers may object to me
-eliminating the false positives by adding more CPR_IN tests, due to gratuitous
-(from their POV) ugliness.
-
-But I will use the technique to look for more write violations.
-Andrey
-No way. ACPI with the source must be used in the same way as BIOSes
-and optional ROMs.
-Yup, it's a bug.  Will fix.
-
-- Steve
-
diff --git a/results/classifier/016/virtual/46572227 b/results/classifier/016/virtual/46572227
deleted file mode 100644
index a2aa74d5..00000000
--- a/results/classifier/016/virtual/46572227
+++ /dev/null
@@ -1,433 +0,0 @@
-virtual: 0.980
-x86: 0.924
-operating system: 0.179
-hypervisor: 0.141
-performance: 0.100
-debug: 0.091
-vnc: 0.088
-KVM: 0.065
-user-level: 0.060
-VMM: 0.025
-network: 0.025
-TCG: 0.021
-files: 0.015
-socket: 0.013
-boot: 0.009
-device: 0.009
-PID: 0.007
-permissions: 0.006
-assembly: 0.006
-register: 0.005
-kernel: 0.004
-semantic: 0.004
-architecture: 0.003
-alpha: 0.003
-peripherals: 0.003
-risc-v: 0.002
-graphic: 0.001
-ppc: 0.001
-i386: 0.001
-mistranslation: 0.000
-arm: 0.000
-
-[Qemu-devel] [Bug?] Windows 7's time drift obviously while RTC rate switching frequently between high and low timer rate
-
-Hi,
-
-We tested with the latest QEMU and found that the guest clock drifts noticeably
-(the clock runs fast in the guest)
-in a Windows 7 64-bit guest in some cases.
-
-It is easy to reproduce. Use the following QEMU command line to start Windows
-7:
-
-# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine 
-pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp 
-4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet 
--global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc 
-:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device 
-piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1 
--device usb-kbd,id=input2 -monitor stdio
-
-Adjust the VM's time to the host time, then run a Java application or the
-following program
-in Windows 7:
-
-#pragma comment(lib, "winmm")
-#include <stdio.h>
-#include <windows.h>
-
-#define SWITCH_PEROID  13
-
-int main()
-{
-        DWORD count = 0;
-
-        while (1)
-        {
-                count++;
-                timeBeginPeriod(1);
-                DWORD start = timeGetTime();
-                Sleep(40);
-                timeEndPeriod(1);
-                if ((count % SWITCH_PEROID) == 0) {
-                        Sleep(1);
-                }
-        }
-        return 0;
-}
-
-After a few minutes, you will find that the time in Windows 7 runs ahead of the
-host time, drifting by several seconds.
-
-I have dug deeper into this problem. For Windows systems that use the CMOS timer,
-the base interrupt rate is usually 64Hz, but some applications running in the VM
-raise the timer rate to 1024Hz; a Java application or the program above
-will do so.
-Besides, Windows operating systems generally keep time by counting timer
-interrupts (ticks), but QEMU does not seem to emulate the rate switching correctly.
-
-We update the timer in function periodic_timer_update():
-static void periodic_timer_update(RTCState *s, int64_t current_time)
-{
-
-        cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, get_ticks_per_sec());
-        next_irq_clock = (cur_clock & ~(period - 1)) + period;
-                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Here we calculate the next interrupt time by aligning the current clock to the
-new period. I'm a little confused about why we care about the *history* time.
-If the VM switches from the high rate to the low rate, the next interrupt may come
-earlier than it is supposed to. We have observed this in our test, where we printed the
-interval between interrupts and the VM's current time (obtained from the VM).
-
-Here is part of the log:
-... ...
-period=512 irq inject 1534: 15625 us
-Tue Mar 29 04:38:00 2016
-*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696 
-us
-... ...
-*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086 
-us
-Convert 32 --- > 512: 703: 96578 us
-period=512 irq inject 44391: 12702 us
-Convert 512 --- > 32: 704: 12704 us11
-period=32 irq inject 44392: 979 us
-... ...
-32 --- > 512: 705: 24388 us
-period=512 irq inject 44417: 6834 us
-Convert 512 --- > 32: 706: 6830 us
-period=32 irq inject 44418: 978 us
-... ...
-Convert 32 --- > 512: 707: 60525 us
-period=512 irq inject 44480: 1945 us
-Convert 512 --- > 32: 708: 1955 us
-period=32 irq inject 44481: 977 us
-... ...
-Convert 32 --- > 512: 709: 36105 us
-period=512 irq inject 44518: 10741 us
-Convert 512 --- > 32: 710: 10736 us
-period=32 irq inject 44519: 989 us
-... ...
-Convert 32 --- > 512: 711: 123998 us
-period=512 irq inject 44646: 974 us
-period=512 irq inject 44647: 15607 us
-Convert 512 --- > 32: 712: 16560 us
-period=32 irq inject 44648: 980 us
-... ...
-period=32 irq inject 44738: 974 us
-Convert 32 --- > 512: 713: 88828 us
-period=512 irq inject 44739: 4885 us
-Convert 512 --- > 32: 714: 4882 us
-period=32 irq inject 44740: 989 us
-... ...
-period=32 irq inject 44842: 974 us
-Convert 32 --- > 512: 715: 100537 us
-period=512 irq inject 44843: 8788 us
-Convert 512 --- > 32: 716: 8789 us
-period=32 irq inject 44844: 972 us
-... ...
-period=32 irq inject 44941: 979 us
-Convert 32 --- > 512: 717: 95677 us
-period=512 irq inject 44942: 13661 us
-Convert 512 --- > 32: 718: 13657 us
-period=32 irq inject 44943: 987 us
-... ...
-Convert 32 --- > 512: 719: 94690 us
-period=512 irq inject 45040: 14643 us
-Convert 512 --- > 32: 720: 14642 us
-period=32 irq inject 45041: 974 us
-... ...
-Convert 32 --- > 512: 721: 88848 us
-period=512 irq inject 45132: 4892 us
-Convert 512 --- > 32: 722: 4931 us
-period=32 irq inject 45133: 964 us
-... ...
-Tue Mar 29 04:39:19 2016
-*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is 
-911520 us
-
-Windows 7 received 835 IRQs injected while the period was 32,
-and 11 IRQs injected while the period was 512. It advanced the
-wall-clock
-time by one second, because it assumed it had counted
-(835*976.5+11*15625)= 987252.5 us, but the real interval was 911520 us.
-
-IMHO, we should calculate the next interrupt time based on the time of the last
-injected interrupt, which seems closer to how a hardware CMOS timer
-behaves.
-Maybe someone can tell me why we calculate the interrupt time
-in that way, or is it a bug ? ;)
-
-Thanks,
-Hailiang
-
-ping...
-
-It seems that we can eliminate the drift with the following patch.
-(I tested it for two hours and saw no drift; before the patch, the clock
-in Windows 7 drifted about 2 seconds per minute.) I'm not sure if it is
-the right way to solve the problem.
-Any comments are welcome. Thanks.
-
-From bd6acd577cbbc9d92d6376c770219470f184f7de Mon Sep 17 00:00:00 2001
-From: zhanghailiang <address@hidden>
-Date: Thu, 31 Mar 2016 16:36:15 -0400
-Subject: [PATCH] timer/mc146818rtc: fix timer drift in Windows OS while RTC
- rate converting frequently
-
-Signed-off-by: zhanghailiang <address@hidden>
----
- hw/timer/mc146818rtc.c | 25 ++++++++++++++++++++++---
- 1 file changed, 22 insertions(+), 3 deletions(-)
-
-diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
-index 2ac0fd3..e39d2da 100644
---- a/hw/timer/mc146818rtc.c
-+++ b/hw/timer/mc146818rtc.c
-@@ -79,6 +79,7 @@ typedef struct RTCState {
-     /* periodic timer */
-     QEMUTimer *periodic_timer;
-     int64_t next_periodic_time;
-+    uint64_t last_periodic_time;
-     /* update-ended timer */
-     QEMUTimer *update_timer;
-     uint64_t next_alarm_time;
-@@ -152,7 +153,8 @@ static void rtc_coalesced_timer(void *opaque)
- static void periodic_timer_update(RTCState *s, int64_t current_time)
- {
-     int period_code, period;
--    int64_t cur_clock, next_irq_clock;
-+    int64_t cur_clock, next_irq_clock, pre_irq_clock;
-+    bool change = false;
-
-     period_code = s->cmos_data[RTC_REG_A] & 0x0f;
-     if (period_code != 0
-@@ -165,14 +167,28 @@ static void periodic_timer_update(RTCState *s, int64_t 
-current_time)
-         if (period != s->period) {
-             s->irq_coalesced = (s->irq_coalesced * s->period) / period;
-             DPRINTF_C("cmos: coalesced irqs scaled to %d\n", s->irq_coalesced);
-+            if (s->period && period) {
-+                change = true;
-+            }
-         }
-         s->period = period;
- #endif
-         /* compute 32 khz clock */
-         cur_clock =
-             muldiv64(current_time, RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
-+        if (change) {
-+            int offset = 0;
-
--        next_irq_clock = (cur_clock & ~(period - 1)) + period;
-+            pre_irq_clock = muldiv64(s->last_periodic_time, RTC_CLOCK_RATE,
-+                                    NANOSECONDS_PER_SECOND);
-+            if ((cur_clock - pre_irq_clock) >  period) {
-+                offset =  (cur_clock - pre_irq_clock) / period;
-+            }
-+            s->irq_coalesced += offset;
-+            next_irq_clock = pre_irq_clock + (offset + 1) * period;
-+        } else {
-+            next_irq_clock = (cur_clock & ~(period - 1)) + period;
-+        }
-         s->next_periodic_time = muldiv64(next_irq_clock, 
-NANOSECONDS_PER_SECOND,
-                                          RTC_CLOCK_RATE) + 1;
-         timer_mod(s->periodic_timer, s->next_periodic_time);
-@@ -187,7 +203,9 @@ static void periodic_timer_update(RTCState *s, int64_t 
-current_time)
- static void rtc_periodic_timer(void *opaque)
- {
-     RTCState *s = opaque;
--
-+    int64_t next_periodic_time;
-+
-+    next_periodic_time = s->next_periodic_time;
-     periodic_timer_update(s, s->next_periodic_time);
-     s->cmos_data[RTC_REG_C] |= REG_C_PF;
-     if (s->cmos_data[RTC_REG_B] & REG_B_PIE) {
-@@ -204,6 +222,7 @@ static void rtc_periodic_timer(void *opaque)
-                 DPRINTF_C("cmos: coalesced irqs increased to %d\n",
-                           s->irq_coalesced);
-             }
-+            s->last_periodic_time = next_periodic_time;
-         } else
- #endif
-         qemu_irq_raise(s->irq);
---
-1.8.3.1
-
-
-On 2016/3/29 19:58, Hailiang Zhang wrote:
-Hi,
-
-We tested with the latest QEMU and found that the guest clock drifts noticeably
-(the clock runs fast in the guest)
-in a Windows 7 64-bit guest in some cases.
-
-It is easy to reproduce. Use the following QEMU command line to start Windows
-7:
-
-# x86_64-softmmu/qemu-system-x86_64 -name win7_64_2U_raw -machine 
-pc-i440fx-2.6,accel=kvm,usb=off -cpu host -m 2048 -realtime mlock=off -smp 
-4,sockets=2,cores=2,threads=1 -rtc base=utc,clock=vm,driftfix=slew -no-hpet 
--global kvm-pit.lost_tick_policy=discard -hda /mnt/nfs/win7_sp1_32_2U_raw -vnc 
-:11 -netdev tap,id=hn0,vhost=off -device rtl8139,id=net-pci0,netdev=hn0 -device 
-piix3-usb-uhci,id=usb -device usb-tablet,id=input0 -device usb-mouse,id=input1 
--device usb-kbd,id=input2 -monitor stdio
-
-Adjust the VM's time to the host time, then run a Java application or the
-following program
-in Windows 7:
-
-#pragma comment(lib, "winmm")
-#include <stdio.h>
-#include <windows.h>
-
-#define SWITCH_PEROID  13
-
-int main()
-{
-        DWORD count = 0;
-
-        while (1)
-        {
-                count++;
-                timeBeginPeriod(1);
-                DWORD start = timeGetTime();
-                Sleep(40);
-                timeEndPeriod(1);
-                if ((count % SWITCH_PEROID) == 0) {
-                        Sleep(1);
-                }
-        }
-        return 0;
-}
-
-After a few minutes, you will find that the time in Windows 7 runs ahead of the
-host time, drifting by several seconds.
-
-I have dug deeper into this problem. For Windows systems that use the CMOS timer,
-the base interrupt rate is usually 64Hz, but some applications running in the VM
-raise the timer rate to 1024Hz; a Java application or the program above
-will do so.
-Besides, Windows operating systems generally keep time by counting timer
-interrupts (ticks), but QEMU does not seem to emulate the rate switching correctly.
-
-We update the timer in function periodic_timer_update():
-static void periodic_timer_update(RTCState *s, int64_t current_time)
-{
-
-          cur_clock = muldiv64(current_time, RTC_CLOCK_RATE, 
-get_ticks_per_sec());
-          next_irq_clock = (cur_clock & ~(period - 1)) + period;
-                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Here we calculate the next interrupt time by aligning the current clock to the
-new period. I'm a little confused about why we care about the *history* time.
-If the VM switches from the high rate to the low rate, the next interrupt may come
-earlier than it is supposed to. We have observed this in our test, where we printed the
-interval between interrupts and the VM's current time (obtained from the VM).
-
-Here is part of the log:
-... ...
-period=512 irq inject 1534: 15625 us
-Tue Mar 29 04:38:00 2016
-*irq_num_period_32=0, irq_num_period_512=64: [3]: Real time interval is 999696 
-us
-... ...
-*irq_num_period_32=893, irq_num_period_512=9 [81]: Real time interval is 951086 
-us
-Convert 32 --- > 512: 703: 96578 us
-period=512 irq inject 44391: 12702 us
-Convert 512 --- > 32: 704: 12704 us11
-period=32 irq inject 44392: 979 us
-... ...
-32 --- > 512: 705: 24388 us
-period=512 irq inject 44417: 6834 us
-Convert 512 --- > 32: 706: 6830 us
-period=32 irq inject 44418: 978 us
-... ...
-Convert 32 --- > 512: 707: 60525 us
-period=512 irq inject 44480: 1945 us
-Convert 512 --- > 32: 708: 1955 us
-period=32 irq inject 44481: 977 us
-... ...
-Convert 32 --- > 512: 709: 36105 us
-period=512 irq inject 44518: 10741 us
-Convert 512 --- > 32: 710: 10736 us
-period=32 irq inject 44519: 989 us
-... ...
-Convert 32 --- > 512: 711: 123998 us
-period=512 irq inject 44646: 974 us
-period=512 irq inject 44647: 15607 us
-Convert 512 --- > 32: 712: 16560 us
-period=32 irq inject 44648: 980 us
-... ...
-period=32 irq inject 44738: 974 us
-Convert 32 --- > 512: 713: 88828 us
-period=512 irq inject 44739: 4885 us
-Convert 512 --- > 32: 714: 4882 us
-period=32 irq inject 44740: 989 us
-... ...
-period=32 irq inject 44842: 974 us
-Convert 32 --- > 512: 715: 100537 us
-period=512 irq inject 44843: 8788 us
-Convert 512 --- > 32: 716: 8789 us
-period=32 irq inject 44844: 972 us
-... ...
-period=32 irq inject 44941: 979 us
-Convert 32 --- > 512: 717: 95677 us
-period=512 irq inject 44942: 13661 us
-Convert 512 --- > 32: 718: 13657 us
-period=32 irq inject 44943: 987 us
-... ...
-Convert 32 --- > 512: 719: 94690 us
-period=512 irq inject 45040: 14643 us
-Convert 512 --- > 32: 720: 14642 us
-period=32 irq inject 45041: 974 us
-... ...
-Convert 32 --- > 512: 721: 88848 us
-period=512 irq inject 45132: 4892 us
-Convert 512 --- > 32: 722: 4931 us
-period=32 irq inject 45133: 964 us
-... ...
-Tue Mar 29 04:39:19 2016
-*irq_num_period_32:835, irq_num_period_512:11 [82], Real time interval is 
-911520 us
-
-Windows 7 received 835 IRQs injected while the period was 32,
-and 11 IRQs injected while the period was 512. It advanced the
-wall-clock
-time by one second, because it assumed it had counted
-(835*976.5+11*15625)= 987252.5 us, but the real interval was 911520 us.
-
-IMHO, we should calculate the next interrupt time based on the time of the last
-injected interrupt, which seems closer to how a hardware CMOS timer
-behaves.
-Maybe someone can tell me why we calculate the interrupt time
-in that way, or is it a bug ? ;)
-
-Thanks,
-Hailiang
-
diff --git a/results/classifier/016/virtual/53568181 b/results/classifier/016/virtual/53568181
deleted file mode 100644
index 1a1dafcd..00000000
--- a/results/classifier/016/virtual/53568181
+++ /dev/null
@@ -1,105 +0,0 @@
-virtual: 0.966
-x86: 0.963
-performance: 0.800
-KVM: 0.696
-kernel: 0.598
-debug: 0.238
-hypervisor: 0.107
-device: 0.097
-graphic: 0.058
-operating system: 0.053
-i386: 0.045
-files: 0.018
-user-level: 0.011
-register: 0.009
-TCG: 0.007
-architecture: 0.006
-semantic: 0.006
-PID: 0.005
-alpha: 0.005
-socket: 0.004
-peripherals: 0.004
-VMM: 0.003
-network: 0.003
-risc-v: 0.002
-permissions: 0.002
-boot: 0.002
-assembly: 0.001
-vnc: 0.001
-mistranslation: 0.001
-ppc: 0.000
-arm: 0.000
-
-[BUG] x86/PAT handling severely crippled AMD-V SVM KVM performance
-
-Hi, I maintain out-of-tree 3D API pass-through QEMU device models at
-https://github.com/kjliew/qemu-3dfx
-that provide 3D acceleration for legacy
-32-bit Windows guests (Win98SE, WinME, Win2k and WinXP) with the focus on
-playing old legacy games from 1996-2003. It currently supports the now-defunct
-3Dfx proprietary API called Glide and an alternative OpenGL pass-through based on
-the Mesa implementation.
-
-The basic concept of both implementations is to create memory-mapped virtual
-interfaces consisting of host/guest shared memory with a guest-push model instead of
-the more common host-pull model of a typical QEMU device model implementation.
-The guest uses shared memory as FIFOs for drawing commands and data to batch up the
-operations until a serialization event flushes the FIFOs to the host. This
-achieves extremely good performance since virtual CPUs are fast with hardware
-acceleration (Intel VT/AMD-V) and it reduces the overhead of frequent VMEXITs to
-service the device emulation. Both implementations work on Windows 10 with the WHPX
-and HAXM accelerators as well as with KVM on Linux.
-
-On Windows 10, the QEMU WHPX implementation does not sync MSR_IA32_PAT during
-host/guest state sync. There is no visibility into the closed-source WHPX on
-how things are managed behind the scenes, but from measuring performance figures
-I can conclude that it didn't handle MSR_IA32_PAT correctly for either Intel
-or AMD. Fair enough, if you will; it didn't raise any concerns, and in
-fact games such as Quake2 and Quake3 were still within a playable frame rate of
-40~60FPS on Win2k/XP guests. That held until the same games were run on Win98/ME guests and
-the frame rate went through the roof (300~500FPS) on the same CPU and GPU. In fact,
-the latter seemed to be more in line with running the games bare-metal with vsync
-off.
-
-On Linux (at the time of writing kernel 5.6.7/Mesa 20.0), the difference
-persisted. On Intel CPUs (I happened to be on a laptop with an Intel GPU),
-the VMX-based kvm_intel got it right while the SVM-based kvm_amd did not.
-To put it with some exaggeration, an aging Core i3-4010U/HD Graphics 4400
-(Haswell GT2) exhibited an insane performance in Quake2/Quake3 timedemos that
-totally crushed more recent AMD Ryzen 2500U APU/Vega 8 Graphics and AMD
-FX8300/NVIDIA GT730 on desktop. Simply unbelievable!
-
-It turned out to have something to do with AMD-V NPT. Loading kvm_amd
-with npt=0 gave the AMD Ryzen APU and FX8300 a huge performance boost. However,
-the AMD NPT issue with KVM was supposedly fixed by kernel commits in 2017, and npt=0 would
-actually incur a performance loss for the VM due to the intervention required of the
-hypervisor to maintain the shadow page tables.  Finally, I was able to find a
-pointer to the MSR_IA32_PAT register. With MSR_IA32_PAT updated to
-0x0606xxxx0606xxxxULL, AMD CPUs now regain their rightful performance without
-taking the hit of npt=0 on Linux KVM. With the same solution applied on Windows,
-neither Intel nor AMD CPUs require a Win98/ME guest any longer to unleash the full
-performance potential, and performance figures for games measured on WHPX
-were not very far behind Linux KVM.
-
-So I guess the problem lies in host/guest shared memory regions being mapped as
-uncacheable from the virtual CPU's perspective. As virtual CPUs now execute entirely
-in hardware context with x86 hardware virtualization extensions, the cacheability
-of memory types can severely impact guest performance. WHPX didn't
-handle it for either Intel EPT or AMD NPT, but KVM seems to do it right for Intel
-EPT. I don't have the correct fix for QEMU. But what I can do for my 3D APIs
-pass-through device models is to implement host-side hooks to reprogram and
-restore MSR_IA32_PAT upon activation/deactivation of the 3D APIs. Perhaps there
-is also a better solution: having proper kernel drivers for the virtual
-interfaces manage the memory types of host/guest shared memory in kernel
-space. But given the effort and the need for Microsoft tools/DDKs, I will just forget
-it. The guest stubs use the same kernel drivers included in the 3Dfx drivers for
-memory mapping and the virtual interfaces remain driver-less from the Windows OS
-perspective. Considering the current state of halting progress for QEMU native
-virgil3D to support Windows OS, I am just being pragmatic. I understand that
-QEMU virgil3D will eventually bring 3D acceleration for Windows guests, but I do
-not expect anything to support legacy 32-bit Windows OSes which have out-grown
-their commercial usefulness.
-
-Regards,
-KJ Liew
-
diff --git a/results/classifier/016/virtual/57231878 b/results/classifier/016/virtual/57231878
deleted file mode 100644
index f18786d1..00000000
--- a/results/classifier/016/virtual/57231878
+++ /dev/null
@@ -1,269 +0,0 @@
-virtual: 0.932
-x86: 0.926
-hypervisor: 0.360
-user-level: 0.314
-debug: 0.271
-operating system: 0.200
-TCG: 0.039
-kernel: 0.039
-PID: 0.025
-files: 0.025
-boot: 0.023
-register: 0.020
-network: 0.018
-VMM: 0.014
-socket: 0.013
-semantic: 0.012
-device: 0.012
-alpha: 0.006
-architecture: 0.005
-performance: 0.005
-KVM: 0.003
-risc-v: 0.003
-assembly: 0.003
-graphic: 0.002
-vnc: 0.002
-mistranslation: 0.001
-peripherals: 0.001
-permissions: 0.001
-ppc: 0.000
-i386: 0.000
-arm: 0.000
-
-[Qemu-devel] [BUG] qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed.
-
-Hello all,
-I wanted to submit a bug report in the tracker, but it seems to require
-an Ubuntu One account, which I'm having trouble with, so I'll just
-give it here and hopefully somebody can make use of it.  The issue
-seems to be in an experimental format, so it's likely not very
-consequential anyway.
-
-For the sake of anyone else simply googling for a workaround, I'll
-just paste in the (cleaned up) brief IRC conversation about my issue
-from the official channel:
-<quy> I'm using QEMU version 2.12.0 on an x86_64 host (Arch Linux,
-Kernel v4.17.2), and I'm trying to create an x86_64 virtual machine
-(FreeBSD-11.1).  The VM always aborts at the same point in the
-installation (downloading 'ports.tgz') with the following error
-message:
-"qemu-system-x86_64: /build/qemu/src/qemu-2.12.0/block/qed.c:1197:
-qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed.
-zsh: abort (core dumped)  qemu-system-x86_64 -smp 2 -m 4096
--enable-kvm -hda freebsd/freebsd.qed -devic"
-The commands I ran to create the machine are as follows:
-"qemu-img create -f qed freebsd/freebsd.qed 16G"
-"qemu-system-x86_64 -smp 2 -m 4096 -enable-kvm -hda
-freebsd/freebsd.qed -device e1000,netdev=net0 -netdev user,id=net0
--cdrom FreeBSD-11.1-RELEASE-amd64-bootonly.iso -boot order=d"
-I tried adding logging options with the -d flag, but I didn't get
-anything that seemed relevant, since I'm not sure what to look for.
-<stsquad> ohh what's a qed device?
-<stsquad> quy: it might be a workaround to use a qcow2 image for now
-<stsquad> ahh the wiki has a statement "It is not recommended to use
-QED for any new images. "
-<danpb> 'qed' was an experimental disk image format created by IBM
-before qcow2 v3 came along
-<danpb> honestly nothing should ever use  QED these days
-<danpb> the good ideas from QED became  qcow2v3
-<stsquad> danpb: sounds like we should put a warning on the option to
-remind users of that fact
-<danpb> quy: sounds like qed driver is simply broken - please do file
-a bug against qemu bug tracker
-<danpb> quy: but you should also really switch to qcow2
-<quy> I see; some people need to update their wikis then.  I don't
-remember where which guide I read when I first learned what little
-QEMU I know, but I remember it specifically remember it saying QED was
-the newest and most optimal format.
-<stsquad> quy: we can only be responsible for our own wiki I'm afraid...
-<danpb> if you remember where you saw that please let us know so we
-can try to get it fixed
-<quy> Thank you very much for the info; I will switch to QCOW.
-Unfortunately, I'm not sure if I will be able to file any bug reports
-in the tracker as I can't seem to log Launchpad, which it seems to
-require.
-<danpb> quy:  an email to the mailing list would suffice too if you
-can't deal with launchpad
-<danpb> kwolf: ^^^ in case you're interested in possible QED
-assertions from 2.12
-
-If any more info is needed, feel free to email me; I'm not actually
-subscribed to this list though.
-Thank you,
-Quytelda Kahja
-
-CC Qemu Block; looks like QED is a bit busted.
-
-On 06/27/2018 10:25 AM, Quytelda Kahja wrote:
->
-Hello all,
->
-I wanted to submit a bug report in the tracker, but it seem to require
->
-an Ubuntu One account, which I'm having trouble with, so I'll just
->
-give it here and hopefully somebody can make use of it.  The issue
->
-seems to be in an experimental format, so it's likely not very
->
-consequential anyway.
->
->
-For the sake of anyone else simply googling for a workaround, I'll
->
-just paste in the (cleaned up) brief IRC conversation about my issue
->
-from the official channel:
->
-<quy> I'm using QEMU version 2.12.0 on an x86_64 host (Arch Linux,
->
-Kernel v4.17.2), and I'm trying to create an x86_64 virtual machine
->
-(FreeBSD-11.1).  The VM always aborts at the same point in the
->
-installation (downloading 'ports.tgz') with the following error
->
-message:
->
-"qemu-system-x86_64: /build/qemu/src/qemu-2.12.0/block/qed.c:1197:
->
-qed_aio_write_alloc: Assertion `s->allocating_acb == NULL' failed.
->
-zsh: abort (core dumped)  qemu-system-x86_64 -smp 2 -m 4096
->
--enable-kvm -hda freebsd/freebsd.qed -devic"
->
-The commands I ran to create the machine are as follows:
->
-"qemu-img create -f qed freebsd/freebsd.qed 16G"
->
-"qemu-system-x86_64 -smp 2 -m 4096 -enable-kvm -hda
->
-freebsd/freebsd.qed -device e1000,netdev=net0 -netdev user,id=net0
->
--cdrom FreeBSD-11.1-RELEASE-amd64-bootonly.iso -boot order=d"
->
-I tried adding logging options with the -d flag, but I didn't get
->
-anything that seemed relevant, since I'm not sure what to look for.
->
-<stsquad> ohh what's a qed device?
->
-<stsquad> quy: it might be a workaround to use a qcow2 image for now
->
-<stsquad> ahh the wiki has a statement "It is not recommended to use
->
-QED for any new images. "
->
-<danpb> 'qed' was an experimental disk image format created by IBM
->
-before qcow2 v3 came along
->
-<danpb> honestly nothing should ever use  QED these days
->
-<danpb> the good ideas from QED became  qcow2v3
->
-<stsquad> danpb: sounds like we should put a warning on the option to
->
-remind users of that fact
->
-<danpb> quy: sounds like qed driver is simply broken - please do file
->
-a bug against qemu bug tracker
->
-<danpb> quy: but you should also really switch to qcow2
->
-<quy> I see; some people need to update their wikis then.  I don't
->
-remember where which guide I read when I first learned what little
->
-QEMU I know, but I remember it specifically remember it saying QED was
->
-the newest and most optimal format.
->
-<stsquad> quy: we can only be responsible for our own wiki I'm afraid...
->
-<danpb> if you remember where you saw that please let us know so we
->
-can try to get it fixed
->
-<quy> Thank you very much for the info; I will switch to QCOW.
->
-Unfortunately, I'm not sure if I will be able to file any bug reports
->
-in the tracker as I can't seem to log Launchpad, which it seems to
->
-require.
->
-<danpb> quy:  an email to the mailing list would suffice too if you
->
-can't deal with launchpad
->
-<danpb> kwolf: ^^^ in case you're interested in possible QED
->
-assertions from 2.12
->
->
-If any more info is needed, feel free to email me; I'm not actually
->
-subscribed to this list though.
->
-Thank you,
->
-Quytelda Kahja
->
-
-On 06/29/2018 03:07 PM, John Snow wrote:
-CC Qemu Block; looks like QED is a bit busted.
-
-On 06/27/2018 10:25 AM, Quytelda Kahja wrote:
-Hello all,
-I wanted to submit a bug report in the tracker, but it seem to require
-an Ubuntu One account, which I'm having trouble with, so I'll just
-give it here and hopefully somebody can make use of it.  The issue
-seems to be in an experimental format, so it's likely not very
-consequential anyway.
-Analysis in another thread may be relevant:
-https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08963.html
---
-Eric Blake, Principal Software Engineer
-Red Hat, Inc.           +1-919-301-3266
-Virtualization:  qemu.org | libvirt.org
-
-Am 29.06.2018 um 22:16 hat Eric Blake geschrieben:
->
-On 06/29/2018 03:07 PM, John Snow wrote:
->
-> CC Qemu Block; looks like QED is a bit busted.
->
->
->
-> On 06/27/2018 10:25 AM, Quytelda Kahja wrote:
->
-> > Hello all,
->
-> > I wanted to submit a bug report in the tracker, but it seem to require
->
-> > an Ubuntu One account, which I'm having trouble with, so I'll just
->
-> > give it here and hopefully somebody can make use of it.  The issue
->
-> > seems to be in an experimental format, so it's likely not very
->
-> > consequential anyway.
->
->
-Analysis in another thread may be relevant:
->
->
-https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08963.html
-The assertion there was:
-
-qemu-system-x86_64: block.c:3434: bdrv_replace_node: Assertion 
-`!atomic_read(&to->in_flight)' failed.
-
-Which quite clearly pointed to a drain bug. This one, however, doesn't
-seem to be related to drain, so I think it's probably a different bug.
-
-Kevin
-
diff --git a/results/classifier/016/virtual/67821138 b/results/classifier/016/virtual/67821138
deleted file mode 100644
index 62ec4fff..00000000
--- a/results/classifier/016/virtual/67821138
+++ /dev/null
@@ -1,226 +0,0 @@
-virtual: 0.897
-hypervisor: 0.505
-debug: 0.461
-PID: 0.283
-operating system: 0.187
-KVM: 0.099
-kernel: 0.073
-VMM: 0.070
-TCG: 0.049
-register: 0.044
-x86: 0.036
-permissions: 0.032
-files: 0.027
-risc-v: 0.025
-device: 0.017
-user-level: 0.014
-i386: 0.013
-alpha: 0.013
-socket: 0.010
-assembly: 0.009
-network: 0.007
-ppc: 0.007
-architecture: 0.006
-vnc: 0.006
-semantic: 0.005
-arm: 0.004
-graphic: 0.004
-performance: 0.004
-peripherals: 0.002
-boot: 0.001
-mistranslation: 0.000
-
-[BUG, RFC] Base node is in RW after making external snapshot
-
-Hi everyone,
-
-When making an external snapshot, we end up in a situation when 2 block
-graph nodes related to the same image file (format and storage nodes)
-have different RO flags set on them.
-
-E.g.
-
-# ls -la /proc/PID/fd
-lrwx------ 1 root qemu 64 Apr 24 20:14 12 -> /path/to/harddisk.hdd
-
-# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
---pretty | egrep '"node-name"|"ro"'
-      "ro": false,
-      "node-name": "libvirt-1-format",
-      "ro": false,
-      "node-name": "libvirt-1-storage",
-
-# virsh snapshot-create-as VM --name snap --disk-only
-Domain snapshot snap created
-
-# ls -la /proc/PID/fd
-lr-x------ 1 root qemu 64 Apr 24 20:14 134 -> /path/to/harddisk.hdd
-lrwx------ 1 root qemu 64 Apr 24 20:14 135 -> /path/to/harddisk.snap
-
-# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
---pretty | egrep '"node-name"|"ro"'
-      "ro": false,
-      "node-name": "libvirt-2-format",
-      "ro": false,
-      "node-name": "libvirt-2-storage",
-      "ro": true,
-      "node-name": "libvirt-1-format",
-      "ro": false,                        <--------------
-      "node-name": "libvirt-1-storage",
-
-File descriptor has been reopened in RO, but "libvirt-1-storage" node
-still has RW permissions set.
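The manual `egrep` comparison above could be automated along these lines. This is a minimal sketch under stated assumptions: the function name `ro_mismatches` is hypothetical, the node names are assumed to follow the `libvirt-N-format` / `libvirt-N-storage` pattern seen in the output, and the JSON below is a trimmed reconstruction containing only the two fields shown in the report.

```python
import json
import re


def ro_mismatches(nodes):
    """Return (index, format_ro, storage_ro) for each libvirt-N pair whose "ro" flags differ."""
    flags = {}
    for node in nodes:
        m = re.fullmatch(r"libvirt-(\d+)-(format|storage)", node.get("node-name", ""))
        if m:
            flags.setdefault(int(m.group(1)), {})[m.group(2)] = node["ro"]
    return [
        (idx, pair["format"], pair["storage"])
        for idx, pair in sorted(flags.items())
        if len(pair) == 2 and pair["format"] != pair["storage"]
    ]


# Values copied from the post-snapshot output in the report; all other
# fields of the real query-named-block-nodes reply are omitted here.
reply = json.loads("""
{"return": [
  {"ro": false, "node-name": "libvirt-2-format"},
  {"ro": false, "node-name": "libvirt-2-storage"},
  {"ro": true,  "node-name": "libvirt-1-format"},
  {"ro": false, "node-name": "libvirt-1-storage"}
]}
""")

print(ro_mismatches(reply["return"]))  # [(1, True, False)]
```

Only pair 1 is reported, which matches the line marked with the arrow in the output.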
-
-I'm wondering whether this is a bug or intended behavior.  It looks like
-a bug to me, although I see that some iotests (e.g. 273) expect 2 nodes
-related to the same image file to have different RO flags.
-
-bdrv_reopen_set_read_only()
-  bdrv_reopen()
-    bdrv_reopen_queue()
-      bdrv_reopen_queue_child()
-    bdrv_reopen_multiple()
-      bdrv_list_refresh_perms()
-        bdrv_topological_dfs()
-        bdrv_do_refresh_perms()
-      bdrv_reopen_commit()
-
-In the stack above, bdrv_reopen_set_read_only() is only called for the
-parent (libvirt-1-format) node.  There are 2 lists.  BDSs from
-refresh_list are used by bdrv_drv_set_perm, and this leads to the actual
-RO reopen of the file descriptor.  Then there is the reopen queue,
-bs_queue: BDSs from this queue get their parameters updated.  While
-refresh_list ends up holding the whole subtree (including children; this
-is done in bdrv_topological_dfs()), bs_queue only has the parent.  That
-is because the storage (child) node has bs->inherits_from == NULL, so
-bdrv_reopen_queue_child() never adds it to the queue.  Could this be the
-source of the bug?
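The difference between the two lists described above can be illustrated with a toy model. This is a sketch of the reasoning in this message only, not QEMU code: the `Node` class and the functions `refresh_list` and `reopen_queue` are hypothetical stand-ins for the behaviour attributed to bdrv_topological_dfs() and bdrv_reopen_queue_child(), and the traversal order is not meant to match QEMU's.

```python
class Node:
    """Toy stand-in for a BlockDriverState (hypothetical, not the QEMU struct)."""

    def __init__(self, name, children=(), inherits_from=None):
        self.name = name
        self.children = list(children)
        self.inherits_from = inherits_from


def refresh_list(root):
    """Collect the whole subtree, as the message says the DFS does."""
    names = [root.name]
    for child in root.children:
        names.extend(refresh_list(child))
    return names


def reopen_queue(root):
    """Queue a child only if it inherited its options from this parent."""
    names = [root.name]
    for child in root.children:
        if child.inherits_from is root:
            names.extend(reopen_queue(child))
    return names


# The storage node has inherits_from == None, as observed in the report.
storage = Node("libvirt-1-storage")
fmt = Node("libvirt-1-format", children=[storage])

print(refresh_list(fmt))  # ['libvirt-1-format', 'libvirt-1-storage']
print(reopen_queue(fmt))  # ['libvirt-1-format']
```

In this model the storage node gets its permissions refreshed but never has its own options updated, which is the asymmetry the message asks about.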
-
-Anyway, would greatly appreciate a clarification.
-
-Andrey
-
-On 4/24/24 21:00, Andrey Drobyshev wrote:
->
-Hi everyone,
->
->
-When making an external snapshot, we end up in a situation when 2 block
->
-graph nodes related to the same image file (format and storage nodes)
->
-have different RO flags set on them.
->
->
-E.g.
->
->
-# ls -la /proc/PID/fd
->
-lrwx------ 1 root qemu 64 Apr 24 20:14 12 -> /path/to/harddisk.hdd
->
->
-# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
->
---pretty | egrep '"node-name"|"ro"'
->
-"ro": false,
->
-"node-name": "libvirt-1-format",
->
-"ro": false,
->
-"node-name": "libvirt-1-storage",
->
->
-# virsh snapshot-create-as VM --name snap --disk-only
->
-Domain snapshot snap created
->
->
-# ls -la /proc/PID/fd
->
-lr-x------ 1 root qemu 64 Apr 24 20:14 134 -> /path/to/harddisk.hdd
->
-lrwx------ 1 root qemu 64 Apr 24 20:14 135 -> /path/to/harddisk.snap
->
->
-# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
->
---pretty | egrep '"node-name"|"ro"'
->
-"ro": false,
->
-"node-name": "libvirt-2-format",
->
-"ro": false,
->
-"node-name": "libvirt-2-storage",
->
-"ro": true,
->
-"node-name": "libvirt-1-format",
->
-"ro": false,                        <--------------
->
-"node-name": "libvirt-1-storage",
->
->
-File descriptor has been reopened in RO, but "libvirt-1-storage" node
->
-still has RW permissions set.
->
->
-I'm wondering it this a bug or this is intended?  Looks like a bug to
->
-me, although I see that some iotests (e.g. 273) expect 2 nodes related
->
-to the same image file to have different RO flags.
->
->
-bdrv_reopen_set_read_only()
->
-bdrv_reopen()
->
-bdrv_reopen_queue()
->
-bdrv_reopen_queue_child()
->
-bdrv_reopen_multiple()
->
-bdrv_list_refresh_perms()
->
-bdrv_topological_dfs()
->
-bdrv_do_refresh_perms()
->
-bdrv_reopen_commit()
->
->
-In the stack above bdrv_reopen_set_read_only() is only being called for
->
-the parent (libvirt-1-format) node.  There're 2 lists: BDSs from
->
-refresh_list are used by bdrv_drv_set_perm and this leads to actual
->
-reopen with RO of the file descriptor.  And then there's reopen queue
->
-bs_queue -- BDSs from this queue get their parameters updated.  While
->
-refresh_list ends up having the whole subtree (including children, this
->
-is done in bdrv_topological_dfs()) bs_queue only has the parent.  And
->
-that is because storage (child) node's (bs->inherits_from == NULL), so
->
-bdrv_reopen_queue_child() never adds it to the queue.  Could it be the
->
-source of this bug?
->
->
-Anyway, would greatly appreciate a clarification.
->
->
-Andrey
-Friendly ping.  Could somebody confirm that it is a bug indeed?
-
diff --git a/results/classifier/016/virtual/70021271 b/results/classifier/016/virtual/70021271
deleted file mode 100644
index 5563eb0f..00000000
--- a/results/classifier/016/virtual/70021271
+++ /dev/null
@@ -1,7475 +0,0 @@
-virtual: 0.943
-debug: 0.870
-x86: 0.251
-hypervisor: 0.249
-device: 0.054
-files: 0.053
-kernel: 0.045
-register: 0.044
-TCG: 0.040
-PID: 0.037
-i386: 0.035
-operating system: 0.032
-VMM: 0.030
-ppc: 0.022
-assembly: 0.022
-KVM: 0.017
-user-level: 0.014
-peripherals: 0.014
-performance: 0.014
-boot: 0.009
-risc-v: 0.009
-network: 0.007
-arm: 0.007
-architecture: 0.005
-semantic: 0.005
-socket: 0.005
-permissions: 0.004
-alpha: 0.003
-graphic: 0.003
-vnc: 0.002
-mistranslation: 0.001
-
-[Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug
-
-Hi all,
-
-In our test, we configured a VM with several pci-bridges and a virtio-net
-NIC attached to bus 4.
-After the VM starts up, we ping this NIC from the host to check that it
-is working normally. Then we hot-add PCI devices to this VM on bus 0.
-We found that the virtio-net NIC on bus 4 occasionally stops working
-(cannot connect), because kicking the virtio backend fails with the error below:
-    Unassigned mem write 00000000fc803004 = 0x1
-
-memory-region: pci_bridge_pci
-  0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
-    00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
-      00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common
-      00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
-      00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device
-      00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify  <- io 
-mem unassigned
-      …
-
-We caught an abnormal address change when this problem happened, as shown
-below:
-Before pci_bridge_update_mappings:
-      00000000fc000000-00000000fc1fffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fc000000-00000000fc1fffff
-      00000000fc200000-00000000fc3fffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fc200000-00000000fc3fffff
-      00000000fc400000-00000000fc5fffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fc400000-00000000fc5fffff
-      00000000fc600000-00000000fc7fffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fc600000-00000000fc7fffff
-      00000000fc800000-00000000fc9fffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fc800000-00000000fc9fffff <- correct Address Space
-      00000000fca00000-00000000fcbfffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fca00000-00000000fcbfffff
-      00000000fcc00000-00000000fcdfffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fcc00000-00000000fcdfffff
-      00000000fce00000-00000000fcffffff (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci 00000000fce00000-00000000fcffffff
-
-After pci_bridge_update_mappings:
-      00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fda00000-00000000fdbfffff
-      00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fdc00000-00000000fddfffff
-      00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fde00000-00000000fdffffff
-      00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fe000000-00000000fe1fffff
-      00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fe200000-00000000fe3fffff
-      00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fe400000-00000000fe5fffff
-      00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fe600000-00000000fe7fffff
-      00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem 
-@pci_bridge_pci 00000000fe800000-00000000fe9fffff
-      fffffffffc800000-fffffffffc800000 (prio 1, RW): alias pci_bridge_pref_mem 
-@pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional Address Space
-
-We have figured out why the address takes this value. According to the PCI
-spec, a PCI driver can get the size of a BAR by first writing 0xffffffff to
-the PCI register and then reading the value back from that register.
-QEMU does not handle this value specially while processing the PCI write;
-the function call stack is:
-pci_bridge_dev_write_config
--> pci_bridge_write_config
--> pci_default_write_config (the config[address] value is updated here to
-   fffffffffc800000, when it should be 0xfc800000)
--> pci_bridge_update_mappings
-     -> pci_bridge_region_del(br, br->windows);
-     -> pci_bridge_region_init
-          -> pci_bridge_init_alias (pci_bridge_get_base returns the wrong
-             value fffffffffc800000 here)
-     -> memory_region_transaction_commit
-
-So, as we can see, QEMU uses the wrong base address to update the memory
-regions. Although the base address is restored to the correct value once
-the PCI driver in the VM writes the original value back, the virtio NIC on
-bus 4 may still send network packets concurrently, while the wrong memory
-region address is in place.
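How a sizing write can produce the base fffffffffc800000 may be easier to see with a small calculation. This is a sketch under assumptions the message does not state: that the bridge's prefetchable window is 64-bit, that its 16-bit base register holds 0xfc81, and that the 0xffffffff write lands in the upper-32-bits base register. The function `pref_base` is hypothetical, and the constants are defined locally to mirror the usual PCI register layout.

```python
PCI_PREF_RANGE_MASK = 0xFFF0    # bits 15:4 of the base register hold address bits 31:20
PCI_PREF_RANGE_TYPE_64 = 0x1    # low-nibble value marking a 64-bit prefetchable window


def pref_base(base_reg16, upper32):
    """Combine the 16-bit base register and the upper-32 register into one address."""
    base = (base_reg16 & PCI_PREF_RANGE_MASK) << 16
    if (base_reg16 & 0xF) == PCI_PREF_RANGE_TYPE_64:
        base |= upper32 << 32
    return base


# Assumed register contents: 0xfc81 encodes base 0xfc800000 with the 64-bit type bit set.
normal = pref_base(0xFC81, 0x00000000)
during_sizing = pref_base(0xFC81, 0xFFFFFFFF)

print(f"{normal:016x}")         # 00000000fc800000
print(f"{during_sizing:016x}")  # fffffffffc800000
```

Under these assumptions, the window sits at the bogus address only between the all-ones write and the guest writing the original value back, which is the interval in which the NIC's notify write would hit unassigned memory.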
-
-We have tried skipping the memory region update in QEMU when a PCI write
-with the value 0xffffffff is detected, and it does work, but
-this does not seem like an elegant fix.
-
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
-index b2e50c3..84b405d 100644
---- a/hw/pci/pci_bridge.c
-+++ b/hw/pci/pci_bridge.c
-@@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
-     pci_default_write_config(d, address, val, len);
--    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
-+    if ( (val != 0xffffffff) &&
-+        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
-         /* io base/limit */
-         ranges_overlap(address, len, PCI_IO_BASE, 2) ||
-@@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
-         ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
-         /* vga enable */
--        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
-+        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
-         pci_bridge_update_mappings(s);
-     }
-
-Thanks,
-Xu
-
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-Hi all,
->
->
->
->
-In our test, we configured VM with several pci-bridges and a virtio-net nic
->
-been attached with bus 4,
->
->
-After VM is startup, We ping this nic from host to judge if it is working
->
-normally. Then, we hot add pci devices to this VM with bus 0.
->
->
-We  found the virtio-net NIC in bus 4 is not working (can not connect)
->
-occasionally, as it kick virtio backend failure with error below:
->
->
-Unassigned mem write 00000000fc803004 = 0x1
-Thanks for the report. Which guest was used to produce this problem?
-
--- 
-MST
-
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> Hi all,
->
->
->
->
->
->
->
-> In our test, we configured VM with several pci-bridges and a
->
-> virtio-net nic been attached with bus 4,
->
->
->
-> After VM is startup, We ping this nic from host to judge if it is
->
-> working normally. Then, we hot add pci devices to this VM with bus 0.
->
->
->
-> We  found the virtio-net NIC in bus 4 is not working (can not connect)
->
-> occasionally, as it kick virtio backend failure with error below:
->
->
->
->     Unassigned mem write 00000000fc803004 = 0x1
->
->
-Thanks for the report. Which guest was used to produce this problem?
->
->
---
->
-MST
-I first saw this problem when I hot-plugged a VFIO device into a CentOS 7.4
-guest; after that I compiled the latest Linux kernel, and it shows the same problem.
-
-Thanks,
-Xu
-
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-Hi all,
->
->
->
->
-In our test, we configured VM with several pci-bridges and a virtio-net nic
->
-been attached with bus 4,
->
->
-After VM is startup, We ping this nic from host to judge if it is working
->
-normally. Then, we hot add pci devices to this VM with bus 0.
->
->
-We  found the virtio-net NIC in bus 4 is not working (can not connect)
->
-occasionally, as it kick virtio backend failure with error below:
->
->
-Unassigned mem write 00000000fc803004 = 0x1
->
->
->
->
-memory-region: pci_bridge_pci
->
->
-0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
->
->
-00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
->
-00000000fc800000-00000000fc800fff (prio 0, RW): virtio-pci-common
->
->
-00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
->
->
-00000000fc802000-00000000fc802fff (prio 0, RW): virtio-pci-device
->
->
-00000000fc803000-00000000fc803fff (prio 0, RW): virtio-pci-notify  <- io
->
-mem unassigned
->
->
-…
->
->
->
->
-We caught an exceptional address changing while this problem happened, show as
->
-follow:
->
->
-Before pci_bridge_update_mappings:
->
->
-00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fc000000-00000000fc1fffff
->
->
-00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fc200000-00000000fc3fffff
->
->
-00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fc400000-00000000fc5fffff
->
->
-00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fc600000-00000000fc7fffff
->
->
-00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fc800000-00000000fc9fffff <- correct Adress Spce
->
->
-00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fca00000-00000000fcbfffff
->
->
-00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fcc00000-00000000fcdfffff
->
->
-00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci 00000000fce00000-00000000fcffffff
->
->
->
->
-After pci_bridge_update_mappings:
->
->
-00000000fda00000-00000000fdbfffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fda00000-00000000fdbfffff
->
->
-00000000fdc00000-00000000fddfffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fdc00000-00000000fddfffff
->
->
-00000000fde00000-00000000fdffffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fde00000-00000000fdffffff
->
->
-00000000fe000000-00000000fe1fffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fe000000-00000000fe1fffff
->
->
-00000000fe200000-00000000fe3fffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fe200000-00000000fe3fffff
->
->
-00000000fe400000-00000000fe5fffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fe400000-00000000fe5fffff
->
->
-00000000fe600000-00000000fe7fffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fe600000-00000000fe7fffff
->
->
-00000000fe800000-00000000fe9fffff (prio 1, RW): alias pci_bridge_mem
->
-@pci_bridge_pci 00000000fe800000-00000000fe9fffff
->
->
-fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-@pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional Adress
->
-Space
-This one is empty though right?
-
->
->
->
-We have figured out why this address becomes this value,  according to pci
->
-spec,  pci driver can get BAR address size by writing 0xffffffff to
->
->
-the pci register firstly, and then read back the value from this register.
-OK, however, as you show below, the BAR being sized is the BAR
-of a bridge. Are you then adding a bridge device by hotplug?
-
-
-
->
-We didn't handle this value  specially while process pci write in qemu, the
->
-function call stack is:
->
->
-Pci_bridge_dev_write_config
->
->
--> pci_bridge_write_config
->
->
--> pci_default_write_config (we update the config[address] value here to
->
-fffffffffc800000, which should be 0xfc800000 )
->
->
--> pci_bridge_update_mappings
->
->
-->pci_bridge_region_del(br, br->windows);
->
->
--> pci_bridge_region_init
->
->
-->
->
-pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong value
->
-fffffffffc800000)
->
->
-->
->
-memory_region_transaction_commit
->
->
->
->
-So, as we can see, we use the wrong base address in qemu to update the memory
->
-regions, though, we update the base address to
->
->
-The correct value after pci driver in VM write the original value back, the
->
-virtio NIC in bus 4 may still sends net packets concurrently with
->
->
-The wrong memory region address.
->
->
->
->
-We have tried to skip the memory region update action in qemu while detect pci
->
-write with 0xffffffff value, and it does work, but
->
->
-This seems to be not gently.
-For sure. But I'm still puzzled as to why Linux tries to
-size the BAR of the bridge while a device behind it is
-in use.
-
-Can you please post your QEMU command line?
-
-
-
->
->
->
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
->
->
-index b2e50c3..84b405d 100644
->
->
---- a/hw/pci/pci_bridge.c
->
->
-+++ b/hw/pci/pci_bridge.c
->
->
-@@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
->
->
-pci_default_write_config(d, address, val, len);
->
->
--    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
-+    if ( (val != 0xffffffff) &&
->
->
-+        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
-/* io base/limit */
->
->
-ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
->
-@@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
->
->
-ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
->
-/* vga enable */
->
->
--        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
->
-+        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
->
->
-pci_bridge_update_mappings(s);
->
->
-}
->
->
->
->
-Thinks,
->
->
-Xu
->
-
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> Hi all,
->
->
->
->
->
->
->
-> In our test, we configured VM with several pci-bridges and a
->
-> virtio-net nic been attached with bus 4,
->
->
->
-> After VM is startup, We ping this nic from host to judge if it is
->
-> working normally. Then, we hot add pci devices to this VM with bus 0.
->
->
->
-> We  found the virtio-net NIC in bus 4 is not working (can not connect)
->
-> occasionally, as it kick virtio backend failure with error below:
->
->
->
->     Unassigned mem write 00000000fc803004 = 0x1
->
->
->
->
->
->
->
-> memory-region: pci_bridge_pci
->
->
->
->   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
->
->
->
->     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
->
->
->       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> virtio-pci-common
->
->
->
->       00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
->
->
->
->       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> virtio-pci-device
->
->
->
->       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> virtio-pci-notify  <- io mem unassigned
->
->
->
->       …
->
->
->
->
->
->
->
-> We caught an exceptional address changing while this problem happened,
->
-> show as
->
-> follow:
->
->
->
-> Before pci_bridge_update_mappings:
->
->
->
->       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fc000000-00000000fc1fffff
->
->
->
->       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fc200000-00000000fc3fffff
->
->
->
->       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fc400000-00000000fc5fffff
->
->
->
->       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fc600000-00000000fc7fffff
->
->
->
->       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fc800000-00000000fc9fffff
->
-> <- correct Adress Spce
->
->
->
->       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fca00000-00000000fcbfffff
->
->
->
->       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fcc00000-00000000fcdfffff
->
->
->
->       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> pci_bridge_pref_mem @pci_bridge_pci 00000000fce00000-00000000fcffffff
->
->
->
->
->
->
->
-> After pci_bridge_update_mappings:
->
->
->
->       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
->
->
->
->       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
->
->
->
->       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
->
->
->
->       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
->
->
->
->       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
->
->
->
->       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
->
->
->
->       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
->
->
->
->       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
->
->
->
->       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-> pci_bridge_pref_mem
->
-> @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional Adress
->
-Space
->
->
-This one is empty though right?
->
->
->
->
->
->
-> We have figured out why this address becomes this value,  according to
->
-> pci spec,  pci driver can get BAR address size by writing 0xffffffff
->
-> to
->
->
->
-> the pci register firstly, and then read back the value from this register.
->
->
->
-OK however as you show below the BAR being sized is the BAR if a bridge. Are
->
-you then adding a bridge device by hotplug?
-No, I simply hot-plugged a VFIO device into bus 0. Another interesting
-phenomenon is that
-if I hot-plug the device into another bus, this doesn't happen.
- 
->
->
->
-> We didn't handle this value  specially while process pci write in
->
-> qemu, the function call stack is:
->
->
->
-> Pci_bridge_dev_write_config
->
->
->
-> -> pci_bridge_write_config
->
->
->
-> -> pci_default_write_config (we update the config[address] value here
->
-> -> to
->
-> fffffffffc800000, which should be 0xfc800000 )
->
->
->
-> -> pci_bridge_update_mappings
->
->
->
->                 ->pci_bridge_region_del(br, br->windows);
->
->
->
-> -> pci_bridge_region_init
->
->
->
->                                                                 ->
->
-> pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong
->
-> value
->
-> fffffffffc800000)
->
->
->
->                                                 ->
->
-> memory_region_transaction_commit
->
->
->
->
->
->
->
-> So, as we can see, we use the wrong base address in qemu to update the
->
-> memory regions, though, we update the base address to
->
->
->
-> The correct value after pci driver in VM write the original value
->
-> back, the virtio NIC in bus 4 may still sends net packets concurrently
->
-> with
->
->
->
-> The wrong memory region address.
->
->
->
->
->
->
->
-> We have tried to skip the memory region update action in qemu while
->
-> detect pci write with 0xffffffff value, and it does work, but
->
->
->
-> This seems to be not gently.
->
->
-For sure. But I'm still puzzled as to why does Linux try to size the BAR of
->
-the
->
-bridge while a device behind it is used.
->
->
-Can you pls post your QEMU command line?
-My QEMU command line:
-/root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S -object 
-secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-Linux/master-key.aes
- -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu 
-host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m 
-size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp 
-20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024 -numa 
-node,nodeid=1,cpus=5-9,mem=1024 -numa node,nodeid=2,cpus=10-14,mem=1024 -numa 
-node,nodeid=3,cpus=15-19,mem=1024 -uuid 34a588c7-b0f2-4952-b39c-47fae3411439 
--no-user-config -nodefaults -chardev 
-socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/monitor.sock,server,nowait
- -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet 
--global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on -device 
-pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device 
-pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device 
-pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device 
-pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device 
-pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device 
-piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
-usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device 
-nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device 
-virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device 
-virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device 
-virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device 
-virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device 
-virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive 
-file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-virtio-disk0,cache=none
- -device 
-virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
- -drive if=none,id=drive-ide0-1-1,readonly=on,cache=none -device 
-ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev 
-tap,fd=35,id=hostnet0 -device 
-virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4,addr=0x1 
--chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 
--device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device 
-cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device 
-virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
-
-I am also very curious about this issue. In the Linux kernel code, maybe the double
-check in function pci_bridge_check_ranges triggered this problem.
-
-
->
->
->
->
->
->
->
->
-> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
->
->
->
-> index b2e50c3..84b405d 100644
->
->
->
-> --- a/hw/pci/pci_bridge.c
->
->
->
-> +++ b/hw/pci/pci_bridge.c
->
->
->
-> @@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
->
->
->
->      pci_default_write_config(d, address, val, len);
->
->
->
-> -    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
->
-> +    if ( (val != 0xffffffff) &&
->
->
->
-> +        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
->
->          /* io base/limit */
->
->
->
->          ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
->
->
-> @@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
->
->
->
->          ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
->
->
->          /* vga enable */
->
->
->
-> -        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
->
->
-> +        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
->
->
->
->          pci_bridge_update_mappings(s);
->
->
->
->      }
->
->
->
->
->
->
->
-> Thanks,
->
->
->
-> Xu
->
->
-
-On Mon, Dec 10, 2018 at 03:12:53AM +0000, xuyandong wrote:
->
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > Hi all,
->
-> >
->
-> >
->
-> >
->
-> > In our test, we configured VM with several pci-bridges and a
->
-> > virtio-net NIC attached to bus 4,
->
-> >
->
-> > After VM is startup, We ping this nic from host to judge if it is
->
-> > working normally. Then, we hot add pci devices to this VM with bus 0.
->
-> >
->
-> > We  found the virtio-net NIC in bus 4 is not working (can not connect)
->
-> > occasionally, as it kick virtio backend failure with error below:
->
-> >
->
-> >     Unassigned mem write 00000000fc803004 = 0x1
->
-> >
->
-> >
->
-> >
->
-> > memory-region: pci_bridge_pci
->
-> >
->
-> >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
->
-> >
->
-> >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
-> >
->
-> >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > virtio-pci-common
->
-> >
->
-> >       00000000fc801000-00000000fc801fff (prio 0, RW): virtio-pci-isr
->
-> >
->
-> >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > virtio-pci-device
->
-> >
->
-> >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > virtio-pci-notify  <- io mem unassigned
->
-> >
->
-> >       …
->
-> >
->
-> >
->
-> >
->
-> > We caught an unexpected address change when this problem happened,
->
-> > shown as
->
-> > follows:
->
-> >
->
-> > Before pci_bridge_update_mappings:
->
-> >
->
-> >       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fc000000-00000000fc1fffff
->
-> >
->
-> >       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fc200000-00000000fc3fffff
->
-> >
->
-> >       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fc400000-00000000fc5fffff
->
-> >
->
-> >       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fc600000-00000000fc7fffff
->
-> >
->
-> >       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fc800000-00000000fc9fffff
->
-> > <- correct Address Space
->
-> >
->
-> >       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fca00000-00000000fcbfffff
->
-> >
->
-> >       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fcc00000-00000000fcdfffff
->
-> >
->
-> >       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> > pci_bridge_pref_mem @pci_bridge_pci 00000000fce00000-00000000fcffffff
->
-> >
->
-> >
->
-> >
->
-> > After pci_bridge_update_mappings:
->
-> >
->
-> >       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
->
-> >
->
-> >       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
->
-> >
->
-> >       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
->
-> >
->
-> >       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
->
-> >
->
-> >       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
->
-> >
->
-> >       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
->
-> >
->
-> >       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
->
-> >
->
-> >       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> > pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
->
-> >
->
-> >       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-> > pci_bridge_pref_mem
->
-> > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional Address
->
-> Space
->
->
->
-> This one is empty though right?
->
->
->
-> >
->
-> >
->
-> > We have figured out why this address becomes this value,  according to
->
-> > pci spec,  pci driver can get BAR address size by writing 0xffffffff
->
-> > to
->
-> >
->
-> > the pci register firstly, and then read back the value from this register.
->
->
->
->
->
-> OK, however, as you show below, the BAR being sized is the BAR of a bridge. Are
->
-> you then adding a bridge device by hotplug?
->
->
-No, I just simply hot plugged a VFIO device to Bus 0, another interesting
->
-phenomenon is
->
-if I hot plug the device to another bus, this doesn't happen.
->
->
->
->
->
->
-> > We didn't handle this value  specially while process pci write in
->
-> > qemu, the function call stack is:
->
-> >
->
-> > pci_bridge_dev_write_config
->
-> >
->
-> > -> pci_bridge_write_config
->
-> >
->
-> > -> pci_default_write_config (we update the config[address] value here
->
-> > -> to
->
-> > fffffffffc800000, which should be 0xfc800000 )
->
-> >
->
-> > -> pci_bridge_update_mappings
->
-> >
->
-> >                 ->pci_bridge_region_del(br, br->windows);
->
-> >
->
-> > -> pci_bridge_region_init
->
-> >
->
-> >                                                                 ->
->
-> > pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong
->
-> > value
->
-> > fffffffffc800000)
->
-> >
->
-> >                                                 ->
->
-> > memory_region_transaction_commit
->
-> >
->
-> >
->
-> >
->
-> > So, as we can see, we use the wrong base address in qemu to update the
->
-> > memory regions, though, we update the base address to
->
-> >
->
-> > The correct value after pci driver in VM write the original value
->
-> > back, the virtio NIC in bus 4 may still send net packets concurrently
->
-> > with
->
-> >
->
-> > The wrong memory region address.
->
-> >
->
-> >
->
-> >
->
-> > We have tried to skip the memory region update action in qemu while
->
-> > detect pci write with 0xffffffff value, and it does work, but
->
-> >
->
-> > This does not seem to be an elegant fix.
->
->
->
-> For sure. But I'm still puzzled as to why does Linux try to size the BAR of
->
-> the
->
-> bridge while a device behind it is used.
->
->
->
-> Can you pls post your QEMU command line?
->
->
-My QEMU command line:
->
-/root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S -object
->
-secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-Linux/master-key.aes
->
--machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
->
-20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024 -numa
->
-node,nodeid=1,cpus=5-9,mem=1024 -numa node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-node,nodeid=3,cpus=15-19,mem=1024 -uuid 34a588c7-b0f2-4952-b39c-47fae3411439
->
--no-user-config -nodefaults -chardev
->
-socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/monitor.sock,server,nowait
->
--mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
->
--global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on -device
->
-pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-virtio-disk0,cache=none
->
--device
->
-virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
->
--drive if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
->
-tap,fd=35,id=hostnet0 -device
->
-virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4,addr=0x1
->
--chardev pty,id=charserial0 -device
->
-isa-serial,chardev=charserial0,id=serial0 -device
->
-usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
->
->
-I am also very curious about this issue. In the Linux kernel code, maybe the
->
-double check in function pci_bridge_check_ranges triggered this problem.
-If you can get the stack trace in Linux when it tries to write this
-0xffffffff value, that would be quite helpful.
-
-
->
->
->
->
->
->
->
->
-> >
->
-> >
->
-> > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
->
-> >
->
-> > index b2e50c3..84b405d 100644
->
-> >
->
-> > --- a/hw/pci/pci_bridge.c
->
-> >
->
-> > +++ b/hw/pci/pci_bridge.c
->
-> >
->
-> > @@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
->
-> >
->
-> >      pci_default_write_config(d, address, val, len);
->
-> >
->
-> > -    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
-> >
->
-> > +    if ( (val != 0xffffffff) &&
->
-> >
->
-> > +        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
-> >
->
-> >          /* io base/limit */
->
-> >
->
-> >          ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-> >
->
-> > @@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
->
-> >
->
-> >          ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-> >
->
-> >          /* vga enable */
->
-> >
->
-> > -        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
-> >
->
-> > +        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
->
-> >
->
-> >          pci_bridge_update_mappings(s);
->
-> >
->
-> >      }
->
-> >
->
-> >
->
-> >
->
-> > Thanks,
->
-> >
->
-> > Xu
->
-> >
-
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > Hi all,
->
-> > >
->
-> > >
->
-> > >
->
-> > > In our test, we configured VM with several pci-bridges and a
->
-> > > virtio-net NIC attached to bus 4,
->
-> > >
->
-> > > After VM is startup, We ping this nic from host to judge if it is
->
-> > > working normally. Then, we hot add pci devices to this VM with bus 0.
->
-> > >
->
-> > > We  found the virtio-net NIC in bus 4 is not working (can not
->
-> > > connect) occasionally, as it kick virtio backend failure with error
->
-> > > below:
->
-> > >
->
-> > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > >
->
-> > >
->
-> > >
->
-> > > memory-region: pci_bridge_pci
->
-> > >
->
-> > >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
->
-> > >
->
-> > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
-> > >
->
-> > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > virtio-pci-common
->
-> > >
->
-> > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > virtio-pci-isr
->
-> > >
->
-> > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > virtio-pci-device
->
-> > >
->
-> > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > virtio-pci-notify  <- io mem unassigned
->
-> > >
->
-> > >       …
->
-> > >
->
-> > >
->
-> > >
->
-> > > We caught an unexpected address change when this problem
->
-> > > happened, shown as
->
-> > > follows:
->
-> > >
->
-> > > Before pci_bridge_update_mappings:
->
-> > >
->
-> > >       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fc000000-00000000fc1fffff
->
-> > >
->
-> > >       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fc200000-00000000fc3fffff
->
-> > >
->
-> > >       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fc400000-00000000fc5fffff
->
-> > >
->
-> > >       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fc600000-00000000fc7fffff
->
-> > >
->
-> > >       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fc800000-00000000fc9fffff
->
-> > > <- correct Address Space
->
-> > >
->
-> > >       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fca00000-00000000fcbfffff
->
-> > >
->
-> > >       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fcc00000-00000000fcdfffff
->
-> > >
->
-> > >       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > 00000000fce00000-00000000fcffffff
->
-> > >
->
-> > >
->
-> > >
->
-> > > After pci_bridge_update_mappings:
->
-> > >
->
-> > >       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
->
-> > >
->
-> > >       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
->
-> > >
->
-> > >       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
->
-> > >
->
-> > >       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
->
-> > >
->
-> > >       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
->
-> > >
->
-> > >       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
->
-> > >
->
-> > >       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
->
-> > >
->
-> > >       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> > > pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
->
-> > >
->
-> > >       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-pci_bridge_pref_mem
->
-> > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional
->
-> > > Address
->
-> > Space
->
-> >
->
-> > This one is empty though right?
->
-> >
->
-> > >
->
-> > >
->
-> > > We have figured out why this address becomes this value,
->
-> > > according to pci spec,  pci driver can get BAR address size by
->
-> > > writing 0xffffffff to
->
-> > >
->
-> > > the pci register firstly, and then read back the value from this
->
-> > > register.
->
-> >
->
-> >
->
-> > OK, however, as you show below, the BAR being sized is the BAR of a
->
-> > bridge. Are you then adding a bridge device by hotplug?
->
->
->
-> No, I just simply hot plugged a VFIO device to Bus 0, another
->
-> interesting phenomenon is If I hot plug the device to other bus, this
->
-> doesn't
->
-happen.
->
->
->
-> >
->
-> >
->
-> > > We didn't handle this value  specially while process pci write in
->
-> > > qemu, the function call stack is:
->
-> > >
->
-> > > pci_bridge_dev_write_config
->
-> > >
->
-> > > -> pci_bridge_write_config
->
-> > >
->
-> > > -> pci_default_write_config (we update the config[address] value
->
-> > > -> here to
->
-> > > fffffffffc800000, which should be 0xfc800000 )
->
-> > >
->
-> > > -> pci_bridge_update_mappings
->
-> > >
->
-> > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > >
->
-> > > -> pci_bridge_region_init
->
-> > >
->
-> > >                                                                 ->
->
-> > > pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong
->
-> > > value
->
-> > > fffffffffc800000)
->
-> > >
->
-> > >                                                 ->
->
-> > > memory_region_transaction_commit
->
-> > >
->
-> > >
->
-> > >
->
-> > > So, as we can see, we use the wrong base address in qemu to update
->
-> > > the memory regions, though, we update the base address to
->
-> > >
->
-> > > The correct value after pci driver in VM write the original value
->
-> > > back, the virtio NIC in bus 4 may still send net packets
->
-> > > concurrently with
->
-> > >
->
-> > > The wrong memory region address.
->
-> > >
->
-> > >
->
-> > >
->
-> > > We have tried to skip the memory region update action in qemu
->
-> > > while detect pci write with 0xffffffff value, and it does work,
->
-> > > but
->
-> > >
->
-> > > This does not seem to be an elegant fix.
->
-> >
->
-> > For sure. But I'm still puzzled as to why does Linux try to size the
->
-> > BAR of the bridge while a device behind it is used.
->
-> >
->
-> > Can you pls post your QEMU command line?
->
->
->
-> My QEMU command line:
->
-> /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S
->
-> -object
->
-> secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-
->
-> Linux/master-key.aes -machine
->
-> pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
->
-> 20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024
->
-> -numa node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config -nodefaults
->
-> -chardev
->
-> socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/moni
->
-> tor.sock,server,nowait -mon
->
-> chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
->
-> -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on
->
-> -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-v
->
-> irtio-disk0,cache=none -device
->
-> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id
->
-> =virtio-disk0,bootindex=1 -drive
->
-> if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
->
-> tap,fd=35,id=hostnet0 -device
->
-> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4
->
-> ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> isa-serial,chardev=charserial0,id=serial0 -device
->
-> usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
->
->
->
-> I am also very curious about this issue. In the Linux kernel code, maybe the
->
-> double
->
-check in function pci_bridge_check_ranges triggered this problem.
->
->
-If you can get the stack trace in Linux when it tries to write this 0xffffffff
->
-value, that
->
-would be quite helpful.
->
-After I added mdelay(100) in function pci_bridge_check_ranges, this phenomenon became
-easier to reproduce. Below is my modification to the kernel:
-diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
-index cb389277..86e232d 100644
---- a/drivers/pci/setup-bus.c
-+++ b/drivers/pci/setup-bus.c
-@@ -27,7 +27,7 @@
- #include <linux/slab.h>
- #include <linux/acpi.h>
- #include "pci.h"
--
-+#include <linux/delay.h>
- unsigned int pci_flags;
- 
- struct pci_dev_resource {
-@@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
-                pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-                                               0xffffffff);
-                pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
-+               mdelay(100);
-+               printk(KERN_ERR "sleep\n");
-+                dump_stack();
-                if (!tmp)
-                        b_res[2].flags &= ~IORESOURCE_MEM_64;
-                pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
-
-After hot plugging, we get the following log:
-
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:14.0: BAR 0: assigned [mem 
-0xc2360000-0xc237ffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:14.0: BAR 3: assigned [mem 
-0xc2328000-0xc232bfff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:16 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:17 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:18 uefi-linux kernel: sleep
-Dec 11 09:28:18 uefi-linux kernel: CPU: 16 PID: 502 Comm: kworker/u40:1 Not 
-tainted 4.11.0-rc3+ #11
-Dec 11 09:28:18 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + 
-PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:18 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:18 uefi-linux kernel: Call Trace:
-Dec 11 09:28:18 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:18 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:18 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:18 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:18 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:18 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:18 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:18 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:18 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:18 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:18 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:18 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:18 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:18 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:18 uefi-linux kernel: ret_from_fork+0x2c/0x40
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:18 uefi-linux kernel: sleep
-Dec 11 09:28:18 uefi-linux kernel: CPU: 16 PID: 502 Comm: kworker/u40:1 Not 
-tainted 4.11.0-rc3+ #11
-Dec 11 09:28:18 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + 
-PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:18 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:18 uefi-linux kernel: Call Trace:
-Dec 11 09:28:18 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:18 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:18 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:18 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:18 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:18 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:18 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:18 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:18 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:18 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:18 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:18 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:18 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:18 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:18 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:18 uefi-linux kernel: ret_from_fork+0x2c/0x40
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:18 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:19 uefi-linux kernel: sleep
-Dec 11 09:28:19 uefi-linux kernel: CPU: 17 PID: 502 Comm: kworker/u40:1 Not 
-tainted 4.11.0-rc3+ #11
-Dec 11 09:28:19 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + 
-PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:19 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:19 uefi-linux kernel: Call Trace:
-Dec 11 09:28:19 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:19 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:19 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:19 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:19 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:19 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:19 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:19 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:19 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:19 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:19 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:19 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:19 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:19 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:19 uefi-linux kernel: ret_from_fork+0x2c/0x40
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:19 uefi-linux kernel: sleep
-Dec 11 09:28:19 uefi-linux kernel: CPU: 17 PID: 502 Comm: kworker/u40:1 Not 
-tainted 4.11.0-rc3+ #11
-Dec 11 09:28:19 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + 
-PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:19 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:19 uefi-linux kernel: Call Trace:
-Dec 11 09:28:19 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:19 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:19 uefi-linux kernel: ? pci_conf1_read+0xba/0x100
-Dec 11 09:28:19 uefi-linux kernel: __pci_bus_size_bridges+0xe9/0x960
-Dec 11 09:28:19 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:19 uefi-linux kernel: ? pcibios_allocate_rom_resources+0x45/0x80
-Dec 11 09:28:19 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:19 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:19 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:19 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:19 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:19 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:19 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:19 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:19 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:19 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:19 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:19 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:19 uefi-linux kernel: ret_from_fork+0x2c/0x40
-Dec 11 09:28:19 uefi-linux kernel: sleep
-Dec 11 09:28:19 uefi-linux kernel: CPU: 17 PID: 502 Comm: kworker/u40:1 Not 
-tainted 4.11.0-rc3+ #11
-Dec 11 09:28:19 uefi-linux kernel: Hardware name: QEMU Standard PC (i440FX + 
-PIIX, 1996), BIOS 0.0.0 02/06/2015
-Dec 11 09:28:19 uefi-linux kernel: Workqueue: kacpi_hotplug acpi_hotplug_work_fn
-Dec 11 09:28:19 uefi-linux kernel: Call Trace:
-Dec 11 09:28:19 uefi-linux kernel: dump_stack+0x63/0x87
-Dec 11 09:28:19 uefi-linux kernel: __pci_bus_size_bridges+0x931/0x960
-Dec 11 09:28:19 uefi-linux kernel: ? dev_printk+0x4d/0x50
-Dec 11 09:28:19 uefi-linux kernel: enable_slot+0x140/0x2f0
-Dec 11 09:28:19 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:19 uefi-linux kernel: ? __pm_runtime_resume+0x5c/0x80
-Dec 11 09:28:19 uefi-linux kernel: ? trim_stale_devices+0x9a/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_check_bridge.part.6+0xf5/0x120
-Dec 11 09:28:19 uefi-linux kernel: acpiphp_hotplug_notify+0x145/0x1c0
-Dec 11 09:28:19 uefi-linux kernel: ? acpiphp_post_dock_fixup+0xc0/0xc0
-Dec 11 09:28:19 uefi-linux kernel: acpi_device_hotplug+0x3a6/0x3f3
-Dec 11 09:28:19 uefi-linux kernel: acpi_hotplug_work_fn+0x1e/0x29
-Dec 11 09:28:19 uefi-linux kernel: process_one_work+0x165/0x410
-Dec 11 09:28:19 uefi-linux kernel: worker_thread+0x137/0x4c0
-Dec 11 09:28:19 uefi-linux kernel: kthread+0x101/0x140
-Dec 11 09:28:19 uefi-linux kernel: ? rescuer_thread+0x380/0x380
-Dec 11 09:28:19 uefi-linux kernel: ? kthread_park+0x90/0x90
-Dec 11 09:28:19 uefi-linux kernel: ret_from_fork+0x2c/0x40
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:19 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:20 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:20 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 lost sync at byte 1
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: psmouse serio1: VMMouse at 
-isa0060/serio1/input0 - driver resynced.
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:21 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0: PCI bridge to [bus 01]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0:   bridge window [io  
-0xf000-0xffff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2800000-0xc29fffff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:08.0:   bridge window [mem 
-0xc2b00000-0xc2cfffff 64bit pref]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0: PCI bridge to [bus 02]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0:   bridge window [io  
-0xe000-0xefff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2600000-0xc27fffff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:09.0:   bridge window [mem 
-0xc2d00000-0xc2efffff 64bit pref]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0: PCI bridge to [bus 03]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [io  
-0xd000-0xdfff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2400000-0xc25fffff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:00:0a.0:   bridge window [mem 
-0xc2f00000-0xc30fffff 64bit pref]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0: PCI bridge to [bus 05]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [io  
-0xc000-0xcfff]
-Dec 11 09:28:22 uefi-linux kernel: pci 0000:04:0c.0:   bridge window [mem 
-0xc2000000-0xc21fffff]
-
->
->
->
-> >
->
-> >
->
-> >
->
-> > >
->
-> > >
->
-> > > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
->
-> > >
->
-> > > index b2e50c3..84b405d 100644
->
-> > >
->
-> > > --- a/hw/pci/pci_bridge.c
->
-> > >
->
-> > > +++ b/hw/pci/pci_bridge.c
->
-> > >
->
-> > > @@ -256,7 +256,8 @@ void pci_bridge_write_config(PCIDevice *d,
->
-> > >
->
-> > >      pci_default_write_config(d, address, val, len);
->
-> > >
->
-> > > -    if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
-> > >
->
-> > > +    if ( (val != 0xffffffff) &&
->
-> > >
->
-> > > +        (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
-> > >
->
-> > >          /* io base/limit */
->
-> > >
->
-> > >          ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-> > >
->
-> > > @@ -266,7 +267,7 @@ void pci_bridge_write_config(PCIDevice *d,
->
-> > >
->
-> > >          ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-> > >
->
-> > >          /* vga enable */
->
-> > >
->
-> > > -        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
-> > >
->
-> > > +        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2))) {
->
-> > >
->
-> > >          pci_bridge_update_mappings(s);
->
-> > >
->
-> > >      }
->
-> > >
->
-> > >
->
-> > >
->
-> > > Thanks,
->
-> > >
->
-> > > Xu
->
-> > >
-
-On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
->
-On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > > Hi all,
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > In our test, we configured a VM with several pci-bridges and a
->
-> > > > virtio-net NIC attached to bus 4.
->
-> > > >
->
-> > > > After VM is startup, We ping this nic from host to judge if it is
->
-> > > > working normally. Then, we hot add pci devices to this VM with bus 0.
->
-> > > >
->
-> > > > We  found the virtio-net NIC in bus 4 is not working (can not
->
-> > > > connect) occasionally, as it kick virtio backend failure with error
->
-> > > > below:
->
-> > > >
->
-> > > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > memory-region: pci_bridge_pci
->
-> > > >
->
-> > > >   0000000000000000-ffffffffffffffff (prio 0, RW): pci_bridge_pci
->
-> > > >
->
-> > > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
-> > > >
->
-> > > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > > virtio-pci-common
->
-> > > >
->
-> > > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > > virtio-pci-isr
->
-> > > >
->
-> > > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > > virtio-pci-device
->
-> > > >
->
-> > > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > > virtio-pci-notify  <- io mem unassigned
->
-> > > >
->
-> > > >       …
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > We caught an exceptional address changing while this problem
->
-> > > > happened, show as
->
-> > > > follow:
->
-> > > >
->
-> > > > Before pci_bridge_update_mappings:
->
-> > > >
->
-> > > >       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fc000000-00000000fc1fffff
->
-> > > >
->
-> > > >       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fc200000-00000000fc3fffff
->
-> > > >
->
-> > > >       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fc400000-00000000fc5fffff
->
-> > > >
->
-> > > >       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fc600000-00000000fc7fffff
->
-> > > >
->
-> > > >       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fc800000-00000000fc9fffff
->
-> > > > <- correct Address Space
->
-> > > >
->
-> > > >       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fca00000-00000000fcbfffff
->
-> > > >
->
-> > > >       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fcc00000-00000000fcdfffff
->
-> > > >
->
-> > > >       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > 00000000fce00000-00000000fcffffff
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > After pci_bridge_update_mappings:
->
-> > > >
->
-> > > >       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fda00000-00000000fdbfffff
->
-> > > >
->
-> > > >       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fdc00000-00000000fddfffff
->
-> > > >
->
-> > > >       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fde00000-00000000fdffffff
->
-> > > >
->
-> > > >       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fe000000-00000000fe1fffff
->
-> > > >
->
-> > > >       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fe200000-00000000fe3fffff
->
-> > > >
->
-> > > >       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fe400000-00000000fe5fffff
->
-> > > >
->
-> > > >       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fe600000-00000000fe7fffff
->
-> > > >
->
-> > > >       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> > > > pci_bridge_mem @pci_bridge_pci 00000000fe800000-00000000fe9fffff
->
-> > > >
->
-> > > >       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-> pci_bridge_pref_mem
->
-> > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional
->
-> > > > Address
->
-> > > Space
->
-> > >
->
-> > > This one is empty though right?
->
-> > >
->
-> > > >
->
-> > > >
->
-> > > > We have figured out why this address becomes this value,
->
-> > > > according to pci spec,  pci driver can get BAR address size by
->
-> > > > writing 0xffffffff to
->
-> > > >
->
-> > > > the pci register firstly, and then read back the value from this
->
-> > > > register.
->
-> > >
->
-> > >
->
-> > > OK however as you show below the BAR being sized is the BAR if a
->
-> > > bridge. Are you then adding a bridge device by hotplug?
->
-> >
->
-> > No, I just simply hot plugged a VFIO device to Bus 0, another
->
-> > interesting phenomenon is If I hot plug the device to other bus, this
->
-> > doesn't
->
-> happened.
->
-> >
->
-> > >
->
-> > >
->
-> > > > We didn't handle this value  specially while process pci write in
->
-> > > > qemu, the function call stack is:
->
-> > > >
->
-> > > > Pci_bridge_dev_write_config
->
-> > > >
->
-> > > > -> pci_bridge_write_config
->
-> > > >
->
-> > > > -> pci_default_write_config (we update the config[address] value
->
-> > > > -> here to
->
-> > > > fffffffffc800000, which should be 0xfc800000 )
->
-> > > >
->
-> > > > -> pci_bridge_update_mappings
->
-> > > >
->
-> > > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > > >
->
-> > > > -> pci_bridge_region_init
->
-> > > >
->
-> > > >                                                                 ->
->
-> > > > pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong
->
-> > > > value
->
-> > > > fffffffffc800000)
->
-> > > >
->
-> > > >                                                 ->
->
-> > > > memory_region_transaction_commit
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > So, as we can see, we use the wrong base address in qemu to update
->
-> > > > the memory regions, though, we update the base address to
->
-> > > >
->
-> > > > The correct value after pci driver in VM write the original value
->
-> > > > back, the virtio NIC in bus 4 may still sends net packets
->
-> > > > concurrently with
->
-> > > >
->
-> > > > The wrong memory region address.
->
-> > > >
->
-> > > >
->
-> > > >
->
-> > > > We have tried to skip the memory region update action in qemu
->
-> > > > while detect pci write with 0xffffffff value, and it does work,
->
-> > > > but
->
-> > > >
->
-> > > > This seems to be not gently.
->
-> > >
->
-> > > For sure. But I'm still puzzled as to why does Linux try to size the
->
-> > > BAR of the bridge while a device behind it is used.
->
-> > >
->
-> > > Can you pls post your QEMU command line?
->
-> >
->
-> > My QEMU command line:
->
-> > /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S
->
-> > -object
->
-> > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-194-
->
-> > Linux/master-key.aes -machine
->
-> > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
->
-> > 20,sockets=20,cores=1,threads=1 -numa node,nodeid=0,cpus=0-4,mem=1024
->
-> > -numa node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> > node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> > node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config -nodefaults
->
-> > -chardev
->
-> > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/moni
->
-> > tor.sock,server,nowait -mon
->
-> > chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
->
-> > -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on
->
-> > -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=drive-v
->
-> > irtio-disk0,cache=none -device
->
-> > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id
->
-> > =virtio-disk0,bootindex=1 -drive
->
-> > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
->
-> > tap,fd=35,id=hostnet0 -device
->
-> > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=pci.4
->
-> > ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> > isa-serial,chardev=charserial0,id=serial0 -device
->
-> > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg timestamp=on
->
-> >
->
-> > I am also very curious about this issue, in the linux kernel code, maybe
->
-> > double
->
-> check in function pci_bridge_check_ranges triggered this problem.
->
->
->
-> If you can get the stacktrace in Linux when it tries to write this fffff
->
-> value, that
->
-> would be quite helpful.
->
->
->
->
-After I add mdelay(100) in function pci_bridge_check_ranges, this phenomenon is
-easier to reproduce. Below is my modification to the kernel:
->
-diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
->
-index cb389277..86e232d 100644
->
---- a/drivers/pci/setup-bus.c
->
-+++ b/drivers/pci/setup-bus.c
->
-@@ -27,7 +27,7 @@
->
-#include <linux/slab.h>
->
-#include <linux/acpi.h>
->
-#include "pci.h"
->
--
->
-+#include <linux/delay.h>
->
-unsigned int pci_flags;
->
->
-struct pci_dev_resource {
->
-@@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
->
-pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-0xffffffff);
->
-pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32, &tmp);
->
-+               mdelay(100);
->
-+               printk(KERN_ERR "sleep\n");
->
-+                dump_stack();
->
-if (!tmp)
->
-b_res[2].flags &= ~IORESOURCE_MEM_64;
->
-pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-OK!
-I just sent a Linux patch that should help.
-I would appreciate it if you will give it a try
-and if that helps reply to it with
-a Tested-by: tag.
-
--- 
-MST
-
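The address seen in the report can be reproduced with a few lines of arithmetic. The sketch below is only an illustration, not QEMU's code: `pref_window_base` is a hypothetical helper, and the register value 0xfc81 is an assumption chosen to match the fc800000 window in the dumps above (bits 15:4 of the prefetchable base register hold address bits 31:20, and a low nibble of 1 marks a 64-bit window). It shows that while PCI_PREF_BASE_UPPER32 temporarily holds the probe value 0xffffffff, the combined base becomes fffffffffc800000.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): combine the 16-bit
 * prefetchable memory base register with the 32-bit upper-base
 * register into the 64-bit base address of the bridge window.
 */
static uint64_t pref_window_base(uint16_t pref_base, uint32_t upper32)
{
    /* Bits 15:4 of the register are address bits 31:20; drop the type nibble. */
    uint64_t low = (uint64_t)(pref_base & 0xfff0u) << 16;

    /* The upper-base register supplies address bits 63:32. */
    return ((uint64_t)upper32 << 32) | low;
}
```

Called as `pref_window_base(0xfc81, 0)` this gives 0x00000000fc800000, the normal window; called with 0xffffffff as the second argument it gives 0xfffffffffc800000, the "exceptional" alias in the dump. Any mapping update that runs between the guest's probe write and its restore write therefore sees the window at the wrong place.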
-On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
->
-> On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > > > Hi all,
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > In our test, we configured VM with several pci-bridges and a
->
-> > > > > virtio-net nic been attached with bus 4,
->
-> > > > >
->
-> > > > > After VM is startup, We ping this nic from host to judge if it
->
-> > > > > is working normally. Then, we hot add pci devices to this VM with
->
-> > > > > bus
->
-0.
->
-> > > > >
->
-> > > > > We  found the virtio-net NIC in bus 4 is not working (can not
->
-> > > > > connect) occasionally, as it kick virtio backend failure with error
->
-> > > > > below:
->
-> > > > >
->
-> > > > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > memory-region: pci_bridge_pci
->
-> > > > >
->
-> > > > >   0000000000000000-ffffffffffffffff (prio 0, RW):
->
-> > > > > pci_bridge_pci
->
-> > > > >
->
-> > > > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
-> > > > >
->
-> > > > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > > > virtio-pci-common
->
-> > > > >
->
-> > > > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > > > virtio-pci-isr
->
-> > > > >
->
-> > > > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > > > virtio-pci-device
->
-> > > > >
->
-> > > > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > > > virtio-pci-notify  <- io mem unassigned
->
-> > > > >
->
-> > > > >       …
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > We caught an exceptional address changing while this problem
->
-> > > > > happened, show as
->
-> > > > > follow:
->
-> > > > >
->
-> > > > > Before pci_bridge_update_mappings:
->
-> > > > >
->
-> > > > >       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fc000000-00000000fc1fffff
->
-> > > > >
->
-> > > > >       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fc200000-00000000fc3fffff
->
-> > > > >
->
-> > > > >       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fc400000-00000000fc5fffff
->
-> > > > >
->
-> > > > >       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fc600000-00000000fc7fffff
->
-> > > > >
->
-> > > > >       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fc800000-00000000fc9fffff
->
-> > > > > <- correct Address Space
->
-> > > > >
->
-> > > > >       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fca00000-00000000fcbfffff
->
-> > > > >
->
-> > > > >       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fcc00000-00000000fcdfffff
->
-> > > > >
->
-> > > > >       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > 00000000fce00000-00000000fcffffff
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > After pci_bridge_update_mappings:
->
-> > > > >
->
-> > > > >       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fda00000-00000000fdbfffff
->
-> > > > >
->
-> > > > >       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fdc00000-00000000fddfffff
->
-> > > > >
->
-> > > > >       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fde00000-00000000fdffffff
->
-> > > > >
->
-> > > > >       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fe000000-00000000fe1fffff
->
-> > > > >
->
-> > > > >       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fe200000-00000000fe3fffff
->
-> > > > >
->
-> > > > >       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fe400000-00000000fe5fffff
->
-> > > > >
->
-> > > > >       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fe600000-00000000fe7fffff
->
-> > > > >
->
-> > > > >       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > 00000000fe800000-00000000fe9fffff
->
-> > > > >
->
-> > > > >       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-> > pci_bridge_pref_mem
->
-> > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional
->
-Address
->
-> > > > Space
->
-> > > >
->
-> > > > This one is empty though right?
->
-> > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > We have figured out why this address becomes this value,
->
-> > > > > according to pci spec,  pci driver can get BAR address size by
->
-> > > > > writing 0xffffffff to
->
-> > > > >
->
-> > > > > the pci register firstly, and then read back the value from this
->
-> > > > > register.
->
-> > > >
->
-> > > >
->
-> > > > OK however as you show below the BAR being sized is the BAR if a
->
-> > > > bridge. Are you then adding a bridge device by hotplug?
->
-> > >
->
-> > > No, I just simply hot plugged a VFIO device to Bus 0, another
->
-> > > interesting phenomenon is If I hot plug the device to other bus,
->
-> > > this doesn't
->
-> > happened.
->
-> > >
->
-> > > >
->
-> > > >
->
-> > > > > We didn't handle this value  specially while process pci write
->
-> > > > > in qemu, the function call stack is:
->
-> > > > >
->
-> > > > > Pci_bridge_dev_write_config
->
-> > > > >
->
-> > > > > -> pci_bridge_write_config
->
-> > > > >
->
-> > > > > -> pci_default_write_config (we update the config[address]
->
-> > > > > -> value here to
->
-> > > > > fffffffffc800000, which should be 0xfc800000 )
->
-> > > > >
->
-> > > > > -> pci_bridge_update_mappings
->
-> > > > >
->
-> > > > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > > > >
->
-> > > > > -> pci_bridge_region_init
->
-> > > > >
->
-> > > > >
->
-> > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use the
->
-> > > > > wrong value
->
-> > > > > fffffffffc800000)
->
-> > > > >
->
-> > > > >                                                 ->
->
-> > > > > memory_region_transaction_commit
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > So, as we can see, we use the wrong base address in qemu to
->
-> > > > > update the memory regions, though, we update the base address
->
-> > > > > to
->
-> > > > >
->
-> > > > > The correct value after pci driver in VM write the original
->
-> > > > > value back, the virtio NIC in bus 4 may still sends net
->
-> > > > > packets concurrently with
->
-> > > > >
->
-> > > > > The wrong memory region address.
->
-> > > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > We have tried to skip the memory region update action in qemu
->
-> > > > > while detect pci write with 0xffffffff value, and it does
->
-> > > > > work, but
->
-> > > > >
->
-> > > > > This seems to be not gently.
->
-> > > >
->
-> > > > For sure. But I'm still puzzled as to why does Linux try to size
->
-> > > > the BAR of the bridge while a device behind it is used.
->
-> > > >
->
-> > > > Can you pls post your QEMU command line?
->
-> > >
->
-> > > My QEMU command line:
->
-> > > /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S
->
-> > > -object
->
-> > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-
->
-> > > 194-
->
-> > > Linux/master-key.aes -machine
->
-> > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
->
-> > > 20,sockets=20,cores=1,threads=1 -numa
->
-> > > node,nodeid=0,cpus=0-4,mem=1024 -numa
->
-> > > node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> > > node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config -nodefaults
->
-> > > -chardev
->
-> > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/
->
-> > > moni
->
-> > > tor.sock,server,nowait -mon
->
-> > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
->
-> > > -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot
->
-> > > strict=on -device
->
-> > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=dri
->
-> > > ve-v
->
-> > > irtio-disk0,cache=none -device
->
-> > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk
->
-> > > 0,id
->
-> > > =virtio-disk0,bootindex=1 -drive
->
-> > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
->
-> > > tap,fd=35,id=hostnet0 -device
->
-> > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=p
->
-> > > ci.4
->
-> > > ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> > > isa-serial,chardev=charserial0,id=serial0 -device
->
-> > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
->
-> > > timestamp=on
->
-> > >
->
-> > > I am also very curious about this issue, in the linux kernel code,
->
-> > > maybe double
->
-> > check in function pci_bridge_check_ranges triggered this problem.
->
-> >
->
-> > If you can get the stacktrace in Linux when it tries to write this
->
-> > fffff value, that would be quite helpful.
->
-> >
->
->
->
-> After I add mdelay(100) in function pci_bridge_check_ranges, this
->
-> phenomenon is easier to reproduce, below is my modify in kernel:
->
-> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index
->
-> cb389277..86e232d 100644
->
-> --- a/drivers/pci/setup-bus.c
->
-> +++ b/drivers/pci/setup-bus.c
->
-> @@ -27,7 +27,7 @@
->
->  #include <linux/slab.h>
->
->  #include <linux/acpi.h>
->
->  #include "pci.h"
->
-> -
->
-> +#include <linux/delay.h>
->
->  unsigned int pci_flags;
->
->
->
->  struct pci_dev_resource {
->
-> @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus
->
-*bus)
->
->                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
->                                                0xffffffff);
->
->                 pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> &tmp);
->
-> +               mdelay(100);
->
-> +               printk(KERN_ERR "sleep\n");
->
-> +                dump_stack();
->
->                 if (!tmp)
->
->                         b_res[2].flags &= ~IORESOURCE_MEM_64;
->
->                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
->
->
->
-OK!
->
-I just sent a Linux patch that should help.
->
-I would appreciate it if you will give it a try and if that helps reply to it
->
-with a
->
-Tested-by: tag.
->
-I tested this patch and it works fine on my machine.
-
-But I have another question: if we only fix this problem in the kernel, Linux
-versions that have already been released will still not work well on the
-virtualization platform.
-Is there a way to fix this problem in the backend?
-
->
---
->
-MST
-
-On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
->
-On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
->
-> > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > > > > Hi all,
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > In our test, we configured VM with several pci-bridges and a
->
-> > > > > > virtio-net nic been attached with bus 4,
->
-> > > > > >
->
-> > > > > > After VM is startup, We ping this nic from host to judge if it
->
-> > > > > > is working normally. Then, we hot add pci devices to this VM with
->
-> > > > > > bus
->
-> 0.
->
-> > > > > >
->
-> > > > > > We  found the virtio-net NIC in bus 4 is not working (can not
->
-> > > > > > connect) occasionally, as it kick virtio backend failure with
->
-> > > > > > error below:
->
-> > > > > >
->
-> > > > > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > memory-region: pci_bridge_pci
->
-> > > > > >
->
-> > > > > >   0000000000000000-ffffffffffffffff (prio 0, RW):
->
-> > > > > > pci_bridge_pci
->
-> > > > > >
->
-> > > > > >     00000000fc800000-00000000fc803fff (prio 1, RW): virtio-pci
->
-> > > > > >
->
-> > > > > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > > > > virtio-pci-common
->
-> > > > > >
->
-> > > > > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > > > > virtio-pci-isr
->
-> > > > > >
->
-> > > > > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > > > > virtio-pci-device
->
-> > > > > >
->
-> > > > > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > > > > virtio-pci-notify  <- io mem unassigned
->
-> > > > > >
->
-> > > > > >       …
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > We caught an exceptional address changing while this problem
->
-> > > > > > happened, show as
->
-> > > > > > follow:
->
-> > > > > >
->
-> > > > > > Before pci_bridge_update_mappings:
->
-> > > > > >
->
-> > > > > >       00000000fc000000-00000000fc1fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fc000000-00000000fc1fffff
->
-> > > > > >
->
-> > > > > >       00000000fc200000-00000000fc3fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fc200000-00000000fc3fffff
->
-> > > > > >
->
-> > > > > >       00000000fc400000-00000000fc5fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fc400000-00000000fc5fffff
->
-> > > > > >
->
-> > > > > >       00000000fc600000-00000000fc7fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fc600000-00000000fc7fffff
->
-> > > > > >
->
-> > > > > >       00000000fc800000-00000000fc9fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fc800000-00000000fc9fffff
->
-> > > > > > <- correct Address Space
->
-> > > > > >
->
-> > > > > >       00000000fca00000-00000000fcbfffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fca00000-00000000fcbfffff
->
-> > > > > >
->
-> > > > > >       00000000fcc00000-00000000fcdfffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fcc00000-00000000fcdfffff
->
-> > > > > >
->
-> > > > > >       00000000fce00000-00000000fcffffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > 00000000fce00000-00000000fcffffff
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > After pci_bridge_update_mappings:
->
-> > > > > >
->
-> > > > > >       00000000fda00000-00000000fdbfffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fda00000-00000000fdbfffff
->
-> > > > > >
->
-> > > > > >       00000000fdc00000-00000000fddfffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fdc00000-00000000fddfffff
->
-> > > > > >
->
-> > > > > >       00000000fde00000-00000000fdffffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fde00000-00000000fdffffff
->
-> > > > > >
->
-> > > > > >       00000000fe000000-00000000fe1fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fe000000-00000000fe1fffff
->
-> > > > > >
->
-> > > > > >       00000000fe200000-00000000fe3fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fe200000-00000000fe3fffff
->
-> > > > > >
->
-> > > > > >       00000000fe400000-00000000fe5fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fe400000-00000000fe5fffff
->
-> > > > > >
->
-> > > > > >       00000000fe600000-00000000fe7fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fe600000-00000000fe7fffff
->
-> > > > > >
->
-> > > > > >       00000000fe800000-00000000fe9fffff (prio 1, RW): alias
->
-> > > > > > pci_bridge_mem @pci_bridge_pci
->
-> > > > > > 00000000fe800000-00000000fe9fffff
->
-> > > > > >
->
-> > > > > >       fffffffffc800000-fffffffffc800000 (prio 1, RW): alias
->
-> > > pci_bridge_pref_mem
->
-> > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <- Exceptional
->
-> Address
->
-> > > > > Space
->
-> > > > >
->
-> > > > > This one is empty though right?
->
-> > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > We have figured out why this address becomes this value,
->
-> > > > > > according to pci spec,  pci driver can get BAR address size by
->
-> > > > > > writing 0xffffffff to
->
-> > > > > >
->
-> > > > > > the pci register firstly, and then read back the value from this
->
-> > > > > > register.
->
-> > > > >
->
-> > > > >
->
-> > > > > OK however as you show below the BAR being sized is the BAR if a
->
-> > > > > bridge. Are you then adding a bridge device by hotplug?
->
-> > > >
->
-> > > > No, I just simply hot plugged a VFIO device to Bus 0, another
->
-> > > > interesting phenomenon is If I hot plug the device to other bus,
->
-> > > > this doesn't
->
-> > > happened.
->
-> > > >
->
-> > > > >
->
-> > > > >
->
-> > > > > > We didn't handle this value  specially while process pci write
->
-> > > > > > in qemu, the function call stack is:
->
-> > > > > >
->
-> > > > > > Pci_bridge_dev_write_config
->
-> > > > > >
->
-> > > > > > -> pci_bridge_write_config
->
-> > > > > >
->
-> > > > > > -> pci_default_write_config (we update the config[address]
->
-> > > > > > -> value here to
->
-> > > > > > fffffffffc800000, which should be 0xfc800000 )
->
-> > > > > >
->
-> > > > > > -> pci_bridge_update_mappings
->
-> > > > > >
->
-> > > > > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > > > > >
->
-> > > > > > -> pci_bridge_region_init
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use the
->
-> > > > > > wrong value
->
-> > > > > > fffffffffc800000)
->
-> > > > > >
->
-> > > > > >                                                 ->
->
-> > > > > > memory_region_transaction_commit
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > So, as we can see, we use the wrong base address in qemu to
->
-> > > > > > update the memory regions, though, we update the base address
->
-> > > > > > to
->
-> > > > > >
->
-> > > > > > The correct value after pci driver in VM write the original
->
-> > > > > > value back, the virtio NIC in bus 4 may still sends net
->
-> > > > > > packets concurrently with
->
-> > > > > >
->
-> > > > > > The wrong memory region address.
->
-> > > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > We have tried to skip the memory region update action in qemu
->
-> > > > > > while detect pci write with 0xffffffff value, and it does
->
-> > > > > > work, but
->
-> > > > > >
->
-> > > > > > This seems to be not gently.
->
-> > > > >
->
-> > > > > For sure. But I'm still puzzled as to why does Linux try to size
->
-> > > > > the BAR of the bridge while a device behind it is used.
->
-> > > > >
->
-> > > > > Can you pls post your QEMU command line?
->
-> > > >
->
-> > > > My QEMU command line:
->
-> > > > /root/xyd/qemu-system-x86_64 -name guest=Linux,debug-threads=on -S
->
-> > > > -object
->
-> > > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-
->
-> > > > 194-
->
-> > > > Linux/master-key.aes -machine
->
-> > > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> > > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> > > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off -smp
->
-> > > > 20,sockets=20,cores=1,threads=1 -numa
->
-> > > > node,nodeid=0,cpus=0-4,mem=1024 -numa
->
-> > > > node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> > > > node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> > > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> > > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config -nodefaults
->
-> > > > -chardev
->
-> > > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Linux/
->
-> > > > moni
->
-> > > > tor.sock,server,nowait -mon
->
-> > > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-hpet
->
-> > > > -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot
->
-> > > > strict=on -device
->
-> > > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> > > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> > > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> > > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> > > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> > > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> > > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> > > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> > > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> > > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> > > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> > > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> > > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> > > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id=dri
->
-> > > > ve-v
->
-> > > > irtio-disk0,cache=none -device
->
-> > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk
->
-> > > > 0,id
->
-> > > > =virtio-disk0,bootindex=1 -drive
->
-> > > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> > > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev
->
-> > > > tap,fd=35,id=hostnet0 -device
->
-> > > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,bus=p
->
-> > > > ci.4
->
-> > > > ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> > > > isa-serial,chardev=charserial0,id=serial0 -device
->
-> > > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> > > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> > > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
->
-> > > > timestamp=on
->
-> > > >
->
-> > > > I am also very curious about this issue. In the Linux kernel code,
->
-> > > > maybe the double
->
-> > > check in function pci_bridge_check_ranges triggered this problem.
->
-> > >
->
-> > > If you can get the stacktrace in Linux when it tries to write this
->
-> > > fffff value, that would be quite helpful.
->
-> > >
->
-> >
->
-> > After I add mdelay(100) in function pci_bridge_check_ranges, this
->
-> > phenomenon is easier to reproduce. Below is my kernel modification:
->
-> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index
->
-> > cb389277..86e232d 100644
->
-> > --- a/drivers/pci/setup-bus.c
->
-> > +++ b/drivers/pci/setup-bus.c
->
-> > @@ -27,7 +27,7 @@
->
-> >  #include <linux/slab.h>
->
-> >  #include <linux/acpi.h>
->
-> >  #include "pci.h"
->
-> > -
->
-> > +#include <linux/delay.h>
->
-> >  unsigned int pci_flags;
->
-> >
->
-> >  struct pci_dev_resource {
->
-> > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct pci_bus
->
-> *bus)
->
-> >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> >                                                0xffffffff);
->
-> >                 pci_read_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> > &tmp);
->
-> > +               mdelay(100);
->
-> > +               printk(KERN_ERR "sleep\n");
->
-> > +                dump_stack();
->
-> >                 if (!tmp)
->
-> >                         b_res[2].flags &= ~IORESOURCE_MEM_64;
->
-> >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> >
->
->
->
-> OK!
->
-> I just sent a Linux patch that should help.
->
-> I would appreciate it if you would give it a try and, if it helps, reply to
->
-> it with a
->
-> Tested-by: tag.
->
->
->
->
-I tested this patch and it works fine on my machine.
->
->
-But I have another question: if we only fix this problem in the kernel, the
->
-Linux
->
-versions that have already been released will still not work well on the virtualization
->
-platform.
->
-Is there a way to fix this problem in the backend?
-There could be a way to work around this.
-Does the patch below help?
-
-diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
-index 236a20eaa8..7834cac4b0 100644
---- a/hw/i386/acpi-build.c
-+++ b/hw/i386/acpi-build.c
-@@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
- 
-         aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
-         aml_append(method,
--            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check */)
-+            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device Check Light */)
-         );
-         aml_append(method,
-             aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request */)
-
->
-On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
->
-> On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
->
-> > > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > > > > > Hi all,
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > In our test, we configured a VM with several pci-bridges and
->
-> > > > > > > a virtio-net NIC attached to bus 4.
->
-> > > > > > >
->
-> > > > > > > After the VM starts up, we ping this NIC from the host to check
->
-> > > > > > > whether it is working normally. Then we hot-add PCI devices to
->
-> > > > > > > this VM on bus
->
-> > 0.
->
-> > > > > > >
->
-> > > > > > > We found that the virtio-net NIC on bus 4 occasionally stops working (it
->
-> > > > > > > cannot
->
-> > > > > > > connect), because kicking the virtio backend fails with the
->
-> > > > > > > error
->
-below:
->
-> > > > > > >
->
-> > > > > > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > memory-region: pci_bridge_pci
->
-> > > > > > >
->
-> > > > > > >   0000000000000000-ffffffffffffffff (prio 0, RW):
->
-> > > > > > > pci_bridge_pci
->
-> > > > > > >
->
-> > > > > > >     00000000fc800000-00000000fc803fff (prio 1, RW):
->
-> > > > > > > virtio-pci
->
-> > > > > > >
->
-> > > > > > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > > > > > virtio-pci-common
->
-> > > > > > >
->
-> > > > > > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > > > > > virtio-pci-isr
->
-> > > > > > >
->
-> > > > > > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > > > > > virtio-pci-device
->
-> > > > > > >
->
-> > > > > > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > > > > > virtio-pci-notify  <- io mem unassigned
->
-> > > > > > >
->
-> > > > > > >       …
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > We caught an abnormal address change when this
->
-> > > > > > > problem happened, shown as
->
-> > > > > > > follows:
->
-> > > > > > >
->
-> > > > > > > Before pci_bridge_update_mappings:
->
-> > > > > > >
->
-> > > > > > >       00000000fc000000-00000000fc1fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fc000000-00000000fc1fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fc200000-00000000fc3fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fc200000-00000000fc3fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fc400000-00000000fc5fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fc400000-00000000fc5fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fc600000-00000000fc7fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fc600000-00000000fc7fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fc800000-00000000fc9fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fc800000-00000000fc9fffff
->
-> > > > > > > <- correct Address Space
->
-> > > > > > >
->
-> > > > > > >       00000000fca00000-00000000fcbfffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fca00000-00000000fcbfffff
->
-> > > > > > >
->
-> > > > > > >       00000000fcc00000-00000000fcdfffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fcc00000-00000000fcdfffff
->
-> > > > > > >
->
-> > > > > > >       00000000fce00000-00000000fcffffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > 00000000fce00000-00000000fcffffff
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > After pci_bridge_update_mappings:
->
-> > > > > > >
->
-> > > > > > >       00000000fda00000-00000000fdbfffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fda00000-00000000fdbfffff
->
-> > > > > > >
->
-> > > > > > >       00000000fdc00000-00000000fddfffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fdc00000-00000000fddfffff
->
-> > > > > > >
->
-> > > > > > >       00000000fde00000-00000000fdffffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fde00000-00000000fdffffff
->
-> > > > > > >
->
-> > > > > > >       00000000fe000000-00000000fe1fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fe000000-00000000fe1fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fe200000-00000000fe3fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fe200000-00000000fe3fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fe400000-00000000fe5fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fe400000-00000000fe5fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fe600000-00000000fe7fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fe600000-00000000fe7fffff
->
-> > > > > > >
->
-> > > > > > >       00000000fe800000-00000000fe9fffff (prio 1, RW):
->
-> > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > 00000000fe800000-00000000fe9fffff
->
-> > > > > > >
->
-> > > > > > >       fffffffffc800000-fffffffffc800000 (prio 1, RW):
->
-> > > > > > > alias
->
-> > > > pci_bridge_pref_mem
->
-> > > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <-
->
-> > > > > > > Exceptional
->
-> > Address
->
-> > > > > > Space
->
-> > > > > >
->
-> > > > > > This one is empty though right?
->
-> > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > We have figured out why this address takes this value.
->
-> > > > > > > According to the PCI spec, a PCI driver can determine a BAR's
->
-> > > > > > > size by first writing 0xffffffff to
->
-> > > > > > >
->
-> > > > > > > the PCI register and then reading the value back from this
->
-register.
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > OK, however, as you show below, the BAR being sized is the BAR
->
-> > > > > > of a bridge. Are you then adding a bridge device by hotplug?
->
-> > > > >
->
-> > > > > No, I simply hot-plugged a VFIO device to bus 0. Another
->
-> > > > > interesting phenomenon is that if I hot-plug the device to another
->
-> > > > > bus, this doesn't
->
-> > > > happen.
->
-> > > > >
->
-> > > > > >
->
-> > > > > >
->
-> > > > > > > We didn't handle this value specially while processing the PCI
->
-> > > > > > > write in QEMU. The function call stack is:
->
-> > > > > > >
->
-> > > > > > > pci_bridge_dev_write_config
->
-> > > > > > >
->
-> > > > > > > -> pci_bridge_write_config
->
-> > > > > > >
->
-> > > > > > > -> pci_default_write_config (we update the config[address]
->
-> > > > > > > -> value here to
->
-> > > > > > > fffffffffc800000, which should be 0xfc800000 )
->
-> > > > > > >
->
-> > > > > > > -> pci_bridge_update_mappings
->
-> > > > > > >
->
-> > > > > > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > > > > > >
->
-> > > > > > > -> pci_bridge_region_init
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use
->
-> > > > > > > -> the
->
-> > > > > > > wrong value
->
-> > > > > > > fffffffffc800000)
->
-> > > > > > >
->
-> > > > > > >                                                 ->
->
-> > > > > > > memory_region_transaction_commit
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > So, as we can see, QEMU uses the wrong base address
->
-> > > > > > > to update the memory regions. Although the base
->
-> > > > > > > address is set back to
->
-> > > > > > >
->
-> > > > > > > the correct value after the PCI driver in the VM writes the
->
-> > > > > > > original value back, the virtio NIC on bus 4 may still
->
-> > > > > > > send network packets concurrently, using
->
-> > > > > > >
->
-> > > > > > > the wrong memory region address.
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > We have tried skipping the memory region update in
->
-> > > > > > > QEMU when a PCI write with the value 0xffffffff is detected, and it
->
-> > > > > > > does work, but
->
-> > > > > > >
->
-> > > > > > > this does not seem to be an elegant solution.
->
-> > > > > >
->
-> > > > > > For sure. But I'm still puzzled as to why Linux tries to
->
-> > > > > > size the BAR of the bridge while a device behind it is in use.
->
-> > > > > >
->
-> > > > > > Can you please post your QEMU command line?
->
-> > > > >
->
-> > > > > My QEMU command line:
->
-> > > > > /root/xyd/qemu-system-x86_64 -name
->
-> > > > > guest=Linux,debug-threads=on -S -object
->
-> > > > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/dom
->
-> > > > > ain-
->
-> > > > > 194-
->
-> > > > > Linux/master-key.aes -machine
->
-> > > > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> > > > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> > > > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off
->
-> > > > > -smp
->
-> > > > > 20,sockets=20,cores=1,threads=1 -numa
->
-> > > > > node,nodeid=0,cpus=0-4,mem=1024 -numa
->
-> > > > > node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> > > > > node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> > > > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> > > > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config
->
-> > > > > -nodefaults -chardev
->
-> > > > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Li
->
-> > > > > nux/
->
-> > > > > moni
->
-> > > > > tor.sock,server,nowait -mon
->
-> > > > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc
->
-> > > > > -no-hpet -global kvm-pit.lost_tick_policy=delay -no-shutdown
->
-> > > > > -boot strict=on -device
->
-> > > > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> > > > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> > > > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> > > > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> > > > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> > > > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> > > > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> > > > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> > > > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> > > > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> > > > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> > > > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> > > > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> > > > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id
->
-> > > > > =dri
->
-> > > > > ve-v
->
-> > > > > irtio-disk0,cache=none -device
->
-> > > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-
->
-> > > > > disk
->
-> > > > > 0,id
->
-> > > > > =virtio-disk0,bootindex=1 -drive
->
-> > > > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> > > > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1
->
-> > > > > -netdev
->
-> > > > > tap,fd=35,id=hostnet0 -device
->
-> > > > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,b
->
-> > > > > us=p
->
-> > > > > ci.4
->
-> > > > > ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> > > > > isa-serial,chardev=charserial0,id=serial0 -device
->
-> > > > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> > > > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> > > > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
->
-> > > > > timestamp=on
->
-> > > > >
->
-> > > > > I am also very curious about this issue. In the Linux kernel
->
-> > > > > code, maybe the double
->
-> > > > check in function pci_bridge_check_ranges triggered this problem.
->
-> > > >
->
-> > > > If you can get the stacktrace in Linux when it tries to write
->
-> > > > this fffff value, that would be quite helpful.
->
-> > > >
->
-> > >
->
-> > > After I add mdelay(100) in function pci_bridge_check_ranges, this
->
-> > > phenomenon is easier to reproduce. Below is my kernel modification:
->
-> > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
->
-> > > index cb389277..86e232d 100644
->
-> > > --- a/drivers/pci/setup-bus.c
->
-> > > +++ b/drivers/pci/setup-bus.c
->
-> > > @@ -27,7 +27,7 @@
->
-> > >  #include <linux/slab.h>
->
-> > >  #include <linux/acpi.h>
->
-> > >  #include "pci.h"
->
-> > > -
->
-> > > +#include <linux/delay.h>
->
-> > >  unsigned int pci_flags;
->
-> > >
->
-> > >  struct pci_dev_resource {
->
-> > > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct
->
-> > > pci_bus
->
-> > *bus)
->
-> > >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> > >                                                0xffffffff);
->
-> > >                 pci_read_config_dword(bridge,
->
-> > > PCI_PREF_BASE_UPPER32, &tmp);
->
-> > > +               mdelay(100);
->
-> > > +               printk(KERN_ERR "sleep\n");
->
-> > > +                dump_stack();
->
-> > >                 if (!tmp)
->
-> > >                         b_res[2].flags &= ~IORESOURCE_MEM_64;
->
-> > >                 pci_write_config_dword(bridge,
->
-> > > PCI_PREF_BASE_UPPER32,
->
-> > >
->
-> >
->
-> > OK!
->
-> > I just sent a Linux patch that should help.
->
-> > I would appreciate it if you would give it a try and, if it helps,
->
-> > reply to it with a
->
-> > Tested-by: tag.
->
-> >
->
->
->
-> I tested this patch and it works fine on my machine.
->
->
->
-> But I have another question: if we only fix this problem in the
->
-> kernel, the Linux versions that have already been released will still not work well on the
->
-> virtualization platform.
->
-> Is there a way to fix this problem in the backend?
->
->
-> There could be a way to work around this.
->
-> Does the patch below help?
-I am sorry to tell you that I tested this patch and it does not work.
-
->
->
-diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-236a20eaa8..7834cac4b0 100644
->
---- a/hw/i386/acpi-build.c
->
-+++ b/hw/i386/acpi-build.c
->
-@@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-*parent_scope, PCIBus *bus,
->
->
-aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
->
-aml_append(method,
->
--            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check
->
-*/)
->
-+            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device
->
-+ Check Light */)
->
-);
->
-aml_append(method,
->
-aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request
->
-*/)
-
-On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
->
-> There could be a way to work around this.
->
-> Does the patch below help?
->
->
-> I am sorry to tell you that I tested this patch and it does not work.
-What happens?
-
->
->
->
-> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-> 236a20eaa8..7834cac4b0 100644
->
-> --- a/hw/i386/acpi-build.c
->
-> +++ b/hw/i386/acpi-build.c
->
-> @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-> *parent_scope, PCIBus *bus,
->
->
->
->          aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
->
->          aml_append(method,
->
-> -            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check
->
-> */)
->
-> +            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device
->
-> + Check Light */)
->
->          );
->
->          aml_append(method,
->
->              aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request
->
-> */)
-
-On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
->
-> On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
->
-> > On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
->
-> > > > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
->
-> > > > > > > > Hi all,
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > In our test, we configured a VM with several pci-bridges and
->
-> > > > > > > > a virtio-net NIC attached to bus 4.
->
-> > > > > > > >
->
-> > > > > > > > After the VM starts up, we ping this NIC from the host to check
->
-> > > > > > > > whether it is working normally. Then we hot-add PCI devices to
->
-> > > > > > > > this VM on bus
->
-> > > 0.
->
-> > > > > > > >
->
-> > > > > > > > We found that the virtio-net NIC on bus 4 occasionally stops working (it
->
-> > > > > > > > cannot
->
-> > > > > > > > connect), because kicking the virtio backend fails with the
->
-> > > > > > > > error
->
-> below:
->
-> > > > > > > >
->
-> > > > > > > >     Unassigned mem write 00000000fc803004 = 0x1
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > memory-region: pci_bridge_pci
->
-> > > > > > > >
->
-> > > > > > > >   0000000000000000-ffffffffffffffff (prio 0, RW):
->
-> > > > > > > > pci_bridge_pci
->
-> > > > > > > >
->
-> > > > > > > >     00000000fc800000-00000000fc803fff (prio 1, RW):
->
-> > > > > > > > virtio-pci
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc800000-00000000fc800fff (prio 0, RW):
->
-> > > > > > > > virtio-pci-common
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc801000-00000000fc801fff (prio 0, RW):
->
-> > > > > > > > virtio-pci-isr
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc802000-00000000fc802fff (prio 0, RW):
->
-> > > > > > > > virtio-pci-device
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc803000-00000000fc803fff (prio 0, RW):
->
-> > > > > > > > virtio-pci-notify  <- io mem unassigned
->
-> > > > > > > >
->
-> > > > > > > >       …
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > We caught an abnormal address change when this
->
-> > > > > > > > problem happened, shown as
->
-> > > > > > > > follows:
->
-> > > > > > > >
->
-> > > > > > > > Before pci_bridge_update_mappings:
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc000000-00000000fc1fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fc000000-00000000fc1fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc200000-00000000fc3fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fc200000-00000000fc3fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc400000-00000000fc5fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fc400000-00000000fc5fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc600000-00000000fc7fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fc600000-00000000fc7fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fc800000-00000000fc9fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fc800000-00000000fc9fffff
->
-> > > > > > > > <- correct Address Space
->
-> > > > > > > >
->
-> > > > > > > >       00000000fca00000-00000000fcbfffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fca00000-00000000fcbfffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fcc00000-00000000fcdfffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fcc00000-00000000fcdfffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fce00000-00000000fcffffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fce00000-00000000fcffffff
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > After pci_bridge_update_mappings:
->
-> > > > > > > >
->
-> > > > > > > >       00000000fda00000-00000000fdbfffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fda00000-00000000fdbfffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fdc00000-00000000fddfffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fdc00000-00000000fddfffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fde00000-00000000fdffffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fde00000-00000000fdffffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fe000000-00000000fe1fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fe000000-00000000fe1fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fe200000-00000000fe3fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fe200000-00000000fe3fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fe400000-00000000fe5fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fe400000-00000000fe5fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fe600000-00000000fe7fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fe600000-00000000fe7fffff
->
-> > > > > > > >
->
-> > > > > > > >       00000000fe800000-00000000fe9fffff (prio 1, RW):
->
-> > > > > > > > alias pci_bridge_mem @pci_bridge_pci
->
-> > > > > > > > 00000000fe800000-00000000fe9fffff
->
-> > > > > > > >
->
-> > > > > > > >       fffffffffc800000-fffffffffc800000 (prio 1, RW):
->
-> > > > > > > > alias
->
-> > > > > pci_bridge_pref_mem
->
-> > > > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000   <-
->
-> > > > > > > > Exceptional
->
-> > > Address
->
-> > > > > > > Space
->
-> > > > > > >
->
-> > > > > > > This one is empty though right?
->
-> > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > We have figured out why this address takes this value.
->
-> > > > > > > > According to the PCI spec, a PCI driver can determine a BAR's
->
-> > > > > > > > size by first writing 0xffffffff to
->
-> > > > > > > >
->
-> > > > > > > > the PCI register and then reading the value back from
->
-> > > > > > > > this
->
-> register.
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > OK, however, as you show below, the BAR being sized is the BAR
->
-> > > > > > > of a bridge. Are you then adding a bridge device by hotplug?
->
-> > > > > >
->
-> > > > > > No, I simply hot-plugged a VFIO device to bus 0. Another
->
-> > > > > > interesting phenomenon is that if I hot-plug the device to another
->
-> > > > > > bus, this doesn't
->
-> > > > > happen.
->
-> > > > > >
->
-> > > > > > >
->
-> > > > > > >
->
-> > > > > > > > We didn't handle this value specially while processing the PCI
->
-> > > > > > > > write in QEMU. The function call stack is:
->
-> > > > > > > >
->
-> > > > > > > > pci_bridge_dev_write_config
->
-> > > > > > > >
->
-> > > > > > > > -> pci_bridge_write_config
->
-> > > > > > > >
->
-> > > > > > > > -> pci_default_write_config (we update the config[address]
->
-> > > > > > > > -> value here to
->
-> > > > > > > > fffffffffc800000, which should be 0xfc800000 )
->
-> > > > > > > >
->
-> > > > > > > > -> pci_bridge_update_mappings
->
-> > > > > > > >
->
-> > > > > > > >                 ->pci_bridge_region_del(br, br->windows);
->
-> > > > > > > >
->
-> > > > > > > > -> pci_bridge_region_init
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use
->
-> > > > > > > > -> the
->
-> > > > > > > > wrong value
->
-> > > > > > > > fffffffffc800000)
->
-> > > > > > > >
->
-> > > > > > > >                                                 ->
->
-> > > > > > > > memory_region_transaction_commit
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > So, as we can see, QEMU uses the wrong base address
->
-> > > > > > > > to update the memory regions. Although the base
->
-> > > > > > > > address is set back to
->
-> > > > > > > >
->
-> > > > > > > > the correct value after the PCI driver in the VM writes the
->
-> > > > > > > > original value back, the virtio NIC on bus 4 may still
->
-> > > > > > > > send network packets concurrently, using
->
-> > > > > > > >
->
-> > > > > > > > the wrong memory region address.
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > >
->
-> > > > > > > > We have tried skipping the memory region update in
->
-> > > > > > > > QEMU when a PCI write with the value 0xffffffff is detected, and it
->
-> > > > > > > > does work, but
->
-> > > > > > > >
->
-> > > > > > > > this does not seem to be an elegant solution.
->
-> > > > > > >
->
-> > > > > > > For sure. But I'm still puzzled as to why Linux tries to
->
-> > > > > > > size the BAR of the bridge while a device behind it is in use.
->
-> > > > > > >
->
-> > > > > > > Can you please post your QEMU command line?
->
-> > > > > >
->
-> > > > > > My QEMU command line:
->
-> > > > > > /root/xyd/qemu-system-x86_64 -name
->
-> > > > > > guest=Linux,debug-threads=on -S -object
->
-> > > > > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/dom
->
-> > > > > > ain-
->
-> > > > > > 194-
->
-> > > > > > Linux/master-key.aes -machine
->
-> > > > > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
->
-> > > > > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
->
-> > > > > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off
->
-> > > > > > -smp
->
-> > > > > > 20,sockets=20,cores=1,threads=1 -numa
->
-> > > > > > node,nodeid=0,cpus=0-4,mem=1024 -numa
->
-> > > > > > node,nodeid=1,cpus=5-9,mem=1024 -numa
->
-> > > > > > node,nodeid=2,cpus=10-14,mem=1024 -numa
->
-> > > > > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
->
-> > > > > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config
->
-> > > > > > -nodefaults -chardev
->
-> > > > > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Li
->
-> > > > > > nux/
->
-> > > > > > moni
->
-> > > > > > tor.sock,server,nowait -mon
->
-> > > > > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc
->
-> > > > > > -no-hpet -global kvm-pit.lost_tick_policy=delay -no-shutdown
->
-> > > > > > -boot strict=on -device
->
-> > > > > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
->
-> > > > > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
->
-> > > > > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
->
-> > > > > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
->
-> > > > > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
->
-> > > > > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
->
-> > > > > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
->
-> > > > > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
->
-> > > > > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
->
-> > > > > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
->
-> > > > > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
->
-> > > > > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
->
-> > > > > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
->
-> > > > > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id
->
-> > > > > > =dri
->
-> > > > > > ve-v
->
-> > > > > > irtio-disk0,cache=none -device
->
-> > > > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-
->
-> > > > > > disk
->
-> > > > > > 0,id
->
-> > > > > > =virtio-disk0,bootindex=1 -drive
->
-> > > > > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
->
-> > > > > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1
->
-> > > > > > -netdev
->
-> > > > > > tap,fd=35,id=hostnet0 -device
->
-> > > > > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,b
->
-> > > > > > us=p
->
-> > > > > > ci.4
->
-> > > > > > ,addr=0x1 -chardev pty,id=charserial0 -device
->
-> > > > > > isa-serial,chardev=charserial0,id=serial0 -device
->
-> > > > > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
->
-> > > > > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
->
-> > > > > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
->
-> > > > > > timestamp=on
->
-> > > > > >
->
-> > > > > > I am also very curious about this issue, in the linux kernel
->
-> > > > > > code, maybe double
->
-> > > > > check in function pci_bridge_check_ranges triggered this problem.
->
-> > > > >
->
-> > > > > If you can get the stacktrace in Linux when it tries to write
->
-> > > > > this fffff value, that would be quite helpful.
->
-> > > > >
->
-> > > >
->
-> > > > After I add mdelay(100) in function pci_bridge_check_ranges, this
->
-> > > > phenomenon is easier to reproduce, below is my modify in kernel:
->
-> > > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
->
-> > > > index cb389277..86e232d 100644
->
-> > > > --- a/drivers/pci/setup-bus.c
->
-> > > > +++ b/drivers/pci/setup-bus.c
->
-> > > > @@ -27,7 +27,7 @@
->
-> > > >  #include <linux/slab.h>
->
-> > > >  #include <linux/acpi.h>
->
-> > > >  #include "pci.h"
->
-> > > > -
->
-> > > > +#include <linux/delay.h>
->
-> > > >  unsigned int pci_flags;
->
-> > > >
->
-> > > >  struct pci_dev_resource {
->
-> > > > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct
->
-> > > > pci_bus
->
-> > > *bus)
->
-> > > >                 pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
->
-> > > >                                                0xffffffff);
->
-> > > >                 pci_read_config_dword(bridge,
->
-> > > > PCI_PREF_BASE_UPPER32, &tmp);
->
-> > > > +               mdelay(100);
->
-> > > > +               printk(KERN_ERR "sleep\n");
->
-> > > > +                dump_stack();
->
-> > > >                 if (!tmp)
->
-> > > >                         b_res[2].flags &= ~IORESOURCE_MEM_64;
->
-> > > >                 pci_write_config_dword(bridge,
->
-> > > > PCI_PREF_BASE_UPPER32,
->
-> > > >
->
-> > >
->
-> > > OK!
->
-> > > I just sent a Linux patch that should help.
->
-> > > I would appreciate it if you will give it a try and if that helps
->
-> > > reply to it with a
->
-> > > Tested-by: tag.
->
-> > >
->
-> >
->
-> > I tested this patch and it works fine on my machine.
->
-> >
->
-> > But I have another question, if we only fix this problem in the
->
-> > kernel, the Linux version that has been released does not work well on the
->
-> virtualization platform.
->
-> > Is there a way to fix this problem in the backend?
->
->
->
-> There could we a way to work around this.
->
-> Does below help?
->
->
-I am sorry to tell you, I tested this patch and it doesn't work fine.
->
->
->
->
-> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-> 236a20eaa8..7834cac4b0 100644
->
-> --- a/hw/i386/acpi-build.c
->
-> +++ b/hw/i386/acpi-build.c
->
-> @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-> *parent_scope, PCIBus *bus,
->
->
->
->          aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
->
->          aml_append(method,
->
-> -            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check
->
-> */)
->
-> +            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device
->
-> + Check Light */)
->
->          );
->
->          aml_append(method,
->
->              aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request
->
-> */)
-Oh I see, another bug:
-
-        case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
-                acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT 
-event\n");
-                /* TBD: Exactly what does 'light' mean? */
-                break;
-
-And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
-and friends all just ignore this event type.
-
-
-
--- 
-MST
-
->
-> > > > > > > > > Hi all,
->
-> > > > > > > > >
->
-> > > > > > > > >
->
-> > > > > > > > >
->
-> > > > > > > > > In our test, we configured VM with several pci-bridges
->
-> > > > > > > > > and a virtio-net nic been attached with bus 4,
->
-> > > > > > > > >
->
-> > > > > > > > > After VM is startup, We ping this nic from host to
->
-> > > > > > > > > judge if it is working normally. Then, we hot add pci
->
-> > > > > > > > > devices to this VM with bus
->
-> > > > 0.
->
-> > > > > > > > >
->
-> > > > > > > > > We  found the virtio-net NIC in bus 4 is not working
->
-> > > > > > > > > (can not
->
-> > > > > > > > > connect) occasionally, as it kick virtio backend
->
-> > > > > > > > > failure with error
->
-> > > But I have another question, if we only fix this problem in the
->
-> > > kernel, the Linux version that has been released does not work
->
-> > > well on the
->
-> > virtualization platform.
->
-> > > Is there a way to fix this problem in the backend?
->
-> >
->
-> > There could we a way to work around this.
->
-> > Does below help?
->
->
->
-> I am sorry to tell you, I tested this patch and it doesn't work fine.
->
->
->
-> >
->
-> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-> > 236a20eaa8..7834cac4b0 100644
->
-> > --- a/hw/i386/acpi-build.c
->
-> > +++ b/hw/i386/acpi-build.c
->
-> > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-> > *parent_scope, PCIBus *bus,
->
-> >
->
-> >          aml_append(method, aml_store(aml_int(bsel_val),
->
-aml_name("BNUM")));
->
-> >          aml_append(method,
->
-> > -            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device
->
-> > Check
->
-*/)
->
-> > +            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /*
->
-> > + Device Check Light */)
->
-> >          );
->
-> >          aml_append(method,
->
-> >              aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject
->
-> > Request */)
->
->
->
-Oh I see, another bug:
->
->
-case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
->
-acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT
->
-event\n");
->
-/* TBD: Exactly what does 'light' mean? */
->
-break;
->
->
-And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
->
-and friends all just ignore this event type.
->
->
->
->
---
->
-MST
-Hi Michael,
-
-If we want to fix this problem on the backend, it is not enough to consider 
-only PCI
-device hot plugging, because I found that if we use a command like
-"echo 1 > /sys/bus/pci/rescan" in guest, this problem is very easy to reproduce.
-
-From the perspective of device emulation, when the guest writes 0xffffffff to the
-BAR,
-the guest just wants to get the size of the region, not actually update the
-address space.
-So I made the following patch to avoid updating the PCI mapping.
-
-Do you think this makes sense?
-
-[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
-
-When guest writes 0xffffffff to the BAR, guest just want to get the size of the 
-region
-but not really updating the address space.
-So when guest writes 0xffffffff to BAR, we need avoid pci_update_mappings 
-or pci_bridge_update_mappings.
-
-Signed-off-by: xuyandong <address@hidden>
----
- hw/pci/pci.c        | 6 ++++--
- hw/pci/pci_bridge.c | 8 +++++---
- 2 files changed, 9 insertions(+), 5 deletions(-)
-
-diff --git a/hw/pci/pci.c b/hw/pci/pci.c
-index 56b13b3..ef368e1 100644
---- a/hw/pci/pci.c
-+++ b/hw/pci/pci.c
-@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
-addr, uint32_t val_in, int
- {
-     int i, was_irq_disabled = pci_irq_disabled(d);
-     uint32_t val = val_in;
-+    uint64_t barmask = (1 << l*8) - 1;
- 
-     for (i = 0; i < l; val >>= 8, ++i) {
-         uint8_t wmask = d->wmask[addr + i];
-@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
-addr, uint32_t val_in, int
-         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
-         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
-     }
--    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
-+    if ((val_in != barmask &&
-+       (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
-         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
--        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
-+        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
-         range_covers_byte(addr, l, PCI_COMMAND))
-         pci_update_mappings(d);
- 
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
-index ee9dff2..f2bad79 100644
---- a/hw/pci/pci_bridge.c
-+++ b/hw/pci/pci_bridge.c
-@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
-     PCIBridge *s = PCI_BRIDGE(d);
-     uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
-     uint16_t newctl;
-+    uint64_t barmask = (1 << len * 8) - 1;
- 
-     pci_default_write_config(d, address, val, len);
- 
-     if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
- 
--        /* io base/limit */
--        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
-+        (val != barmask &&
-+       /* io base/limit */
-+        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
- 
-         /* memory base/limit, prefetchable base/limit and
-            io base/limit upper 16 */
--        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
-+        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
- 
-         /* vga enable */
-         ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
--- 
-1.8.3.1
-
-On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote:
->
-> > > > > > > > > > Hi all,
->
-> > > > > > > > > >
->
-> > > > > > > > > >
->
-> > > > > > > > > >
->
-> > > > > > > > > > In our test, we configured VM with several pci-bridges
->
-> > > > > > > > > > and a virtio-net nic been attached with bus 4,
->
-> > > > > > > > > >
->
-> > > > > > > > > > After VM is startup, We ping this nic from host to
->
-> > > > > > > > > > judge if it is working normally. Then, we hot add pci
->
-> > > > > > > > > > devices to this VM with bus
->
-> > > > > 0.
->
-> > > > > > > > > >
->
-> > > > > > > > > > We  found the virtio-net NIC in bus 4 is not working
->
-> > > > > > > > > > (can not
->
-> > > > > > > > > > connect) occasionally, as it kick virtio backend
->
-> > > > > > > > > > failure with error
->
->
-> > > > But I have another question, if we only fix this problem in the
->
-> > > > kernel, the Linux version that has been released does not work
->
-> > > > well on the
->
-> > > virtualization platform.
->
-> > > > Is there a way to fix this problem in the backend?
->
-> > >
->
-> > > There could we a way to work around this.
->
-> > > Does below help?
->
-> >
->
-> > I am sorry to tell you, I tested this patch and it doesn't work fine.
->
-> >
->
-> > >
->
-> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-> > > 236a20eaa8..7834cac4b0 100644
->
-> > > --- a/hw/i386/acpi-build.c
->
-> > > +++ b/hw/i386/acpi-build.c
->
-> > > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-> > > *parent_scope, PCIBus *bus,
->
-> > >
->
-> > >          aml_append(method, aml_store(aml_int(bsel_val),
->
-> aml_name("BNUM")));
->
-> > >          aml_append(method,
->
-> > > -            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device
->
-> > > Check
->
-> */)
->
-> > > +            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /*
->
-> > > + Device Check Light */)
->
-> > >          );
->
-> > >          aml_append(method,
->
-> > >              aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject
->
-> > > Request */)
->
->
->
->
->
-> Oh I see, another bug:
->
->
->
->         case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
->
->                 acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT
->
-> event\n");
->
->                 /* TBD: Exactly what does 'light' mean? */
->
->                 break;
->
->
->
-> And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
->
-> and friends all just ignore this event type.
->
->
->
->
->
->
->
-> --
->
-> MST
->
->
-Hi Michael,
->
->
-If we want to fix this problem on the backend, it is not enough to consider
->
-only PCI
->
-device hot plugging, because I found that if we use a command like
->
-"echo 1 > /sys/bus/pci/rescan" in guest, this problem is very easy to
->
-reproduce.
->
->
-From the perspective of device emulation, when guest writes 0xffffffff to the
->
-BAR,
->
-guest just want to get the size of the region but not really updating the
->
-address space.
->
-So I made the following patch to avoid  update pci mapping.
->
->
-Do you think this make sense?
->
->
-[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
->
->
-When guest writes 0xffffffff to the BAR, guest just want to get the size of
->
-the region
->
-but not really updating the address space.
->
-So when guest writes 0xffffffff to BAR, we need avoid pci_update_mappings
->
-or pci_bridge_update_mappings.
->
->
-Signed-off-by: xuyandong <address@hidden>
-I see how that will address the common case however there are a bunch of
-issues here.  First of all it's easy to trigger the update by some other
-action like VM migration.  More importantly it's just possible that
-guest actually does want to set the low 32 bit of the address to all
-ones.  For example, that is clearly listed as a way to disable all
-devices behind the bridge in the pci to pci bridge spec.
-
-Given upstream is dragging its feet I'm open to adding a flag
-that will help keep guests going as a temporary measure.
-We will need to think about ways to restrict this as much as
-we can.
-
-
->
----
->
-hw/pci/pci.c        | 6 ++++--
->
-hw/pci/pci_bridge.c | 8 +++++---
->
-2 files changed, 9 insertions(+), 5 deletions(-)
->
->
-diff --git a/hw/pci/pci.c b/hw/pci/pci.c
->
-index 56b13b3..ef368e1 100644
->
---- a/hw/pci/pci.c
->
-+++ b/hw/pci/pci.c
->
-@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t
->
-addr, uint32_t val_in, int
->
-{
->
-int i, was_irq_disabled = pci_irq_disabled(d);
->
-uint32_t val = val_in;
->
-+    uint64_t barmask = (1 << l*8) - 1;
->
->
-for (i = 0; i < l; val >>= 8, ++i) {
->
-uint8_t wmask = d->wmask[addr + i];
->
-@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t
->
-addr, uint32_t val_in, int
->
-d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
->
-d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
->
-}
->
--    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-+    if ((val_in != barmask &&
->
-+     (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
->
--        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
->
-+        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
->
-range_covers_byte(addr, l, PCI_COMMAND))
->
-pci_update_mappings(d);
->
->
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
->
-index ee9dff2..f2bad79 100644
->
---- a/hw/pci/pci_bridge.c
->
-+++ b/hw/pci/pci_bridge.c
->
-@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
->
-PCIBridge *s = PCI_BRIDGE(d);
->
-uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
->
-uint16_t newctl;
->
-+    uint64_t barmask = (1 << len * 8) - 1;
->
->
-pci_default_write_config(d, address, val, len);
->
->
-if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
--        /* io base/limit */
->
--        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-+        (val != barmask &&
->
-+     /* io base/limit */
->
-+        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
->
-/* memory base/limit, prefetchable base/limit and
->
-io base/limit upper 16 */
->
--        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-+        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
->
->
-/* vga enable */
->
-ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
---
->
-1.8.3.1
->
->
->
-
->
------Original Message-----
->
-From: Michael S. Tsirkin [
-mailto:address@hidden
->
-Sent: Monday, January 07, 2019 11:06 PM
->
-To: xuyandong <address@hidden>
->
-Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu-
->
-address@hidden; Zhanghailiang <address@hidden>;
->
-wangxin (U) <address@hidden>; Huangweidong (C)
->
-<address@hidden>
->
-Subject: Re: [BUG]Unassigned mem write during pci device hot-plug
->
->
-On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote:
->
-> > > > > > > > > > > Hi all,
->
-> > > > > > > > > > >
->
-> > > > > > > > > > >
->
-> > > > > > > > > > >
->
-> > > > > > > > > > > In our test, we configured VM with several
->
-> > > > > > > > > > > pci-bridges and a virtio-net nic been attached
->
-> > > > > > > > > > > with bus 4,
->
-> > > > > > > > > > >
->
-> > > > > > > > > > > After VM is startup, We ping this nic from host to
->
-> > > > > > > > > > > judge if it is working normally. Then, we hot add
->
-> > > > > > > > > > > pci devices to this VM with bus
->
-> > > > > > 0.
->
-> > > > > > > > > > >
->
-> > > > > > > > > > > We  found the virtio-net NIC in bus 4 is not
->
-> > > > > > > > > > > working (can not
->
-> > > > > > > > > > > connect) occasionally, as it kick virtio backend
->
-> > > > > > > > > > > failure with error
->
->
->
-> > > > > But I have another question, if we only fix this problem in
->
-> > > > > the kernel, the Linux version that has been released does not
->
-> > > > > work well on the
->
-> > > > virtualization platform.
->
-> > > > > Is there a way to fix this problem in the backend?
->
->
->
-> Hi Michael,
->
->
->
-> If we want to fix this problem on the backend, it is not enough to
->
-> consider only PCI device hot plugging, because I found that if we use
->
-> a command like "echo 1 > /sys/bus/pci/rescan" in guest, this problem is very
->
-easy to reproduce.
->
->
->
-> From the perspective of device emulation, when guest writes 0xffffffff
->
-> to the BAR, guest just want to get the size of the region but not really
->
-updating the address space.
->
-> So I made the following patch to avoid  update pci mapping.
->
->
->
-> Do you think this make sense?
->
->
->
-> [PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
->
->
->
-> When guest writes 0xffffffff to the BAR, guest just want to get the
->
-> size of the region but not really updating the address space.
->
-> So when guest writes 0xffffffff to BAR, we need avoid
->
-> pci_update_mappings or pci_bridge_update_mappings.
->
->
->
-> Signed-off-by: xuyandong <address@hidden>
->
->
-I see how that will address the common case however there are a bunch of
->
-issues here.  First of all it's easy to trigger the update by some other
->
-action like
->
-VM migration.  More importantly it's just possible that guest actually does
->
-want
->
-to set the low 32 bit of the address to all ones.  For example, that is
->
-clearly
->
-listed as a way to disable all devices behind the bridge in the pci to pci
->
-bridge
->
-spec.
-Ok, I see. What if I only skip the update when the guest writes 0xFFFFFFFF to the Prefetchable
-Base Upper 32 Bits register,
-to work around the kernel double-check problem?
-Do you think there is still risk?
-
->
->
-Given upstream is dragging it's feet I'm open to adding a flag that will help
->
-keep guests going as a temporary measure.
->
-We will need to think about ways to restrict this as much as we can.
->
->
->
-> ---
->
->  hw/pci/pci.c        | 6 ++++--
->
->  hw/pci/pci_bridge.c | 8 +++++---
->
->  2 files changed, 9 insertions(+), 5 deletions(-)
->
->
->
-> diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644
->
-> --- a/hw/pci/pci.c
->
-> +++ b/hw/pci/pci.c
->
-> @@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d,
->
-> uint32_t addr, uint32_t val_in, int  {
->
->      int i, was_irq_disabled = pci_irq_disabled(d);
->
->      uint32_t val = val_in;
->
-> +    uint64_t barmask = (1 << l*8) - 1;
->
->
->
->      for (i = 0; i < l; val >>= 8, ++i) {
->
->          uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@
->
-> void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in,
->
-int
->
->          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val &
->
-> wmask);
->
->          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear
->
-> */
->
->      }
->
-> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-> +    if ((val_in != barmask &&
->
-> +   (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
->          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
->
-> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
->
-> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
->
->          range_covers_byte(addr, l, PCI_COMMAND))
->
->          pci_update_mappings(d);
->
->
->
-> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index
->
-> ee9dff2..f2bad79 100644
->
-> --- a/hw/pci/pci_bridge.c
->
-> +++ b/hw/pci/pci_bridge.c
->
-> @@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
->
->      PCIBridge *s = PCI_BRIDGE(d);
->
->      uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
->
->      uint16_t newctl;
->
-> +    uint64_t barmask = (1 << len * 8) - 1;
->
->
->
->      pci_default_write_config(d, address, val, len);
->
->
->
->      if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
->
-> -        /* io base/limit */
->
-> -        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-> +        (val != barmask &&
->
-> +   /* io base/limit */
->
-> +        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
->
->
->          /* memory base/limit, prefetchable base/limit and
->
->             io base/limit upper 16 */
->
-> -        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-> +        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
->
->
->
->          /* vga enable */
->
->          ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
-> --
->
-> 1.8.3.1
->
->
->
->
->
->
-
-On Mon, Jan 07, 2019 at 03:28:36PM +0000, xuyandong wrote:
->
->
->
-> -----Original Message-----
->
-> From: Michael S. Tsirkin [
-mailto:address@hidden
->
-> Sent: Monday, January 07, 2019 11:06 PM
->
-> To: xuyandong <address@hidden>
->
-> Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu-
->
-> address@hidden; Zhanghailiang <address@hidden>;
->
-> wangxin (U) <address@hidden>; Huangweidong (C)
->
-> <address@hidden>
->
-> Subject: Re: [BUG]Unassigned mem write during pci device hot-plug
->
->
->
-> On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote:
->
-> > > > > > > > > > > > Hi all,
->
-> > > > > > > > > > > >
->
-> > > > > > > > > > > >
->
-> > > > > > > > > > > >
->
-> > > > > > > > > > > > In our test, we configured VM with several
->
-> > > > > > > > > > > > pci-bridges and a virtio-net nic been attached
->
-> > > > > > > > > > > > with bus 4,
->
-> > > > > > > > > > > >
->
-> > > > > > > > > > > > After VM is startup, We ping this nic from host to
->
-> > > > > > > > > > > > judge if it is working normally. Then, we hot add
->
-> > > > > > > > > > > > pci devices to this VM with bus
->
-> > > > > > > 0.
->
-> > > > > > > > > > > >
->
-> > > > > > > > > > > > We  found the virtio-net NIC in bus 4 is not
->
-> > > > > > > > > > > > working (can not
->
-> > > > > > > > > > > > connect) occasionally, as it kick virtio backend
->
-> > > > > > > > > > > > failure with error
->
-> >
->
-> > > > > > But I have another question, if we only fix this problem in
->
-> > > > > > the kernel, the Linux version that has been released does not
->
-> > > > > > work well on the
->
-> > > > > virtualization platform.
->
-> > > > > > Is there a way to fix this problem in the backend?
->
-> >
->
-> > Hi Michael,
->
-> >
->
-> > If we want to fix this problem on the backend, it is not enough to
->
-> > consider only PCI device hot plugging, because I found that if we use
->
-> > a command like "echo 1 > /sys/bus/pci/rescan" in guest, this problem is
->
-> > very
->
-> easy to reproduce.
->
-> >
->
-> > From the perspective of device emulation, when guest writes 0xffffffff
->
-> > to the BAR, guest just want to get the size of the region but not really
->
-> updating the address space.
->
-> > So I made the following patch to avoid  update pci mapping.
->
-> >
->
-> > Do you think this make sense?
->
-> >
->
-> > [PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
->
-> >
->
-> > When guest writes 0xffffffff to the BAR, guest just want to get the
->
-> > size of the region but not really updating the address space.
->
-> > So when guest writes 0xffffffff to BAR, we need avoid
->
-> > pci_update_mappings or pci_bridge_update_mappings.
->
-> >
->
-> > Signed-off-by: xuyandong <address@hidden>
->
->
->
-> I see how that will address the common case however there are a bunch of
->
-> issues here.  First of all it's easy to trigger the update by some other
->
-> action like
->
-> VM migration.  More importantly it's just possible that guest actually does
->
-> want
->
-> to set the low 32 bit of the address to all ones.  For example, that is
->
-> clearly
->
-> listed as a way to disable all devices behind the bridge in the pci to pci
->
-> bridge
->
-> spec.
->
->
-Ok, I see. If I only skip upate when guest writing 0xFFFFFFFF to Prefetcable
->
-Base Upper 32 Bits
->
-to meet the kernel double check problem.
->
-Do you think there is still risk?
-Well, it's non-zero since the spec says such a write should disable all
-accesses. Just an idea: why not add an option to disable the upper 32 bits?
-That is ugly and limits space, but it is spec compliant.
-
->
->
->
-> Given upstream is dragging it's feet I'm open to adding a flag that will
->
-> help
->
-> keep guests going as a temporary measure.
->
-> We will need to think about ways to restrict this as much as we can.
->
->
->
->
->
-> > ---
->
-> >  hw/pci/pci.c        | 6 ++++--
->
-> >  hw/pci/pci_bridge.c | 8 +++++---
->
-> >  2 files changed, 9 insertions(+), 5 deletions(-)
->
-> >
->
-> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644
->
-> > --- a/hw/pci/pci.c
->
-> > +++ b/hw/pci/pci.c
->
-> > @@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d,
->
-> > uint32_t addr, uint32_t val_in, int  {
->
-> >      int i, was_irq_disabled = pci_irq_disabled(d);
->
-> >      uint32_t val = val_in;
->
-> > +    uint64_t barmask = (1 << l*8) - 1;
->
-> >
->
-> >      for (i = 0; i < l; val >>= 8, ++i) {
->
-> >          uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@
->
-> > void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t
->
-> > val_in,
->
-> int
->
-> >          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val &
->
-> > wmask);
->
-> >          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to
->
-> > Clear */
->
-> >      }
->
-> > -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-> > +    if ((val_in != barmask &&
->
-> > + (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-> >          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
->
-> > -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
->
-> > +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
->
-> >          range_covers_byte(addr, l, PCI_COMMAND))
->
-> >          pci_update_mappings(d);
->
-> >
->
-> > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index
->
-> > ee9dff2..f2bad79 100644
->
-> > --- a/hw/pci/pci_bridge.c
->
-> > +++ b/hw/pci/pci_bridge.c
->
-> > @@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
->
-> >      PCIBridge *s = PCI_BRIDGE(d);
->
-> >      uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
->
-> >      uint16_t newctl;
->
-> > +    uint64_t barmask = (1 << len * 8) - 1;
->
-> >
->
-> >      pci_default_write_config(d, address, val, len);
->
-> >
->
-> >      if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
-> >
->
-> > -        /* io base/limit */
->
-> > -        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-> > +        (val != barmask &&
->
-> > + /* io base/limit */
->
-> > +        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-> >
->
-> >          /* memory base/limit, prefetchable base/limit and
->
-> >             io base/limit upper 16 */
->
-> > -        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-> > +        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
->
-> >
->
-> >          /* vga enable */
->
-> >          ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
-> > --
->
-> > 1.8.3.1
->
-> >
->
-> >
->
-> >
-
->
------Original Message-----
->
-From: xuyandong
->
-Sent: Monday, January 07, 2019 10:37 PM
->
-To: 'Michael S. Tsirkin' <address@hidden>
->
-Cc: address@hidden; Paolo Bonzini <address@hidden>; qemu-
->
-address@hidden; Zhanghailiang <address@hidden>;
->
-wangxin (U) <address@hidden>; Huangweidong (C)
->
-<address@hidden>
->
-Subject: RE: [BUG]Unassigned mem write during pci device hot-plug
->
->
-> > > > > > > > > > Hi all,
->
-> > > > > > > > > >
->
-> > > > > > > > > >
->
-> > > > > > > > > >
->
-> > > > > > > > > > In our test, we configured VM with several
->
-> > > > > > > > > > pci-bridges and a virtio-net nic been attached with
->
-> > > > > > > > > > bus 4,
->
-> > > > > > > > > >
->
-> > > > > > > > > > After VM is startup, We ping this nic from host to
->
-> > > > > > > > > > judge if it is working normally. Then, we hot add
->
-> > > > > > > > > > pci devices to this VM with bus
->
-> > > > > 0.
->
-> > > > > > > > > >
->
-> > > > > > > > > > We  found the virtio-net NIC in bus 4 is not working
->
-> > > > > > > > > > (can not
->
-> > > > > > > > > > connect) occasionally, as it kick virtio backend
->
-> > > > > > > > > > failure with error
->
->
-> > > > But I have another question, if we only fix this problem in the
->
-> > > > kernel, the Linux version that has been released does not work
->
-> > > > well on the
->
-> > > virtualization platform.
->
-> > > > Is there a way to fix this problem in the backend?
->
-> > >
->
-> > > There could we a way to work around this.
->
-> > > Does below help?
->
-> >
->
-> > I am sorry to tell you, I tested this patch and it doesn't work fine.
->
-> >
->
-> > >
->
-> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
->
-> > > 236a20eaa8..7834cac4b0 100644
->
-> > > --- a/hw/i386/acpi-build.c
->
-> > > +++ b/hw/i386/acpi-build.c
->
-> > > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
->
-> > > *parent_scope, PCIBus *bus,
->
-> > >
->
-> > >          aml_append(method, aml_store(aml_int(bsel_val),
->
-> aml_name("BNUM")));
->
-> > >          aml_append(method,
->
-> > > -            aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device
->
-Check
->
-> */)
->
-> > > +            aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /*
->
-> > > + Device Check Light */)
->
-> > >          );
->
-> > >          aml_append(method,
->
-> > >              aml_call2("DVNT", aml_name("PCID"), aml_int(3)/*
->
-> > > Eject Request */)
->
->
->
->
->
-> Oh I see, another bug:
->
->
->
->         case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
->
->                 acpi_handle_debug(handle,
->
-> "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n");
->
->                 /* TBD: Exactly what does 'light' mean? */
->
->                 break;
->
->
->
-> And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32
->
-> type) and friends all just ignore this event type.
->
->
->
->
->
->
->
-> --
->
-> MST
->
->
-Hi Michael,
->
->
-If we want to fix this problem on the backend, it is not enough to consider
->
-only
->
-PCI device hot plugging, because I found that if we use a command like "echo
->
-1 >
->
-/sys/bus/pci/rescan" in guest, this problem is very easy to reproduce.
->
->
-From the perspective of device emulation, when the guest writes 0xffffffff to the
->
-BAR, it only wants to get the size of the region, not actually update the
->
-address space.
->
-So I made the following patch to avoid updating the PCI mappings.
->
->
-Do you think this makes sense?
->
->
-[PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
->
->
-When the guest writes 0xffffffff to a BAR, it only wants to get the size of
->
-the
->
-region, not actually update the address space.
->
-So when the guest writes 0xffffffff to a BAR, we need to avoid pci_update_mappings or
->
-pci_bridge_update_mappings.
->
->
-Signed-off-by: xuyandong <address@hidden>
->
----
->
-hw/pci/pci.c        | 6 ++++--
->
-hw/pci/pci_bridge.c | 8 +++++---
->
-2 files changed, 9 insertions(+), 5 deletions(-)
->
->
-diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644
->
---- a/hw/pci/pci.c
->
-+++ b/hw/pci/pci.c
->
-@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t
->
-addr, uint32_t val_in, int  {
->
-int i, was_irq_disabled = pci_irq_disabled(d);
->
-uint32_t val = val_in;
->
-+    uint64_t barmask = (1 << l*8) - 1;
->
->
-for (i = 0; i < l; val >>= 8, ++i) {
->
-uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@ void
->
-pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
->
-d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
->
-d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
->
-}
->
--    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-+    if ((val_in != barmask &&
->
-+     (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
->
-ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
->
--        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
->
-+        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
->
-range_covers_byte(addr, l, PCI_COMMAND))
->
-pci_update_mappings(d);
->
->
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index ee9dff2..f2bad79
->
-100644
->
---- a/hw/pci/pci_bridge.c
->
-+++ b/hw/pci/pci_bridge.c
->
-@@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
->
-PCIBridge *s = PCI_BRIDGE(d);
->
-uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
->
-uint16_t newctl;
->
-+    uint64_t barmask = (1 << len * 8) - 1;
->
->
-pci_default_write_config(d, address, val, len);
->
->
-if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
->
->
--        /* io base/limit */
->
--        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
-+        (val != barmask &&
->
-+     /* io base/limit */
->
-+        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
->
->
-/* memory base/limit, prefetchable base/limit and
->
-io base/limit upper 16 */
->
--        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
->
-+        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
->
->
-/* vga enable */
->
-ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
->
---
->
-1.8.3.1
->
->
-Sorry, please ignore the patch above.
-
-Here is the patch I want to post:
-
-diff --git a/hw/pci/pci.c b/hw/pci/pci.c
-index 56b13b3..38a300f 100644
---- a/hw/pci/pci.c
-+++ b/hw/pci/pci.c
-@@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
-addr, uint32_t val_in, int
- {
-     int i, was_irq_disabled = pci_irq_disabled(d);
-     uint32_t val = val_in;
-+    uint64_t barmask = ((uint64_t)1 << l*8) - 1;
- 
-     for (i = 0; i < l; val >>= 8, ++i) {
-         uint8_t wmask = d->wmask[addr + i];
-@@ -1369,9 +1370,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
-addr, uint32_t val_in, int
-         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
-         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
-     }
--    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
-+    if ((val_in != barmask &&
-+       (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
-         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
--        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
-+        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
-         range_covers_byte(addr, l, PCI_COMMAND))
-         pci_update_mappings(d);
- 
-diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
-index ee9dff2..b8f7d48 100644
---- a/hw/pci/pci_bridge.c
-+++ b/hw/pci/pci_bridge.c
-@@ -253,20 +253,22 @@ void pci_bridge_write_config(PCIDevice *d,
-     PCIBridge *s = PCI_BRIDGE(d);
-     uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
-     uint16_t newctl;
-+    uint64_t barmask = ((uint64_t)1 << len * 8) - 1;
- 
-     pci_default_write_config(d, address, val, len);
- 
-     if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
- 
--        /* io base/limit */
--        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
-+        /* vga enable */
-+        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2) ||
- 
--        /* memory base/limit, prefetchable base/limit and
--           io base/limit upper 16 */
--        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
-+        (val != barmask &&
-+        /* io base/limit */
-+         (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
- 
--        /* vga enable */
--        ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
-+         /* memory base/limit, prefetchable base/limit and
-+            io base/limit upper 16 */
-+         ranges_overlap(address, len, PCI_MEMORY_BASE, 20)))) {
-         pci_bridge_update_mappings(s);
-     }
- 
--- 
-1.8.3.1
-
diff --git a/results/classifier/016/virtual/70416488 b/results/classifier/016/virtual/70416488
deleted file mode 100644
index 8e21bd76..00000000
--- a/results/classifier/016/virtual/70416488
+++ /dev/null
@@ -1,1206 +0,0 @@
-virtual: 0.872
-debug: 0.857
-boot: 0.817
-kernel: 0.804
-hypervisor: 0.767
-arm: 0.323
-KVM: 0.289
-operating system: 0.251
-TCG: 0.103
-VMM: 0.064
-device: 0.047
-PID: 0.037
-register: 0.032
-files: 0.027
-assembly: 0.017
-semantic: 0.016
-socket: 0.015
-peripherals: 0.013
-user-level: 0.009
-performance: 0.009
-vnc: 0.007
-architecture: 0.006
-risc-v: 0.004
-network: 0.003
-alpha: 0.003
-permissions: 0.002
-graphic: 0.002
-ppc: 0.002
-x86: 0.000
-mistranslation: 0.000
-i386: 0.000
-
-[Bug Report] smmuv3 event 0x10 report when running virtio-blk-pci
-
-Hi All,
-
-When I tested mainline QEMU (commit 7b87a25f49), it reported smmuv3 event 0x10
-during kernel boot.
-
-The QEMU command I used is as follows:
-
-qemu-system-aarch64 -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
--kernel Image -initrd minifs.cpio.gz \
--enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
--append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
--device 
-pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2 
-\
--device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
--device 
-virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
--drive file=/home/boot.img,if=none,id=drive0,format=raw
-
-smmuv3 event 0x10 log:
-[...]
-[    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
-[    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
-[    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
-[    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks (1.07 
-GB/1.00 GiB)
-[    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
-[    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
-[    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
-[    1.968381] clk: Disabling unused clocks
-[    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
-[    1.968990] PM: genpd: Disabling unused power domains
-[    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.969814] ALSA device list:
-[    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.970471]   No soundcards found.
-[    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
-[    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
-[    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
-[    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
-[    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.975005] Freeing unused kernel memory: 10112K
-[    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
-[    1.975442] Run init as init process
-
-One more data point: if "maxcpus=3" is removed from the
-kernel command line,
-the problem goes away.
-
-I am not sure whether this is a bug in the vSMMU. I would very much
-appreciate it if anyone who knows about this issue
-could take a look at it.
-
-Thanks,
-Zhou
-
-On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->
-Hi All,
->
->
-When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
-during kernel booting up.
-Does it still do this if you either:
- (1) use the v9.1.0 release (commit fd1952d814da)
- (2) use "-machine virt-9.1" instead of "-machine virt"
-
-?
-
-My suspicion is that this will have started happening now that
-we expose an SMMU with two-stage translation support to the guest
-in the "virt" machine type (which we do not if you either
-use virt-9.1 or in the v9.1.0 release).
-
-I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of
-the two-stage support).
-
->
-qemu command which I use is as below:
->
->
-qemu-system-aarch64 -machine
->
-virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
--kernel Image -initrd minifs.cpio.gz \
->
--enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
--append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
--device
->
-pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
-\
->
--device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
--device
->
-virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
--drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->
-smmuv3 event 0x10 log:
->
-[...]
->
-[    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
-[    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
-[    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
-[    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
-(1.07 GB/1.00 GiB)
->
-[    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
-[    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.968381] clk: Disabling unused clocks
->
-[    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.968990] PM: genpd: Disabling unused power domains
->
-[    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.969814] ALSA device list:
->
-[    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.970471]   No soundcards found.
->
-[    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.975005] Freeing unused kernel memory: 10112K
->
-[    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.975442] Run init as init process
->
->
-Another information is that if "maxcpus=3" is removed from the kernel command
->
-line,
->
-it will be OK.
->
->
-I am not sure if there is a bug about vsmmu. It will be very appreciated if
->
-anyone
->
-know this issue or can take a look at it.
-thanks
--- PMM
-
-On 2024/9/9 22:31, Peter Maydell wrote:
->
-On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->
->
-> Hi All,
->
->
->
-> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
-> during kernel booting up.
->
->
-Does it still do this if you either:
->
-(1) use the v9.1.0 release (commit fd1952d814da)
->
-(2) use "-machine virt-9.1" instead of "-machine virt"
-I tested the above two cases; the problem is still there.
-
->
->
-?
->
->
-My suspicion is that this will have started happening now that
->
-we expose an SMMU with two-stage translation support to the guest
->
-in the "virt" machine type (which we do not if you either
->
-use virt-9.1 or in the v9.1.0 release).
->
->
-I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of
->
-the two-stage support).
->
->
-> qemu command which I use is as below:
->
->
->
-> qemu-system-aarch64 -machine
->
-> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
-> -kernel Image -initrd minifs.cpio.gz \
->
-> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
-> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
-> -device
->
-> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
->  \
->
-> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
-> -device
->
-> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
-> -drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->
->
-> smmuv3 event 0x10 log:
->
-> [...]
->
-> [    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
-> [    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
-> [    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
-> [    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
-> (1.07 GB/1.00 GiB)
->
-> [    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
-> [    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.968381] clk: Disabling unused clocks
->
-> [    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.968990] PM: genpd: Disabling unused power domains
->
-> [    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.969814] ALSA device list:
->
-> [    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.970471]   No soundcards found.
->
-> [    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.975005] Freeing unused kernel memory: 10112K
->
-> [    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.975442] Run init as init process
->
->
->
-> Another information is that if "maxcpus=3" is removed from the kernel
->
-> command line,
->
-> it will be OK.
->
->
->
-> I am not sure if there is a bug about vsmmu. It will be very appreciated if
->
-> anyone
->
-> know this issue or can take a look at it.
->
->
-thanks
->
--- PMM
->
-.
-
-Hi Zhou,
-On 9/10/24 03:24, Zhou Wang via wrote:
->
-On 2024/9/9 22:31, Peter Maydell wrote:
->
-> On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->> Hi All,
->
->>
->
->> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
->> during kernel booting up.
->
-> Does it still do this if you either:
->
->  (1) use the v9.1.0 release (commit fd1952d814da)
->
->  (2) use "-machine virt-9.1" instead of "-machine virt"
->
-I tested the above two cases; the problem is still there.
-Thank you for reporting. I am able to reproduce it, and indeed the
-maxcpus kernel option is triggering the issue. It works without it. I will
-come back to you asap.
-
-Eric
->
->
-> ?
->
->
->
-> My suspicion is that this will have started happening now that
->
-> we expose an SMMU with two-stage translation support to the guest
->
-> in the "virt" machine type (which we do not if you either
->
-> use virt-9.1 or in the v9.1.0 release).
->
->
->
-> I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of
->
-> the two-stage support).
->
->
->
->> qemu command which I use is as below:
->
->>
->
->> qemu-system-aarch64 -machine
->
->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
->> -kernel Image -initrd minifs.cpio.gz \
->
->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
->> -device
->
->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
->>  \
->
->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
->> -device
->
->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
->> -drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->>
->
->> smmuv3 event 0x10 log:
->
->> [...]
->
->> [    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
->> [    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
->> [    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
->> [    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
->> (1.07 GB/1.00 GiB)
->
->> [    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
->> [    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.968381] clk: Disabling unused clocks
->
->> [    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.968990] PM: genpd: Disabling unused power domains
->
->> [    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.969814] ALSA device list:
->
->> [    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.970471]   No soundcards found.
->
->> [    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975005] Freeing unused kernel memory: 10112K
->
->> [    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975442] Run init as init process
->
->>
->
->> Another information is that if "maxcpus=3" is removed from the kernel
->
->> command line,
->
->> it will be OK.
->
->>
->
->> I am not sure if there is a bug about vsmmu. It will be very appreciated if
->
->> anyone
->
->> know this issue or can take a look at it.
->
-> thanks
->
-> -- PMM
->
-> .
-
-Hi,
-
-On 9/10/24 03:24, Zhou Wang via wrote:
->
-On 2024/9/9 22:31, Peter Maydell wrote:
->
-> On Mon, 9 Sept 2024 at 15:22, Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->> Hi All,
->
->>
->
->> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
->> during kernel booting up.
->
-> Does it still do this if you either:
->
->  (1) use the v9.1.0 release (commit fd1952d814da)
->
->  (2) use "-machine virt-9.1" instead of "-machine virt"
->
-I tested the above two cases; the problem is still there.
-I have not made much progress yet, but I see it comes with the following
-QEMU traces.
-
-smmuv3-iommu-memory-region-0-0 translation failed for iova=0x0
-(SMMU_EVT_F_TRANSLATION)
-../..
-qemu-system-aarch64: virtio-blk failed to set guest notifier (-22),
-ensure -accel kvm is set.
-qemu-system-aarch64: virtio_bus_start_ioeventfd: failed. Fallback to
-userspace (slower).
-
-The PCIe host bridge seems to cause that translation failure at iova=0.
-
-Also virtio-iommu has the same issue:
-qemu-system-aarch64: virtio_iommu_translate no mapping for 0x0 for sid=1024
-qemu-system-aarch64: virtio-blk failed to set guest notifier (-22),
-ensure -accel kvm is set.
-qemu-system-aarch64: virtio_bus_start_ioeventfd: failed. Fallback to
-userspace (slower).
-
-This only happens with maxcpus=3. Note that virtio-blk-pci is not protected by
-the vIOMMU in your case.
-
-Thanks
-
-Eric
-
->
->
-> ?
->
->
->
-> My suspicion is that this will have started happening now that
->
-> we expose an SMMU with two-stage translation support to the guest
->
-> in the "virt" machine type (which we do not if you either
->
-> use virt-9.1 or in the v9.1.0 release).
->
->
->
-> I've cc'd Eric (smmuv3 maintainer) and Mostafa (author of
->
-> the two-stage support).
->
->
->
->> qemu command which I use is as below:
->
->>
->
->> qemu-system-aarch64 -machine
->
->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
->> -kernel Image -initrd minifs.cpio.gz \
->
->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
->> -device
->
->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
->>  \
->
->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
->> -device
->
->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
->> -drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->>
->
->> smmuv3 event 0x10 log:
->
->> [...]
->
->> [    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
->> [    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
->> [    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
->> [    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
->> (1.07 GB/1.00 GiB)
->
->> [    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
->> [    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.968381] clk: Disabling unused clocks
->
->> [    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.968990] PM: genpd: Disabling unused power domains
->
->> [    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.969814] ALSA device list:
->
->> [    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.970471]   No soundcards found.
->
->> [    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975005] Freeing unused kernel memory: 10112K
->
->> [    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975442] Run init as init process
->
->>
->
->> Another information is that if "maxcpus=3" is removed from the kernel
->
->> command line,
->
->> it will be OK.
->
->>
->
->> I am not sure if there is a bug about vsmmu. It will be very appreciated if
->
->> anyone
->
->> know this issue or can take a look at it.
->
-> thanks
->
-> -- PMM
->
-> .
-
-Hi Zhou,
-
-On Mon, Sep 9, 2024 at 3:22 PM Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->
-Hi All,
->
->
-When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
-during kernel booting up.
->
->
-qemu command which I use is as below:
->
->
-qemu-system-aarch64 -machine
->
-virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
--kernel Image -initrd minifs.cpio.gz \
->
--enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
--append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
--device
->
-pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
-\
->
--device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
--device
->
-virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
--drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->
-smmuv3 event 0x10 log:
->
-[...]
->
-[    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
-[    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
-[    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
-[    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
-(1.07 GB/1.00 GiB)
->
-[    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
-[    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.968381] clk: Disabling unused clocks
->
-[    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.968990] PM: genpd: Disabling unused power domains
->
-[    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.969814] ALSA device list:
->
-[    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.970471]   No soundcards found.
->
-[    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-[    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-[    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-[    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.975005] Freeing unused kernel memory: 10112K
->
-[    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-[    1.975442] Run init as init process
->
->
-Another information is that if "maxcpus=3" is removed from the kernel command
->
-line,
->
-it will be OK.
->
-That's interesting, not sure how that would be related.
-
->
-I am not sure if there is a bug about vsmmu. It will be very appreciated if
->
-anyone
->
-know this issue or can take a look at it.
->
-Can you please provide logs after adding "-d trace:smmu*" to the QEMU invocation?
-
-Also, if possible, can you please tell me which Linux kernel version
-you are using? I will see if I can reproduce it.
-
-Thanks,
-Mostafa
-
->
-Thanks,
->
-Zhou
->
->
->
-
-On 2024/9/9 22:47, Mostafa Saleh wrote:
->
-Hi Zhou,
->
->
-On Mon, Sep 9, 2024 at 3:22 PM Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->
->
-> Hi All,
->
->
->
-> When I tested mainline qemu(commit 7b87a25f49), it reports smmuv3 event 0x10
->
-> during kernel booting up.
->
->
->
-> qemu command which I use is as below:
->
->
->
-> qemu-system-aarch64 -machine
->
-> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
-> -kernel Image -initrd minifs.cpio.gz \
->
-> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
-> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
-> -device
->
-> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
->  \
->
-> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1 \
->
-> -device
->
-> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
-> -drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->
->
-> smmuv3 event 0x10 log:
->
-> [...]
->
-> [    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
-> [    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
-> [    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
-> [    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
-> (1.07 GB/1.00 GiB)
->
-> [    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
-> [    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.968381] clk: Disabling unused clocks
->
-> [    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.968990] PM: genpd: Disabling unused power domains
->
-> [    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.969814] ALSA device list:
->
-> [    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.970471]   No soundcards found.
->
-> [    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
-> [    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
-> [    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
-> [    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.975005] Freeing unused kernel memory: 10112K
->
-> [    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
-> [    1.975442] Run init as init process
->
->
->
-> Another information is that if "maxcpus=3" is removed from the kernel
->
-> command line,
->
-> it will be OK.
->
->
->
->
-That's interesting, not sure how that would be related.
->
->
-> I am not sure whether this is a bug in the vSMMU. It would be much appreciated if
->
-> anyone
->
-> knows about this issue or can take a look at it.
->
->
->
->
-Can you please provide logs with "-d trace:smmu*" added to the qemu invocation?
-Sure. Please see the attached log (using the above qemu commit and command).
-
->
->
-Also if possible, can you please provide which Linux kernel version
->
-you are using, I will see if I can repro.
-I just use the latest mainline kernel (commit b831f83e40a2) with defconfig.
-
-Thanks,
-Zhou
-
->
->
-Thanks,
->
-Mostafa
->
->
-> Thanks,
->
-> Zhou
->
->
->
->
->
->
->
->
-.
-qemu_boot_log.txt
-Description:
-Text document
-
-On Tue, Sep 10, 2024 at 2:51 AM Zhou Wang <wangzhou1@hisilicon.com> wrote:
->
->
-On 2024/9/9 22:47, Mostafa Saleh wrote:
->
-> Hi Zhou,
->
->
->
-> On Mon, Sep 9, 2024 at 3:22 PM Zhou Wang via <qemu-devel@nongnu.org> wrote:
->
->>
->
->> Hi All,
->
->>
->
->> When I tested mainline qemu (commit 7b87a25f49), it reported smmuv3 event
->
->> 0x10
->
->> during kernel boot.
->
->>
->
->> The qemu command I use is as follows:
->
->>
->
->> qemu-system-aarch64 -machine
->
->> virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 \
->
->> -kernel Image -initrd minifs.cpio.gz \
->
->> -enable-kvm -net none -nographic -m 3G -smp 6 -cpu host \
->
->> -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 maxcpus=3' \
->
->> -device
->
->> pcie-root-port,port=0x8,chassis=0,id=pci.0,bus=pcie.0,multifunction=on,addr=0x2
->
->>  \
->
->> -device pcie-root-port,port=0x9,chassis=1,id=pci.1,bus=pcie.0,addr=0x2.0x1
->
->> \
->
->> -device
->
->> virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=8,packed=on,bus=pci.1 \
->
->> -drive file=/home/boot.img,if=none,id=drive0,format=raw
->
->>
->
->> smmuv3 event 0x10 log:
->
->> [...]
->
->> [    1.962656] virtio-pci 0000:02:00.0: Adding to iommu group 0
->
->> [    1.963150] virtio-pci 0000:02:00.0: enabling device (0000 -> 0002)
->
->> [    1.964707] virtio_blk virtio0: 6/0/0 default/read/poll queues
->
->> [    1.965759] virtio_blk virtio0: [vda] 2097152 512-byte logical blocks
->
->> (1.07 GB/1.00 GiB)
->
->> [    1.966934] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.967442] input: gpio-keys as /devices/platform/gpio-keys/input/input0
->
->> [    1.967478] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.968381] clk: Disabling unused clocks
->
->> [    1.968677] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.968990] PM: genpd: Disabling unused power domains
->
->> [    1.969424] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.969814] ALSA device list:
->
->> [    1.970240] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.970471]   No soundcards found.
->
->> [    1.970902] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971600] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.971601] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971602] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.971606] arm-smmu-v3 9050000.smmuv3: event 0x10 received:
->
->> [    1.971607] arm-smmu-v3 9050000.smmuv3:      0x0000020000000010
->
->> [    1.974202] arm-smmu-v3 9050000.smmuv3:      0x0000020000000000
->
->> [    1.974634] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975005] Freeing unused kernel memory: 10112K
->
->> [    1.975062] arm-smmu-v3 9050000.smmuv3:      0x0000000000000000
->
->> [    1.975442] Run init as init process
->
->>
->
->> One more data point: if "maxcpus=3" is removed from the kernel
->
->> command line,
->
->> it will be OK.
->
->>
->
->
->
-> That's interesting, not sure how that would be related.
->
->
->
->> I am not sure whether this is a bug in the vSMMU. It would be much appreciated
->
->> if anyone
->
->> knows about this issue or can take a look at it.
->
->>
->
->
->
-> Can you please provide logs with "-d trace:smmu*" added to the qemu invocation?
->
->
-Sure. Please see the attached log (using the above qemu commit and command).
->
-Thanks a lot. It seems the SMMUv3 indeed receives a translation
-request with addr 0x0, which causes this event.
-I don't see any kind of modification (alignment) of the address in this path.
-So my hunch is that it's not related to the SMMUv3 and that the initiator is
-issuing bogus addresses.
-
->
->
->
-> Also if possible, can you please provide which Linux kernel version
->
-> you are using, I will see if I can repro.
->
->
-I just use the latest mainline kernel (commit b831f83e40a2) with defconfig.
->
-I see. I can't repro in my setup, which has no "--enable-kvm" and uses
-"-cpu max" instead of host.
-I will try other options and see if I can repro.
-
-Thanks,
-Mostafa
->
-Thanks,
->
-Zhou
->
->
->
->
-> Thanks,
->
-> Mostafa
->
->
->
->> Thanks,
->
->> Zhou
->
->>
->
->>
->
->>
->
->
->
-> .
-
diff --git a/results/classifier/016/virtual/74466963 b/results/classifier/016/virtual/74466963
deleted file mode 100644
index a738c771..00000000
--- a/results/classifier/016/virtual/74466963
+++ /dev/null
@@ -1,1905 +0,0 @@
-TCG: 0.983
-virtual: 0.972
-hypervisor: 0.881
-vnc: 0.807
-debug: 0.545
-x86: 0.169
-operating system: 0.077
-network: 0.046
-socket: 0.038
-boot: 0.036
-register: 0.035
-device: 0.027
-PID: 0.016
-files: 0.014
-VMM: 0.013
-user-level: 0.006
-assembly: 0.006
-kernel: 0.006
-ppc: 0.006
-semantic: 0.006
-performance: 0.005
-architecture: 0.004
-KVM: 0.003
-risc-v: 0.002
-peripherals: 0.002
-permissions: 0.002
-alpha: 0.002
-graphic: 0.001
-arm: 0.001
-mistranslation: 0.000
-i386: 0.000
-
-[Qemu-devel] [TCG only][Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration
-
-Hi all,
-
-Does anybody remember the similar issue posted by hailiang months ago?
-http://patchwork.ozlabs.org/patch/454322/
-At least two migration bugs have been fixed since then.
-Now we have found the same issue with a TCG VM (KVM is fine): after
-migration, the content of the VM's memory is inconsistent.
-We added a patch to check the memory content; you can find it appended below.
-
-Steps to reproduce:
-1) apply the patch and re-build qemu
-2) prepare the ubuntu guest and run memtest in grub.
-source side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off
-destination side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
-3) start migration
-with 1000M NIC, migration will finish within 3 min.
-
-at source:
-(qemu) migrate tcp:192.168.2.66:8881
-after saving ram complete
-e9e725df678d392b1a83b3a917f332bb
-qemu-system-x86_64: end ram md5
-(qemu)
-
-at destination:
-...skip...
-Completed load of VM with exit code 0 seq iteration 1264
-Completed load of VM with exit code 0 seq iteration 1265
-Completed load of VM with exit code 0 seq iteration 1266
-qemu-system-x86_64: after loading state section id 2(ram)
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
-
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-
-This occurs occasionally and only on a TCG machine. It seems that
-some pages dirtied on the source side aren't transferred to the destination.
-This problem can be reproduced even if we disable virtio.
-Is it OK for some pages not to be transferred to the destination during
-migration? Or is it a bug?
-Any idea...
-
-=================md5 check patch=============================
-
-diff --git a/Makefile.target b/Makefile.target
-index 962d004..e2cb8e9 100644
---- a/Makefile.target
-+++ b/Makefile.target
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
- obj-y += memory_mapping.o
- obj-y += dump.o
- obj-y += migration/ram.o migration/savevm.o
--LIBS := $(libs_softmmu) $(LIBS)
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
-
- # xen support
- obj-$(CONFIG_XEN) += xen-common.o
-diff --git a/migration/ram.c b/migration/ram.c
-index 1eb155a..3b7a09d 100644
---- a/migration/ram.c
-+++ b/migration/ram.c
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
-version_id)
-}
-
-     rcu_read_unlock();
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
-+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
-             "%" PRIu64 "\n", ret, seq_iter);
-     return ret;
- }
-diff --git a/migration/savevm.c b/migration/savevm.c
-index 0ad1b93..3feaa61 100644
---- a/migration/savevm.c
-+++ b/migration/savevm.c
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
-
- }
-
-+#include "exec/ram_addr.h"
-+#include "qemu/rcu_queue.h"
-+#include <clplumbing/md5.h>
-+#ifndef MD5_DIGEST_LENGTH
-+#define MD5_DIGEST_LENGTH 16
-+#endif
-+
-+static void check_host_md5(void)
-+{
-+    int i;
-+    unsigned char md[MD5_DIGEST_LENGTH];
-+    rcu_read_lock();
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
-'pc.ram' block */
-+    rcu_read_unlock();
-+
-+    MD5(block->host, block->used_length, md);
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
-+        fprintf(stderr, "%02x", md[i]);
-+    }
-+    fprintf(stderr, "\n");
-+    error_report("end ram md5");
-+}
-+
- void qemu_savevm_state_begin(QEMUFile *f,
-                              const MigrationParams *params)
- {
-@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile
-*f, bool iterable_only)
-save_section_header(f, se, QEMU_VM_SECTION_END);
-
-         ret = se->ops->save_live_complete_precopy(f, se->opaque);
-+
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
-+        check_host_md5();
-+
-         trace_savevm_section_end(se->idstr, se->section_id, ret);
-         save_section_footer(f, se);
-         if (ret < 0) {
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
-MigrationIncomingState *mis)
-section_id, le->se->idstr);
-                 return ret;
-             }
-+            if (section_type == QEMU_VM_SECTION_END) {
-+                error_report("after loading state section id %d(%s)",
-+                             section_id, le->se->idstr);
-+                check_host_md5();
-+            }
-             if (!check_section_footer(f, le)) {
-                 return -EINVAL;
-             }
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
-     }
-
-     cpu_synchronize_all_post_init();
-+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
-+    check_host_md5();
-
-     return ret;
- }
-
-* Li Zhijian (address@hidden) wrote:
->
-Hi all,
->
->
-Does anybody remember the similar issue posted by hailiang months ago?
->
-http://patchwork.ozlabs.org/patch/454322/
->
-At least two migration bugs have been fixed since then.
-Yes, I wondered what happened to that.
-
->
-Now we have found the same issue with a TCG VM (KVM is fine): after migration,
->
-the content of the VM's memory is inconsistent.
-Hmm, TCG only - I don't know much about that; but I guess something must
-be accessing memory without using the proper macros/functions so
-it doesn't mark it as dirty.
-
->
-We added a patch to check the memory content; you can find it appended below.
->
->
-Steps to reproduce:
->
-1) apply the patch and re-build qemu
->
-2) prepare the ubuntu guest and run memtest in grub.
->
-source side:
->
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
-pc-i440fx-2.3,accel=tcg,usb=off
->
->
-destination side:
->
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
->
->
-3) start migration
->
-with 1000M NIC, migration will finish within 3 min.
->
->
-at source:
->
-(qemu) migrate tcp:192.168.2.66:8881
->
-after saving ram complete
->
-e9e725df678d392b1a83b3a917f332bb
->
-qemu-system-x86_64: end ram md5
->
-(qemu)
->
->
-at destination:
->
-...skip...
->
-Completed load of VM with exit code 0 seq iteration 1264
->
-Completed load of VM with exit code 0 seq iteration 1265
->
-Completed load of VM with exit code 0 seq iteration 1266
->
-qemu-system-x86_64: after loading state section id 2(ram)
->
-49c2dac7bde0e5e22db7280dcb3824f9
->
-qemu-system-x86_64: end ram md5
->
-qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
->
->
-49c2dac7bde0e5e22db7280dcb3824f9
->
-qemu-system-x86_64: end ram md5
->
->
-This occurs occasionally and only on a TCG machine. It seems that
->
-some pages dirtied on the source side aren't transferred to the destination.
->
-This problem can be reproduced even if we disable virtio.
->
->
-Is it OK for some pages not to be transferred to the destination during
->
-migration? Or is it a bug?
-I'm pretty sure that means it's a bug.  Hard to find though, I guess
-at least memtest is smaller than a big OS.  I think I'd dump the whole
-of memory on both sides, hexdump and diff them  - I'd guess it would
-just be one byte/word different, maybe that would offer some idea what
-wrote it.
-
-Dave
-
->
-Any idea...
->
->
-=================md5 check patch=============================
->
->
-diff --git a/Makefile.target b/Makefile.target
->
-index 962d004..e2cb8e9 100644
->
---- a/Makefile.target
->
-+++ b/Makefile.target
->
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
->
-obj-y += memory_mapping.o
->
-obj-y += dump.o
->
-obj-y += migration/ram.o migration/savevm.o
->
--LIBS := $(libs_softmmu) $(LIBS)
->
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
->
->
-# xen support
->
-obj-$(CONFIG_XEN) += xen-common.o
->
-diff --git a/migration/ram.c b/migration/ram.c
->
-index 1eb155a..3b7a09d 100644
->
---- a/migration/ram.c
->
-+++ b/migration/ram.c
->
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
->
-version_id)
->
-}
->
->
-rcu_read_unlock();
->
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
->
-+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
->
-"%" PRIu64 "\n", ret, seq_iter);
->
-return ret;
->
-}
->
-diff --git a/migration/savevm.c b/migration/savevm.c
->
-index 0ad1b93..3feaa61 100644
->
---- a/migration/savevm.c
->
-+++ b/migration/savevm.c
->
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
->
->
-}
->
->
-+#include "exec/ram_addr.h"
->
-+#include "qemu/rcu_queue.h"
->
-+#include <clplumbing/md5.h>
->
-+#ifndef MD5_DIGEST_LENGTH
->
-+#define MD5_DIGEST_LENGTH 16
->
-+#endif
->
-+
->
-+static void check_host_md5(void)
->
-+{
->
-+    int i;
->
-+    unsigned char md[MD5_DIGEST_LENGTH];
->
-+    rcu_read_lock();
->
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
->
-'pc.ram' block */
->
-+    rcu_read_unlock();
->
-+
->
-+    MD5(block->host, block->used_length, md);
->
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
->
-+        fprintf(stderr, "%02x", md[i]);
->
-+    }
->
-+    fprintf(stderr, "\n");
->
-+    error_report("end ram md5");
->
-+}
->
-+
->
-void qemu_savevm_state_begin(QEMUFile *f,
->
-const MigrationParams *params)
->
-{
->
-@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f,
->
-bool iterable_only)
->
-save_section_header(f, se, QEMU_VM_SECTION_END);
->
->
-ret = se->ops->save_live_complete_precopy(f, se->opaque);
->
-+
->
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
->
-+        check_host_md5();
->
-+
->
-trace_savevm_section_end(se->idstr, se->section_id, ret);
->
-save_section_footer(f, se);
->
-if (ret < 0) {
->
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
->
-MigrationIncomingState *mis)
->
-section_id, le->se->idstr);
->
-return ret;
->
-}
->
-+            if (section_type == QEMU_VM_SECTION_END) {
->
-+                error_report("after loading state section id %d(%s)",
->
-+                             section_id, le->se->idstr);
->
-+                check_host_md5();
->
-+            }
->
-if (!check_section_footer(f, le)) {
->
-return -EINVAL;
->
-}
->
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
->
-}
->
->
-cpu_synchronize_all_post_init();
->
-+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
->
-+    check_host_md5();
->
->
-return ret;
->
-}
->
->
->
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-On 2015/12/3 17:24, Dr. David Alan Gilbert wrote:
-* Li Zhijian (address@hidden) wrote:
-Hi all,
-
-Does anybody remember the similar issue posted by hailiang months ago?
-http://patchwork.ozlabs.org/patch/454322/
-At least two migration bugs have been fixed since then.
-Yes, I wondered what happened to that.
-Now we have found the same issue with a TCG VM (KVM is fine): after migration,
-the content of the VM's memory is inconsistent.
-Hmm, TCG only - I don't know much about that; but I guess something must
-be accessing memory without using the proper macros/functions so
-it doesn't mark it as dirty.
-We added a patch to check the memory content; you can find it appended below.
-
-Steps to reproduce:
-1) apply the patch and re-build qemu
-2) prepare the ubuntu guest and run memtest in grub.
-source side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off
-
-destination side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
-
-3) start migration
-with 1000M NIC, migration will finish within 3 min.
-
-at source:
-(qemu) migrate tcp:192.168.2.66:8881
-after saving ram complete
-e9e725df678d392b1a83b3a917f332bb
-qemu-system-x86_64: end ram md5
-(qemu)
-
-at destination:
-...skip...
-Completed load of VM with exit code 0 seq iteration 1264
-Completed load of VM with exit code 0 seq iteration 1265
-Completed load of VM with exit code 0 seq iteration 1266
-qemu-system-x86_64: after loading state section id 2(ram)
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
-
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-
-This occurs occasionally and only on a TCG machine. It seems that
-some pages dirtied on the source side aren't transferred to the destination.
-This problem can be reproduced even if we disable virtio.
-
-Is it OK for some pages not to be transferred to the destination during
-migration? Or is it a bug?
-I'm pretty sure that means it's a bug.  Hard to find though, I guess
-at least memtest is smaller than a big OS.  I think I'd dump the whole
-of memory on both sides, hexdump and diff them  - I'd guess it would
-just be one byte/word different, maybe that would offer some idea what
-wrote it.
-Maybe a better way to do that is with the help of userfaultfd's write-protect
-capability. It is still in development by Andrea Arcangeli, but there
-is an RFC version available; please refer to
-http://www.spinics.net/lists/linux-mm/msg97422.html
-(I'm developing a live memory snapshot feature based on it; maybe this is another
-scenario where we
-can use userfaultfd's WP ;) ).
-Dave
-Any idea...
-
-=================md5 check patch=============================
-
-diff --git a/Makefile.target b/Makefile.target
-index 962d004..e2cb8e9 100644
---- a/Makefile.target
-+++ b/Makefile.target
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
-  obj-y += memory_mapping.o
-  obj-y += dump.o
-  obj-y += migration/ram.o migration/savevm.o
--LIBS := $(libs_softmmu) $(LIBS)
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
-
-  # xen support
-  obj-$(CONFIG_XEN) += xen-common.o
-diff --git a/migration/ram.c b/migration/ram.c
-index 1eb155a..3b7a09d 100644
---- a/migration/ram.c
-+++ b/migration/ram.c
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
-version_id)
-      }
-
-      rcu_read_unlock();
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
-+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
-              "%" PRIu64 "\n", ret, seq_iter);
-      return ret;
-  }
-diff --git a/migration/savevm.c b/migration/savevm.c
-index 0ad1b93..3feaa61 100644
---- a/migration/savevm.c
-+++ b/migration/savevm.c
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
-
-  }
-
-+#include "exec/ram_addr.h"
-+#include "qemu/rcu_queue.h"
-+#include <clplumbing/md5.h>
-+#ifndef MD5_DIGEST_LENGTH
-+#define MD5_DIGEST_LENGTH 16
-+#endif
-+
-+static void check_host_md5(void)
-+{
-+    int i;
-+    unsigned char md[MD5_DIGEST_LENGTH];
-+    rcu_read_lock();
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
-'pc.ram' block */
-+    rcu_read_unlock();
-+
-+    MD5(block->host, block->used_length, md);
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
-+        fprintf(stderr, "%02x", md[i]);
-+    }
-+    fprintf(stderr, "\n");
-+    error_report("end ram md5");
-+}
-+
-  void qemu_savevm_state_begin(QEMUFile *f,
-                               const MigrationParams *params)
-  {
-@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f,
-bool iterable_only)
-          save_section_header(f, se, QEMU_VM_SECTION_END);
-
-          ret = se->ops->save_live_complete_precopy(f, se->opaque);
-+
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
-+        check_host_md5();
-+
-          trace_savevm_section_end(se->idstr, se->section_id, ret);
-          save_section_footer(f, se);
-          if (ret < 0) {
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
-MigrationIncomingState *mis)
-                               section_id, le->se->idstr);
-                  return ret;
-              }
-+            if (section_type == QEMU_VM_SECTION_END) {
-+                error_report("after loading state section id %d(%s)",
-+                             section_id, le->se->idstr);
-+                check_host_md5();
-+            }
-              if (!check_section_footer(f, le)) {
-                  return -EINVAL;
-              }
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
-      }
-
-      cpu_synchronize_all_post_init();
-+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
-+    check_host_md5();
-
-      return ret;
-  }
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-.
-
-On 12/03/2015 05:37 PM, Hailiang Zhang wrote:
-On 2015/12/3 17:24, Dr. David Alan Gilbert wrote:
-* Li Zhijian (address@hidden) wrote:
-Hi all,
-
-Does anybody remember the similar issue posted by hailiang months ago?
-http://patchwork.ozlabs.org/patch/454322/
-At least two migration bugs have been fixed since then.
-Yes, I wondered what happened to that.
-Now we have found the same issue with a TCG VM (KVM is fine): after
-migration,
-the content of the VM's memory is inconsistent.
-Hmm, TCG only - I don't know much about that; but I guess something must
-be accessing memory without using the proper macros/functions so
-it doesn't mark it as dirty.
-We added a patch to check the memory content; you can find it appended below.
-
-Steps to reproduce:
-1) apply the patch and re-build qemu
-2) prepare the ubuntu guest and run memtest in grub.
-source side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off
-
-destination side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
-
-3) start migration
-with 1000M NIC, migration will finish within 3 min.
-
-at source:
-(qemu) migrate tcp:192.168.2.66:8881
-after saving ram complete
-e9e725df678d392b1a83b3a917f332bb
-qemu-system-x86_64: end ram md5
-(qemu)
-
-at destination:
-...skip...
-Completed load of VM with exit code 0 seq iteration 1264
-Completed load of VM with exit code 0 seq iteration 1265
-Completed load of VM with exit code 0 seq iteration 1266
-qemu-system-x86_64: after loading state section id 2(ram)
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-qemu-system-x86_64: qemu_loadvm_state: after
-cpu_synchronize_all_post_init
-
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-
-This occurs occasionally and only on a TCG machine. It seems that
-some pages dirtied on the source side aren't transferred to the destination.
-This problem can be reproduced even if we disable virtio.
-
-Is it OK for some pages not to be transferred to the destination during
-migration? Or is it a bug?
-I'm pretty sure that means it's a bug.  Hard to find though, I guess
-at least memtest is smaller than a big OS.  I think I'd dump the whole
-of memory on both sides, hexdump and diff them  - I'd guess it would
-just be one byte/word different, maybe that would offer some idea what
-wrote it.
-Maybe a better way to do that is with the help of userfaultfd's
-write-protect
-capability. It is still in development by Andrea Arcangeli, but there
-is an RFC version available; please refer to
-http://www.spinics.net/lists/linux-mm/msg97422.html
-(I'm developing a live memory snapshot feature based on it; maybe this is
-another scenario where we
-can use userfaultfd's WP ;) ).
-sounds good.
-
-thanks
-Li
-Dave
-Any idea...
-
-=================md5 check patch=============================
-
-diff --git a/Makefile.target b/Makefile.target
-index 962d004..e2cb8e9 100644
---- a/Makefile.target
-+++ b/Makefile.target
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
-  obj-y += memory_mapping.o
-  obj-y += dump.o
-  obj-y += migration/ram.o migration/savevm.o
--LIBS := $(libs_softmmu) $(LIBS)
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
-
-  # xen support
-  obj-$(CONFIG_XEN) += xen-common.o
-diff --git a/migration/ram.c b/migration/ram.c
-index 1eb155a..3b7a09d 100644
---- a/migration/ram.c
-+++ b/migration/ram.c
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
-version_id)
-      }
-
-      rcu_read_unlock();
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
-+    fprintf(stderr, "Completed load of VM with exit code %d seq
-iteration "
-              "%" PRIu64 "\n", ret, seq_iter);
-      return ret;
-  }
-diff --git a/migration/savevm.c b/migration/savevm.c
-index 0ad1b93..3feaa61 100644
---- a/migration/savevm.c
-+++ b/migration/savevm.c
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
-
-  }
-
-+#include "exec/ram_addr.h"
-+#include "qemu/rcu_queue.h"
-+#include <clplumbing/md5.h>
-+#ifndef MD5_DIGEST_LENGTH
-+#define MD5_DIGEST_LENGTH 16
-+#endif
-+
-+static void check_host_md5(void)
-+{
-+    int i;
-+    unsigned char md[MD5_DIGEST_LENGTH];
-+    rcu_read_lock();
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
-'pc.ram' block */
-+    rcu_read_unlock();
-+
-+    MD5(block->host, block->used_length, md);
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
-+        fprintf(stderr, "%02x", md[i]);
-+    }
-+    fprintf(stderr, "\n");
-+    error_report("end ram md5");
-+}
-+
-  void qemu_savevm_state_begin(QEMUFile *f,
-                               const MigrationParams *params)
-  {
-@@ -1056,6 +1079,10 @@ void
-qemu_savevm_state_complete_precopy(QEMUFile *f,
-bool iterable_only)
-          save_section_header(f, se, QEMU_VM_SECTION_END);
-
-          ret = se->ops->save_live_complete_precopy(f, se->opaque);
-+
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
-+        check_host_md5();
-+
-          trace_savevm_section_end(se->idstr, se->section_id, ret);
-          save_section_footer(f, se);
-          if (ret < 0) {
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
-MigrationIncomingState *mis)
-                               section_id, le->se->idstr);
-                  return ret;
-              }
-+            if (section_type == QEMU_VM_SECTION_END) {
-+                error_report("after loading state section id %d(%s)",
-+                             section_id, le->se->idstr);
-+                check_host_md5();
-+            }
-              if (!check_section_footer(f, le)) {
-                  return -EINVAL;
-              }
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
-      }
-
-      cpu_synchronize_all_post_init();
-+    error_report("%s: after cpu_synchronize_all_post_init\n",
-__func__);
-+    check_host_md5();
-
-      return ret;
-  }
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-.
-.
---
-Best regards.
-Li Zhijian (8555)
-
-On 12/03/2015 05:24 PM, Dr. David Alan Gilbert wrote:
-* Li Zhijian (address@hidden) wrote:
-Hi all,
-
-Does anybody remember the similar issue posted by hailiang months ago?
-http://patchwork.ozlabs.org/patch/454322/
-At least two migration bugs have been fixed since then.
-Yes, I wondered what happened to that.
-Now we have found the same issue with a TCG VM (KVM is fine): after migration,
-the content of the VM's memory is inconsistent.
-Hmm, TCG only - I don't know much about that; but I guess something must
-be accessing memory without using the proper macros/functions so
-it doesn't mark it as dirty.
-We added a patch to check the memory content; you can find it appended below.
-
-Steps to reproduce:
-1) apply the patch and re-build qemu
-2) prepare the ubuntu guest and run memtest in grub.
-source side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off
-
-destination side:
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
-
-3) Start migration.
-With a 1000 Mbit/s NIC, migration finishes within 3 minutes.
-
-at source:
-(qemu) migrate tcp:192.168.2.66:8881
-after saving ram complete
-e9e725df678d392b1a83b3a917f332bb
-qemu-system-x86_64: end ram md5
-(qemu)
-
-at destination:
-...skip...
-Completed load of VM with exit code 0 seq iteration 1264
-Completed load of VM with exit code 0 seq iteration 1265
-Completed load of VM with exit code 0 seq iteration 1266
-qemu-system-x86_64: after loading state section id 2(ram)
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
-
-49c2dac7bde0e5e22db7280dcb3824f9
-qemu-system-x86_64: end ram md5
-
-This occurs occasionally and only on a TCG machine. It seems that
-some pages dirtied on the source side are not transferred to the destination.
-This problem can be reproduced even if we disable virtio.
-
-Is it OK for some pages not to be transferred to the destination during
-migration? Or is it a bug?
-I'm pretty sure that means it's a bug.  Hard to find though, I guess
-at least memtest is smaller than a big OS.  I think I'd dump the whole
-of memory on both sides, hexdump and diff them  - I'd guess it would
-just be one byte/word different, maybe that would offer some idea what
-wrote it.
-I tried dumping and comparing them; more than 10 pages are different.
-On the source side they hold random values, whereas on the destination they
-always hold 'FF' 'FB' 'EF' 'BF'...
-Not all of the differing pages are contiguous.
-
-thanks
-Li
-Dave
-Any idea...
-
-=================md5 check patch=============================
-
-diff --git a/Makefile.target b/Makefile.target
-index 962d004..e2cb8e9 100644
---- a/Makefile.target
-+++ b/Makefile.target
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
-  obj-y += memory_mapping.o
-  obj-y += dump.o
-  obj-y += migration/ram.o migration/savevm.o
--LIBS := $(libs_softmmu) $(LIBS)
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
-
-  # xen support
-  obj-$(CONFIG_XEN) += xen-common.o
-diff --git a/migration/ram.c b/migration/ram.c
-index 1eb155a..3b7a09d 100644
---- a/migration/ram.c
-+++ b/migration/ram.c
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
-version_id)
-      }
-
-      rcu_read_unlock();
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
-+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
-              "%" PRIu64 "\n", ret, seq_iter);
-      return ret;
-  }
-diff --git a/migration/savevm.c b/migration/savevm.c
-index 0ad1b93..3feaa61 100644
---- a/migration/savevm.c
-+++ b/migration/savevm.c
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
-
-  }
-
-+#include "exec/ram_addr.h"
-+#include "qemu/rcu_queue.h"
-+#include <clplumbing/md5.h>
-+#ifndef MD5_DIGEST_LENGTH
-+#define MD5_DIGEST_LENGTH 16
-+#endif
-+
-+static void check_host_md5(void)
-+{
-+    int i;
-+    unsigned char md[MD5_DIGEST_LENGTH];
-+    rcu_read_lock();
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
-'pc.ram' block */
-+    rcu_read_unlock();
-+
-+    MD5(block->host, block->used_length, md);
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
-+        fprintf(stderr, "%02x", md[i]);
-+    }
-+    fprintf(stderr, "\n");
-+    error_report("end ram md5");
-+}
-+
-  void qemu_savevm_state_begin(QEMUFile *f,
-                               const MigrationParams *params)
-  {
-@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f,
-bool iterable_only)
-          save_section_header(f, se, QEMU_VM_SECTION_END);
-
-          ret = se->ops->save_live_complete_precopy(f, se->opaque);
-+
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
-+        check_host_md5();
-+
-          trace_savevm_section_end(se->idstr, se->section_id, ret);
-          save_section_footer(f, se);
-          if (ret < 0) {
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
-MigrationIncomingState *mis)
-                               section_id, le->se->idstr);
-                  return ret;
-              }
-+            if (section_type == QEMU_VM_SECTION_END) {
-+                error_report("after loading state section id %d(%s)",
-+                             section_id, le->se->idstr);
-+                check_host_md5();
-+            }
-              if (!check_section_footer(f, le)) {
-                  return -EINVAL;
-              }
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
-      }
-
-      cpu_synchronize_all_post_init();
-+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
-+    check_host_md5();
-
-      return ret;
-  }
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-
-.
---
-Best regards.
-Li Zhijian (8555)
-
-* Li Zhijian (address@hidden) wrote:
->
->
->
-On 12/03/2015 05:24 PM, Dr. David Alan Gilbert wrote:
->
->* Li Zhijian (address@hidden) wrote:
->
->>Hi all,
->
->>
->
->>Does anyboday remember the similar issue post by hailiang months ago
->
->>
-http://patchwork.ozlabs.org/patch/454322/
->
->>At least tow bugs about migration had been fixed since that.
->
->
->
->Yes, I wondered what happened to that.
->
->
->
->>And now we found the same issue at the tcg vm(kvm is fine), after migration,
->
->>the content VM's memory is inconsistent.
->
->
->
->Hmm, TCG only - I don't know much about that; but I guess something must
->
->be accessing memory without using the proper macros/functions so
->
->it doesn't mark it as dirty.
->
->
->
->>we add a patch to check memory content, you can find it from affix
->
->>
->
->>steps to reporduce:
->
->>1) apply the patch and re-build qemu
->
->>2) prepare the ubuntu guest and run memtest in grub.
->
->>soruce side:
->
->>x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
->>e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
->>if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
->>virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
->>-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
->>tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
->>pc-i440fx-2.3,accel=tcg,usb=off
->
->>
->
->>destination side:
->
->>x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
->>e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
->>if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
->>virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
->>-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
->>tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
->>pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
->
->>
->
->>3) start migration
->
->>with 1000M NIC, migration will finish within 3 min.
->
->>
->
->>at source:
->
->>(qemu) migrate tcp:192.168.2.66:8881
->
->>after saving ram complete
->
->>e9e725df678d392b1a83b3a917f332bb
->
->>qemu-system-x86_64: end ram md5
->
->>(qemu)
->
->>
->
->>at destination:
->
->>...skip...
->
->>Completed load of VM with exit code 0 seq iteration 1264
->
->>Completed load of VM with exit code 0 seq iteration 1265
->
->>Completed load of VM with exit code 0 seq iteration 1266
->
->>qemu-system-x86_64: after loading state section id 2(ram)
->
->>49c2dac7bde0e5e22db7280dcb3824f9
->
->>qemu-system-x86_64: end ram md5
->
->>qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
->
->>
->
->>49c2dac7bde0e5e22db7280dcb3824f9
->
->>qemu-system-x86_64: end ram md5
->
->>
->
->>This occurs occasionally and only at tcg machine. It seems that
->
->>some pages dirtied in source side don't transferred to destination.
->
->>This problem can be reproduced even if we disable virtio.
->
->>
->
->>Is it OK for some pages that not transferred to destination when do
->
->>migration ? Or is it a bug?
->
->
->
->I'm pretty sure that means it's a bug.  Hard to find though, I guess
->
->at least memtest is smaller than a big OS.  I think I'd dump the whole
->
->of memory on both sides, hexdump and diff them  - I'd guess it would
->
->just be one byte/word different, maybe that would offer some idea what
->
->wrote it.
->
->
-I try to dump and compare them, more than 10 pages are different.
->
-in source side, they are random value rather than always 'FF' 'FB' 'EF'
->
-'BF'... in destination.
->
->
-and not all of the different pages are continuous.
-I wonder whether it happens with all of memtest's test patterns;
-it might be possible to narrow it down if you tell memtest
-to run only one test at a time.
-
-Dave
-
->
->
-thanks
->
-Li
->
->
->
->
->
->Dave
->
->
->
->>Any idea...
->
->>
->
->>=================md5 check patch=============================
->
->>
->
->>diff --git a/Makefile.target b/Makefile.target
->
->>index 962d004..e2cb8e9 100644
->
->>--- a/Makefile.target
->
->>+++ b/Makefile.target
->
->>@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
->
->>  obj-y += memory_mapping.o
->
->>  obj-y += dump.o
->
->>  obj-y += migration/ram.o migration/savevm.o
->
->>-LIBS := $(libs_softmmu) $(LIBS)
->
->>+LIBS := $(libs_softmmu) $(LIBS) -lplumb
->
->>
->
->>  # xen support
->
->>  obj-$(CONFIG_XEN) += xen-common.o
->
->>diff --git a/migration/ram.c b/migration/ram.c
->
->>index 1eb155a..3b7a09d 100644
->
->>--- a/migration/ram.c
->
->>+++ b/migration/ram.c
->
->>@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int
->
->>version_id)
->
->>      }
->
->>
->
->>      rcu_read_unlock();
->
->>-    DPRINTF("Completed load of VM with exit code %d seq iteration "
->
->>+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
->
->>              "%" PRIu64 "\n", ret, seq_iter);
->
->>      return ret;
->
->>  }
->
->>diff --git a/migration/savevm.c b/migration/savevm.c
->
->>index 0ad1b93..3feaa61 100644
->
->>--- a/migration/savevm.c
->
->>+++ b/migration/savevm.c
->
->>@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
->
->>
->
->>  }
->
->>
->
->>+#include "exec/ram_addr.h"
->
->>+#include "qemu/rcu_queue.h"
->
->>+#include <clplumbing/md5.h>
->
->>+#ifndef MD5_DIGEST_LENGTH
->
->>+#define MD5_DIGEST_LENGTH 16
->
->>+#endif
->
->>+
->
->>+static void check_host_md5(void)
->
->>+{
->
->>+    int i;
->
->>+    unsigned char md[MD5_DIGEST_LENGTH];
->
->>+    rcu_read_lock();
->
->>+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
->
->>'pc.ram' block */
->
->>+    rcu_read_unlock();
->
->>+
->
->>+    MD5(block->host, block->used_length, md);
->
->>+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
->
->>+        fprintf(stderr, "%02x", md[i]);
->
->>+    }
->
->>+    fprintf(stderr, "\n");
->
->>+    error_report("end ram md5");
->
->>+}
->
->>+
->
->>  void qemu_savevm_state_begin(QEMUFile *f,
->
->>                               const MigrationParams *params)
->
->>  {
->
->>@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f,
->
->>bool iterable_only)
->
->>          save_section_header(f, se, QEMU_VM_SECTION_END);
->
->>
->
->>          ret = se->ops->save_live_complete_precopy(f, se->opaque);
->
->>+
->
->>+        fprintf(stderr, "after saving %s complete\n", se->idstr);
->
->>+        check_host_md5();
->
->>+
->
->>          trace_savevm_section_end(se->idstr, se->section_id, ret);
->
->>          save_section_footer(f, se);
->
->>          if (ret < 0) {
->
->>@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
->
->>MigrationIncomingState *mis)
->
->>                               section_id, le->se->idstr);
->
->>                  return ret;
->
->>              }
->
->>+            if (section_type == QEMU_VM_SECTION_END) {
->
->>+                error_report("after loading state section id %d(%s)",
->
->>+                             section_id, le->se->idstr);
->
->>+                check_host_md5();
->
->>+            }
->
->>              if (!check_section_footer(f, le)) {
->
->>                  return -EINVAL;
->
->>              }
->
->>@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
->
->>      }
->
->>
->
->>      cpu_synchronize_all_post_init();
->
->>+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
->
->>+    check_host_md5();
->
->>
->
->>      return ret;
->
->>  }
->
->>
->
->>
->
->>
->
->--
->
->Dr. David Alan Gilbert / address@hidden / Manchester, UK
->
->
->
->
->
->.
->
->
->
->
---
->
-Best regards.
->
-Li Zhijian (8555)
->
->
---
-Dr. David Alan Gilbert / address@hidden / Manchester, UK
-
-Li Zhijian <address@hidden> wrote:
->
-Hi all,
->
->
-Does anyboday remember the similar issue post by hailiang months ago
->
-http://patchwork.ozlabs.org/patch/454322/
->
-At least tow bugs about migration had been fixed since that.
->
->
-And now we found the same issue at the tcg vm(kvm is fine), after
->
-migration, the content VM's memory is inconsistent.
->
->
-we add a patch to check memory content, you can find it from affix
->
->
-steps to reporduce:
->
-1) apply the patch and re-build qemu
->
-2) prepare the ubuntu guest and run memtest in grub.
->
-soruce side:
->
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
-pc-i440fx-2.3,accel=tcg,usb=off
->
->
-destination side:
->
-x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
->
-e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
->
-if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
->
-virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
->
--vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
->
-tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
->
-pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
->
->
-3) start migration
->
-with 1000M NIC, migration will finish within 3 min.
->
->
-at source:
->
-(qemu) migrate tcp:192.168.2.66:8881
->
-after saving ram complete
->
-e9e725df678d392b1a83b3a917f332bb
->
-qemu-system-x86_64: end ram md5
->
-(qemu)
->
->
-at destination:
->
-...skip...
->
-Completed load of VM with exit code 0 seq iteration 1264
->
-Completed load of VM with exit code 0 seq iteration 1265
->
-Completed load of VM with exit code 0 seq iteration 1266
->
-qemu-system-x86_64: after loading state section id 2(ram)
->
-49c2dac7bde0e5e22db7280dcb3824f9
->
-qemu-system-x86_64: end ram md5
->
-qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
->
->
-49c2dac7bde0e5e22db7280dcb3824f9
->
-qemu-system-x86_64: end ram md5
->
->
-This occurs occasionally and only at tcg machine. It seems that
->
-some pages dirtied in source side don't transferred to destination.
->
-This problem can be reproduced even if we disable virtio.
->
->
-Is it OK for some pages that not transferred to destination when do
->
-migration ? Or is it a bug?
->
->
-Any idea...
-Thanks for describing how to reproduce the bug.
-If some pages are not transferred to the destination then it is a bug, so we
-need to know what the problem is. Note that the problem could be that
-TCG is not marking some page dirty, that the migration code "forgets" about
-that page, or something else altogether; that is what we need to find.
-
-There are more possibilities: I am not sure that memtest runs in 32-bit
-mode, and it is possible that we are missing some state when we
-are in real mode.
-
-I will try to take a look at this.
-
-Thanks again.
-
-
->
->
-=================md5 check patch=============================
->
->
-diff --git a/Makefile.target b/Makefile.target
->
-index 962d004..e2cb8e9 100644
->
---- a/Makefile.target
->
-+++ b/Makefile.target
->
-@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
->
-obj-y += memory_mapping.o
->
-obj-y += dump.o
->
-obj-y += migration/ram.o migration/savevm.o
->
--LIBS := $(libs_softmmu) $(LIBS)
->
-+LIBS := $(libs_softmmu) $(LIBS) -lplumb
->
->
-# xen support
->
-obj-$(CONFIG_XEN) += xen-common.o
->
-diff --git a/migration/ram.c b/migration/ram.c
->
-index 1eb155a..3b7a09d 100644
->
---- a/migration/ram.c
->
-+++ b/migration/ram.c
->
-@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque,
->
-int version_id)
->
-}
->
->
-rcu_read_unlock();
->
--    DPRINTF("Completed load of VM with exit code %d seq iteration "
->
-+    fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
->
-"%" PRIu64 "\n", ret, seq_iter);
->
-return ret;
->
-}
->
-diff --git a/migration/savevm.c b/migration/savevm.c
->
-index 0ad1b93..3feaa61 100644
->
---- a/migration/savevm.c
->
-+++ b/migration/savevm.c
->
-@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
->
->
-}
->
->
-+#include "exec/ram_addr.h"
->
-+#include "qemu/rcu_queue.h"
->
-+#include <clplumbing/md5.h>
->
-+#ifndef MD5_DIGEST_LENGTH
->
-+#define MD5_DIGEST_LENGTH 16
->
-+#endif
->
-+
->
-+static void check_host_md5(void)
->
-+{
->
-+    int i;
->
-+    unsigned char md[MD5_DIGEST_LENGTH];
->
-+    rcu_read_lock();
->
-+    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check
->
-'pc.ram' block */
->
-+    rcu_read_unlock();
->
-+
->
-+    MD5(block->host, block->used_length, md);
->
-+    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
->
-+        fprintf(stderr, "%02x", md[i]);
->
-+    }
->
-+    fprintf(stderr, "\n");
->
-+    error_report("end ram md5");
->
-+}
->
-+
->
-void qemu_savevm_state_begin(QEMUFile *f,
->
-const MigrationParams *params)
->
-{
->
-@@ -1056,6 +1079,10 @@ void
->
-qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
->
-save_section_header(f, se, QEMU_VM_SECTION_END);
->
->
-ret = se->ops->save_live_complete_precopy(f, se->opaque);
->
-+
->
-+        fprintf(stderr, "after saving %s complete\n", se->idstr);
->
-+        check_host_md5();
->
-+
->
-trace_savevm_section_end(se->idstr, se->section_id, ret);
->
-save_section_footer(f, se);
->
-if (ret < 0) {
->
-@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f,
->
-MigrationIncomingState *mis)
->
-section_id, le->se->idstr);
->
-return ret;
->
-}
->
-+            if (section_type == QEMU_VM_SECTION_END) {
->
-+                error_report("after loading state section id %d(%s)",
->
-+                             section_id, le->se->idstr);
->
-+                check_host_md5();
->
-+            }
->
-if (!check_section_footer(f, le)) {
->
-return -EINVAL;
->
-}
->
-@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
->
-}
->
->
-cpu_synchronize_all_post_init();
->
-+    error_report("%s: after cpu_synchronize_all_post_init\n", __func__);
->
-+    check_host_md5();
->
->
-return ret;
->
-}
-
->
->
-Thanks for describing how to reproduce the bug.
->
-If some pages are not transferred to destination then it is a bug, so we need
->
-to know what the problem is, notice that the problem can be that TCG is not
->
-marking dirty some page, that Migration code "forgets" about that page, or
->
-anything eles altogether, that is what we need to find.
->
->
-There are more posibilities, I am not sure that memtest is on 32bit mode, and
->
-it is inside posibility that we are missing some state when we are on real
->
-mode.
->
->
-Will try to take a look at this.
->
->
-THanks, again.
->
-Hi Juan & Amit,
-
-Do you think we should add a mechanism to check data integrity during live
-migration (LM), as Zhijian's patch did? It may be very helpful for developers.
-Actually, I did a similar thing before to make sure that I did the
-right thing when I changed the code related to LM.
-
-Liang
-
-On (Fri) 04 Dec 2015 [01:43:07], Li, Liang Z wrote:
->
->
->
-> Thanks for describing how to reproduce the bug.
->
-> If some pages are not transferred to destination then it is a bug, so we
->
-> need
->
-> to know what the problem is, notice that the problem can be that TCG is not
->
-> marking dirty some page, that Migration code "forgets" about that page, or
->
-> anything eles altogether, that is what we need to find.
->
->
->
-> There are more posibilities, I am not sure that memtest is on 32bit mode,
->
-> and
->
-> it is inside posibility that we are missing some state when we are on real
->
-> mode.
->
->
->
-> Will try to take a look at this.
->
->
->
-> THanks, again.
->
->
->
->
-Hi Juan & Amit
->
->
-Do you think we should add a mechanism to check the data integrity during LM
->
-like Zhijian's patch did?  it may be very helpful for developers.
->
-Actually, I did the similar thing before in order to make sure that I did
->
-the right thing we I change the code related to LM.
-If you mean for debugging, something that's not always on, then I'm
-fine with it.
-
-A script that goes along with it and shows the result of the
-comparison would be helpful too: something that shows how many pages
-differ, how many bytes differ per page on average, and so on.
-
-                Amit
-
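The comparison Amit describes above can be sketched in C as follows. This is a minimal illustration only, not part of the thread or of QEMU: `compare_images`, `print_stats` and `struct diff_stats` are hypothetical names, and the page size is an assumption supplied by the caller (4096 bytes would match the x86 guest used here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Summary of a page-by-page comparison of two memory images. */
struct diff_stats {
    size_t pages_total;     /* pages examined */
    size_t pages_differing; /* pages with at least one differing byte */
    size_t bytes_differing; /* differing bytes summed over all pages */
};

/*
 * Compare two images of equal length page by page.
 * page_size is an assumption chosen by the caller, not read from the guest.
 */
static struct diff_stats compare_images(const unsigned char *src,
                                        const unsigned char *dst,
                                        size_t len, size_t page_size)
{
    struct diff_stats st = {0, 0, 0};

    assert(page_size > 0);
    for (size_t off = 0; off < len; off += page_size) {
        /* The last page may be shorter than page_size. */
        size_t n = (len - off < page_size) ? len - off : page_size;
        size_t bad = 0;

        for (size_t i = 0; i < n; i++) {
            bad += (src[off + i] != dst[off + i]);
        }
        st.pages_total++;
        if (bad > 0) {
            st.pages_differing++;
            st.bytes_differing += bad;
        }
    }
    return st;
}

/* Print how many pages differ and the average differing bytes per such page. */
static void print_stats(const struct diff_stats *st)
{
    double avg = st->pages_differing
                     ? (double)st->bytes_differing / (double)st->pages_differing
                     : 0.0;

    printf("%zu of %zu pages differ, %.1f differing bytes per page on average\n",
           st->pages_differing, st->pages_total, avg);
}
```

The two buffers would be filled from memory dumps taken on the source and destination hosts, for example files written with the monitor's `pmemsave` command; loading the files is left out of the sketch.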
diff --git a/results/classifier/016/virtual/79834768 b/results/classifier/016/virtual/79834768
deleted file mode 100644
index 16bb6142..00000000
--- a/results/classifier/016/virtual/79834768
+++ /dev/null
@@ -1,436 +0,0 @@
-virtual: 0.981
-KVM: 0.958
-debug: 0.901
-x86: 0.830
-operating system: 0.687
-hypervisor: 0.440
-kernel: 0.177
-register: 0.064
-performance: 0.042
-user-level: 0.026
-i386: 0.021
-assembly: 0.018
-files: 0.016
-VMM: 0.010
-PID: 0.010
-device: 0.009
-TCG: 0.008
-socket: 0.008
-semantic: 0.008
-architecture: 0.006
-vnc: 0.006
-risc-v: 0.006
-arm: 0.006
-peripherals: 0.004
-graphic: 0.003
-permissions: 0.003
-network: 0.003
-ppc: 0.002
-alpha: 0.002
-boot: 0.002
-mistranslation: 0.002
-
-[Qemu-devel] [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-Hi,
-
-We hit a bug in our tests while running PCMark 10 in a Windows 7 VM.
-The VM got stuck and the wall clock hung after several minutes of running
-PCMark 10 in it.
-It is quite easy to reproduce the bug with upstream KVM and QEMU.
-
-We found that KVM cannot inject any RTC IRQ into the VM after it hangs: it
-fails to deliver the IRQ in ioapic_set_irq() because the RTC IRQ is still
-pending in ioapic->irr.
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-                  int irq_level, bool line_status)
-{
-… …
-         if (!irq_level) {
-                  ioapic->irr &= ~mask;
-                  ret = 1;
-                  goto out;
-         }
-… …
-         if ((edge && old_irr == ioapic->irr) ||
-             (!edge && entry.fields.remote_irr)) {
-                  ret = 0;
-                  goto out;
-         }
-
-According to the RTC spec, after the RTC raises a high-level IRQ, the OS reads
-CMOS register C to clear the IRQ flag and pull the IRQ line low.
-
-In QEMU, we emulate the read in cmos_ioport_read(), but the guest OS first
-issues a write to select which register the following read will access; we
-record the selected register in s->cmos_index.
-
-But in our test, we found a situation in which a vCPU fails to read RTC_REG_C
-and so never clears the IRQ. This can happen when two vCPUs write and read the
-registers at the same time. For example, vcpu0 is trying to read RTC_REG_C,
-so it writes RTC_REG_C first, which sets s->cmos_index to RTC_REG_C. But
-before it reads register C, vcpu1 sets out to read RTC_YEAR and changes
-s->cmos_index to RTC_YEAR with its own write. The next read by vcpu0 then
-returns RTC_YEAR. In this case, we miss the call to qemu_irq_lower(s->irq)
-that clears the IRQ. After this, KVM never injects the RTC IRQ again,
-and the Windows VM hangs.
-
-static void cmos_ioport_write(void *opaque, hwaddr addr,
-                              uint64_t data, unsigned size)
-{
-    RTCState *s = opaque;
-
-    if ((addr & 1) == 0) {
-        s->cmos_index = data & 0x7f;
-    }
-……
-static uint64_t cmos_ioport_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    RTCState *s = opaque;
-    int ret;
-    if ((addr & 1) == 0) {
-        return 0xff;
-    } else {
-        switch(s->cmos_index) {
-
-According to the CMOS spec, ‘any write to PORT 0070h should be followed by an
-action to PORT 0071h or the RTC will be left in an unknown state’, but it seems
-that we cannot ensure this sequence in QEMU/KVM.
-
-Any ideas?
-
-Thanks,
-Hailiang
-
-Please see the kvm_pio trace:
-
-       CPU 1/KVM-15567 [003] .... 209311.762579: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762582: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x89
-       CPU 1/KVM-15567 [003] .... 209311.762590: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x17
-       CPU 0/KVM-15566 [005] .... 209311.762611: kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-       CPU 1/KVM-15567 [003] .... 209311.762615: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762619: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x88
-       CPU 1/KVM-15567 [003] .... 209311.762627: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x12
-       CPU 0/KVM-15566 [005] .... 209311.762632: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x12
-       CPU 1/KVM-15567 [003] .... 209311.762633: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 0/KVM-15566 [005] .... 209311.762634: kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc    <-- first write to 0x70: cmos_index = 0xc & 0x7f = 0xc
-       CPU 1/KVM-15567 [003] .... 209311.762636: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86   <-- second write to 0x70: cmos_index = 0x86 & 0x7f = 0x6, overwriting the index set by the first write
-       CPU 0/KVM-15566 [005] .... 209311.762641: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6     <-- vcpu0 reads 0x6 because cmos_index is now 0x6
-       CPU 1/KVM-15567 [003] .... 209311.762644: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6     <-- vcpu1 reads 0x6
-       CPU 1/KVM-15567 [003] .... 209311.762649: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762669: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x87
-       CPU 1/KVM-15567 [003] .... 209311.762678: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x1
-       CPU 1/KVM-15567 [003] .... 209311.762683: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762686: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x84
-       CPU 1/KVM-15567 [003] .... 209311.762693: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x10
-       CPU 1/KVM-15567 [003] .... 209311.762699: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762702: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x82
-       CPU 1/KVM-15567 [003] .... 209311.762709: kvm_pio: pio_read at 0x71 size 1 count 1 val 0x25
-       CPU 1/KVM-15567 [003] .... 209311.762714: kvm_pio: pio_read at 0x70 size 1 count 1 val 0xff
-       CPU 1/KVM-15567 [003] .... 209311.762717: kvm_pio: pio_write at 0x70 size 1 count 1 val 0x80
-
-
-Regards,
--Gonglei
-
-From: Zhanghailiang
-Sent: Friday, December 01, 2017 3:03 AM
-To: address@hidden; address@hidden; Paolo Bonzini
-Cc: Huangweidong (C); Gonglei (Arei); wangxin (U); Xiexiangyou
-Subject: [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-Hi,
-
-We hit a bug in our test while run PCMark 10 in a windows 7 VM,
-The VM got stuck and the wallclock was hang after several minutes running
-PCMark 10 in it.
-It is quite easily to reproduce the bug with the upstream KVM and Qemu.
-
-We found that KVM can not inject any RTC irq to VM after it was hang, it fails 
-to
-Deliver irq in ioapic_set_irq() because RTC irq is still pending in ioapic->irr.
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-                  int irq_level, bool line_status)
-{
-… …
-         if (!irq_level) {
-                  ioapic->irr &= ~mask;
-                  ret = 1;
-                  goto out;
-         }
-… …
-         if ((edge && old_irr == ioapic->irr) ||
-             (!edge && entry.fields.remote_irr)) {
-                  ret = 0;
-                  goto out;
-         }
-
-According to RTC spec, after RTC injects a High level irq, OS will read CMOS’s
-register C to to clear the irq flag, and pull down the irq electric pin.
-
-For Qemu, we will emulate the reading operation in cmos_ioport_read(),
-but Guest OS will fire a write operation before to tell which register will be 
-read
-after this write, where we use s->cmos_index to record the following register 
-to read.
-
-But in our test, we found that there is a possible situation that Vcpu fails to 
-read
-RTC_REG_C to clear irq, This could happens while two VCpus are writing/reading
-registers at the same time, for example, vcpu 0 is trying to read RTC_REG_C,
-so it write RTC_REG_C first, where the s->cmos_index will be RTC_REG_C,
-but before it tries to read register C, another vcpu1 is going to read RTC_YEAR,
-it changes s->cmos_index to RTC_YEAR by a writing action.
-The next operation of vcpu0 will be lead to read RTC_YEAR, In this case, we 
-will miss
-calling qemu_irq_lower(s->irq) to clear the irq. After this, kvm will never 
-inject RTC irq,
-and Windows VM will hang.
-static void cmos_ioport_write(void *opaque, hwaddr addr,
-                              uint64_t data, unsigned size)
-{
-    RTCState *s = opaque;
-
-    if ((addr & 1) == 0) {
-        s->cmos_index = data & 0x7f;
-    }
-……
-static uint64_t cmos_ioport_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    RTCState *s = opaque;
-    int ret;
-    if ((addr & 1) == 0) {
-        return 0xff;
-    } else {
-        switch(s->cmos_index) {
-
-According to CMOS spec, ‘any write to PROT 0070h should be followed by an 
-action to PROT 0071h or the RTC
-Will be RTC will be left in an unknown state’, but it seems that we can not 
-ensure this sequence in qemu/kvm.
-
-Any ideas ?
-
-Thanks,
-Hailiang
-
-On 01/12/2017 08:08, Gonglei (Arei) wrote:
-> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->        CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->        CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
-> vcpu0 read 0x6 because cmos_index is 0x6 now:
->        CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-> vcpu1 read 0x6:
->        CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-This seems to be a Windows bug.  The easiest workaround that I
-can think of is to clear the interrupts already when 0xc is written,
-without waiting for the read (because REG_C can only be read).
-
-What do you think?
-
-Thanks,
-
-Paolo
-
-I also think it's a Windows bug; the problem is that it doesn't occur on the
-Xen platform. Also, some other work needs to be done when REG_C is read,
-so I wrote that patch.
-
-Thanks,
-Gonglei
-From: Paolo Bonzini
-To: 龚磊, 张海亮, qemu-devel, Michael S. Tsirkin
-Cc: 黄伟栋, 王欣, 谢祥有
-Sent: 2017-12-02 01:10:08
-Subject: Re: [BUG] Windows 7 got stuck easily while run PCMark10 application
-
-On 01/12/2017 08:08, Gonglei (Arei) wrote:
-> First write to 0x70, cmos_index = 0xc & 0x7f = 0xc
->        CPU 0/KVM-15566 kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
-> Second write to 0x70, cmos_index = 0x86 & 0x7f = 0x6
->        CPU 1/KVM-15567 kvm_pio: pio_write at 0x70 size 1 count 1 val 0x86
-> vcpu0 read 0x6 because cmos_index is 0x6 now:
->        CPU 0/KVM-15566 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-> vcpu1 read 0x6:
->        CPU 1/KVM-15567 kvm_pio: pio_read at 0x71 size 1 count 1 val 0x6
-
-This seems to be a Windows bug.  The easiest workaround that I
-can think of is to clear the interrupts already when 0xc is written,
-without waiting for the read (because REG_C can only be read).
-
-What do you think?
-
-Thanks,
-
-Paolo
-
-On 01/12/2017 18:45, Gonglei (Arei) wrote:
-> I also think it's windows bug, the problem is that it doesn't occur on
-> xen platform.
-
-It's a race, it may just be that RTC PIO is faster in Xen because it's
-implemented in the hypervisor.
-
-I will try reporting it to Microsoft.
-
-Thanks,
-
-Paolo
-
-On 2017/12/2 2:37, Paolo Bonzini wrote:
-> On 01/12/2017 18:45, Gonglei (Arei) wrote:
->> I also think it's windows bug, the problem is that it doesn't occur on
->> xen platform.
->
-> It's a race, it may just be that RTC PIO is faster in Xen because it's
-> implemented in the hypervisor.
-
-No. Xen does not have this problem because it injects the RTC irq without
-checking whether the previous irq has been cleared, whereas KVM does perform
-that check.
-
-static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
-        int irq_level, bool line_status)
-{
-   ... ...
-    if (!irq_level) {
-        /* Clear the RTC irq in irr; otherwise the RTC irq cannot be injected again. */
-        ioapic->irr &= ~mask;
-        ret = 1;
-        goto out;
-    }
-
-I agree that we should move the operation of clearing the RTC irq from
-cmos_ioport_read() to cmos_ioport_write() to ensure that it is always done.
-
-Thanks,
-Hailiang
-