Commit message  Author  Files  Lines
2023-04-24docs/specs/pci-ids: Convert from txt to rSTPeter Maydell3-70/+99
Convert the pci-ids document from plain text to reStructuredText. I opted to use definition-lists here because rST tables are super-clunky, and actually formatting these as tables didn't seem necessary. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230420160334.1048224-2-peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-24acpi: pcihp: allow repeating hot-unplug requestsIgor Mammedov1-0/+10
With Q35 using ACPI PCI hotplug by default, a user's request to unplug a device is ignored when it is issued before the guest OS has booted, and any additional attempt to request device hot-unplug afterwards results in the following error: "Device XYZ is already in the process of unplug". Arguably this can be considered a regression introduced by [2], before which it was possible to issue an unplug request multiple times. Accept new unplug requests after a timeout (1ms). This brings ACPI PCI hotplug on par with native PCIe unplug behavior [1] and allows the user to repeat unplug requests at proper times. Set the expire timeout to an arbitrary 1msec so the user won't be able to flood the guest with SCI interrupts by calling device_del in a tight loop. PS: The ACPI spec doesn't mandate what OSPM can do with GPEx.status bits set before it's booted => it's implementation dependent. Status bits may be retained (I tested with one Windows version) or cleared (Linux since 2.6 kernel times) during the guest's ACPI subsystem initialization. Clearing status bits (though not wrong per se) hides the unplug event from the guest, and it's up to the user to repeat device_del later when the guest is able to handle unplug requests. 1) 18416c62e3 ("pcie: expire pending delete") 2) Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del") Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> CC: mst@redhat.com CC: anisinha@redhat.com CC: jusual@redhat.com CC: kraxel@redhat.com Message-Id: <20230418090449.2155757-1-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com>
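The expiry rule described in this commit can be illustrated with a minimal sketch. It is not the QEMU implementation: `UnplugState`, `request_unplug()` and the nanosecond clock argument are hypothetical names chosen for the example, and only the 1 ms timeout comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UNPLUG_EXPIRE_NS 1000000LL /* 1 ms, the timeout named in the commit */

/* Hypothetical per-device bookkeeping; not a QEMU structure. */
typedef struct {
    bool pending;          /* an unplug request is outstanding */
    int64_t expire_at_ns;  /* time after which a repeat is accepted */
} UnplugState;

/* Returns true when the request is accepted and the guest should be notified. */
static bool request_unplug(UnplugState *s, int64_t now_ns)
{
    if (s->pending && now_ns < s->expire_at_ns) {
        return false; /* still "in the process of unplug": reject */
    }
    s->pending = true;
    s->expire_at_ns = now_ns + UNPLUG_EXPIRE_NS;
    return true;
}
```

A request repeated inside the window is rejected, which is what limits the rate of SCI interrupts; a request repeated after the window is forwarded to the guest again.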
2023-04-24virtio: i2c: Check notifier helpers for VIRTIO_CONFIG_IRQ_IDXViresh Kumar1-0/+16
Since the driver doesn't support interrupts, we must return early when index is set to VIRTIO_CONFIG_IRQ_IDX. Fixes: 544f0278afca ("virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Message-Id: <d53ec8bc002001eafac597f6bd9a8812df989257.1681790067.git.viresh.kumar@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
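As a rough sketch of the early-return pattern this commit describes: the constant `CONFIG_IRQ_IDX` stands in for VIRTIO_CONFIG_IRQ_IDX (its value of -1 is an assumption here), and `queue_mask()` is a hypothetical stand-in for whatever per-queue helper the driver forwards to.

```c
#include <assert.h>
#include <stdbool.h>

#define CONFIG_IRQ_IDX (-1) /* stand-in for VIRTIO_CONFIG_IRQ_IDX; value assumed */

static int mask_calls; /* counts how often the per-queue helper runs */

/* Hypothetical per-queue helper; only valid for real queue indexes. */
static void queue_mask(int idx, bool mask)
{
    (void)idx;
    (void)mask;
    mask_calls++;
}

/* The device has no config interrupt, so that index is ignored up front. */
static void guest_notifier_mask(int idx, bool mask)
{
    if (idx == CONFIG_IRQ_IDX) {
        return;
    }
    queue_mask(idx, mask);
}
```

The point of the guard is that the config-interrupt index never reaches code that treats it as a virtqueue number.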
2023-04-24docs: Remove obsolete descriptions of SR-IOV supportAkihiko Odaki1-4/+1
The documentation used to say there is no device implemented with SR-IOV, but igb and nvme support SR-IOV today. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230414090441.23156-1-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-24intel_iommu: refine iotlb hash calculationJason Wang2-7/+8
Commit 1b2b12376c8 ("intel-iommu: PASID support") takes PASID into account when calculating iotlb hash like: static guint vtd_iotlb_hash(gconstpointer v) { const struct vtd_iotlb_key *key = v; return key->gfn | ((key->sid) << VTD_IOTLB_SID_SHIFT) | (key->level) << VTD_IOTLB_LVL_SHIFT | (key->pasid) << VTD_IOTLB_PASID_SHIFT; } This turns out to be problematic since: - the shift will lose bits if not converting to uint64_t - level should be off by one in order to fit into 2 bits - VTD_IOTLB_PASID_SHIFT is 30 but PASID is 20 bits which will waste some bits - the hash result is uint64_t so we will lose bits when converting to guint So this patch fixes them by - converting the keys into uint64_t before doing the shift - off level by one to make it fit into two bits - change the sid, lvl and pasid shift to 26, 42 and 44 in order to take the full width of uint64_t - perform an XOR to the top 32bit with the bottom 32bit for the final result to fit guint Fixes: Coverity CID 1508100 Fixes: 1b2b12376c8 ("intel-iommu: PASID support") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230412073510.7158-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
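The four fixes listed above can be sketched in standalone C. This is an illustration under stated assumptions, not the patched QEMU function: the `IotlbKey` layout and the function name are hypothetical, `uint32_t` stands in for `guint`, and only the shift values (26, 42, 44), the level adjustment and the XOR fold are taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define SID_SHIFT   26
#define LVL_SHIFT   42
#define PASID_SHIFT 44

/* Hypothetical key layout for the example. */
typedef struct {
    uint64_t gfn;
    uint32_t pasid; /* 20 significant bits */
    uint16_t sid;
    uint8_t level;  /* starts at 1, so level - 1 fits in two bits */
} IotlbKey;

static uint32_t iotlb_hash(const IotlbKey *key)
{
    /* Widen every field to 64 bits before shifting so no bits are lost. */
    uint64_t h = key->gfn
               | ((uint64_t)key->sid << SID_SHIFT)
               | ((uint64_t)(key->level - 1) << LVL_SHIFT)
               | ((uint64_t)key->pasid << PASID_SHIFT);

    /* Fold the top 32 bits into the bottom 32 to fit the 32-bit result. */
    return (uint32_t)((h >> 32) ^ (h & 0xffffffffU));
}
```

Because of the fold, a PASID of 1 contributes bit 44 of the 64-bit value, which lands on bit 12 of the result instead of being truncated away.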
2023-04-24docs/cxl: Fix sentenceStefan Weil1-1/+1
Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20230409201828.1159568-1-sw@weilnetz.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-24MAINTAINERS: Add Eugenio Pérez as vhost-shadow-virtqueue reviewerEugenio Pérez1-0/+4
I'd like to be notified on SVQ patches and review them. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230331150410.2627214-1-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2023-04-24tests: bios-tables-test: replace memset with initializerPaolo Bonzini1-80/+43
Coverity complains that memset() writes over a const field. Use an initializer instead, so that the const field is left to zero. Tests that have to write the const field already use an initializer for the whole struct, here I am choosing the smallest possible patch (which is not that small already). Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230330131109.47856-1-pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
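The difference between the two approaches can be shown with a small sketch. `TestData` and its members are hypothetical and much smaller than the structure used by the test; the point is only that a designated initializer zeroes the members it does not name, including a const one, without ever writing to it afterwards.

```c
#include <assert.h>

/* Hypothetical structure with a const member. */
typedef struct {
    const char *machine;
    const int flags; /* const: must not be written after initialization */
    int ram_mb;
} TestData;

static TestData make_test_data(const char *machine, int ram_mb)
{
    /*
     * memset(&d, 0, sizeof(d)) would write over the const member.
     * A designated initializer zero-initializes unnamed members instead.
     */
    TestData d = {
        .machine = machine,
        .ram_mb = ram_mb,
    };
    return d;
}
```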
2023-04-24hw/acpi: limit warning on acpi table size to pc machines older than version 2.3Ani Sinha4-2/+9
i440fx machine versions 2.3 and newer support dynamic RAM resizing. See commit a1666142db6233 ("acpi-build: make ROMs RAM blocks resizeable"). All currently supported q35 machine types (versions 2.4 and newer) support resizable RAM/ROM blocks. Therefore the warning generated when the ACPI table size exceeds a pre-defined value does not apply to those machine versions. Add a check limiting the warning message to only those machines that do not support expandable RAM blocks (that is, i440fx machines with version 2.2 and older). Signed-off-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230329045726.14028-1-anisinha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-24Add my old and new work email mapping and use work email to support acpiAni Sinha1-1/+1
Updating mailmap to indicate ani@anisinha.ca and anisinha@redhat.com are one and the same person. Also updating my email in MAINTAINERS for all my acpi work (reviewing patches and biosbits) to my work email. Also doing the same for bios bits test framework documentation. Signed-off-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230329040834.11973-1-anisinha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21vhost-user-blk-server: notify client about disk resizeVladimir Sementsov-Ogievskiy3-0/+36
Currently the block_resize QMP command is simply ignored by the vhost-user-blk export. So the block node is successfully resized, but the virtio config is unchanged and the guest doesn't see that the disk has been resized. Let's handle the resize by modifying the config and notifying the guest appropriately. After this commit, lsblk in a Linux guest with an attached vhost-user-blk-pci device shows the new size immediately after a block_resize QMP command on the vhost-user exported block node. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230321201323.3695923-1-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21pci: avoid accessing slot_reserved_mask directly outside of pci.cChuck Zmudzinski4-8/+24
This patch provides accessor functions as replacements for direct access to slot_reserved_mask according to the comment at the top of include/hw/pci/pci_bus.h which advises that data structures for PCIBus should not be directly accessed but instead be accessed using accessor functions in pci.h. Three accessor functions can conveniently replace all direct accesses of slot_reserved_mask. With this patch, the new accessor functions are used in hw/sparc64/sun4u.c and hw/xen/xen_pt.c and pci_bus.h is removed from the included header files of the same two files. No functional change intended. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> Message-Id: <b1b7f134883cbc83e455abbe5ee225c71aa0e8d0.1678888385.git.brchuckz@aol.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [sun4u]
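A minimal sketch of the accessor pattern, assuming a get/set/clear split for the three functions. The `Bus` type and the function names below are hypothetical stand-ins rather than the declarations added to pci.h, and the OR/AND-NOT semantics are an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the bus structure that callers should not touch. */
typedef struct {
    uint32_t slot_reserved_mask; /* one bit per reserved slot */
} Bus;

static uint32_t bus_get_slot_reserved_mask(const Bus *bus)
{
    return bus->slot_reserved_mask;
}

/* Reserve the slots in 'mask' (assumed semantics: OR into the field). */
static void bus_set_slot_reserved_mask(Bus *bus, uint32_t mask)
{
    bus->slot_reserved_mask |= mask;
}

/* Release the slots in 'mask' (assumed semantics: clear those bits). */
static void bus_clear_slot_reserved_mask(Bus *bus, uint32_t mask)
{
    bus->slot_reserved_mask &= ~mask;
}
```

With accessors like these, board code only needs the function prototypes and no longer has to include the header that exposes the structure layout.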
2023-04-21hw: Add compat machines for 8.1Cornelia Huck10-10/+79
Add 8.1 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20230314173009.152667-1-cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21hw/i386/amd_iommu: Factor amdvi_pci_realize out of amdvi_sysbus_realizePhilippe Mathieu-Daudé2-28/+39
Aside the Frankenstein model of a SysBusDevice realizing a PCIDevice, QOM parents shouldn't access children internals. In this particular case, amdvi_sysbus_realize() is just open-coding TYPE_AMD_IOMMU_PCI's DeviceRealize() handler. Factor it out. Declare QOM-cast macros with OBJECT_DECLARE_SIMPLE_TYPE() so we can cast the AMDVIPCIState in amdvi_pci_realize(). Note this commit removes the single use in the repository of pci_add_capability() and msi_init() on a *realized* QDev instance. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-7-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21hw/i386/amd_iommu: Set PCI static/const fields via PCIDeviceClassPhilippe Mathieu-Daudé1-2/+4
Set PCI static/const fields once in amdvi_pci_class_init. They will be propagated via DeviceClassRealize handler via pci_qdev_realize() -> do_pci_register_device() -> pci_config_set*(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-6-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21hw/i386/amd_iommu: Move capab_offset from AMDVIState to AMDVIPCIStatePhilippe Mathieu-Daudé3-9/+9
The 'PCI capability offset' is a *PCI* notion. Since AMDVIPCIState inherits PCIDevice and hold PCI-related fields, move capab_offset from AMDVIState to AMDVIPCIState. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-5-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21hw/i386/amd_iommu: Remove intermediate AMDVIState::devid fieldPhilippe Mathieu-Daudé3-5/+3
AMDVIState::devid is only accessed by build_amd_iommu() which has access to the PCIDevice state. Directly get the property calling object_property_get_int() there. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-4-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21hw/i386/amd_iommu: Explicit use of AMDVI_BASE_ADDR in amdvi_initPhilippe Mathieu-Daudé1-2/+2
By accessing MemoryRegion internals, amdvi_init() gives the false idea that the PCI BAR can be modified. However this isn't true (at least the model isn't ready for that): the device is explicitly mapped at the fixed AMDVI_BASE_ADDR address in amdvi_sysbus_realize(). Since the SysBus API isn't designed to remap regions, directly use the fixed address in amdvi_init(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-3-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21MAINTAINERS: Mark AMD-Vi emulation as orphanPhilippe Mathieu-Daudé1-0/+4
hw/i386/amd_iommu.c seems unmaintained: After commit 1c7955c450 ("x86-iommu: introduce parent class", 2016-07-14), almost no feature added, 2 bug fixes, other changes are generic tree-wide API cleanups. Cc: Roman Kapl <rka@sysgo.com> Cc: Wei Huang <wei.huang2@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Kiarie <davidkiarie4@gmail.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230313153031.86107-2-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21virtio-balloon: optimize the virtio-balloon on the ARM platformYangming3-28/+14
Optimize the virtio-balloon feature on the ARM platform by adding a variable to keep track of the current hot-plugged pc-dimm size, instead of traversing the virtual machine's memory modules to count the current RAM size during the balloon inflation or deflation process. This variable can be updated only when plugging or unplugging the device, which will result in an increase of approximately 60% efficiency of balloon process on the ARM platform. We tested the total amount of time required for the balloon inflation process on ARM: inflate the balloon to 64GB of a 128GB guest under stress. Before: 102 seconds After: 42 seconds Signed-off-by: Qi Xi <xiqi2@huawei.com> Signed-off-by: Ming Yang yangming73@huawei.com Message-Id: <e13bc78f96774bfab4576814c293aa52@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
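The optimization amounts to replacing a walk over all memory modules with a counter maintained at plug and unplug time. The sketch below illustrates that idea only; `Machine`, `Dimm`, `plug()`, `unplug()` and `walk_size()` are hypothetical names, not the QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory module and machine state. */
typedef struct Dimm {
    uint64_t size;
    struct Dimm *next;
} Dimm;

typedef struct {
    Dimm *dimms;              /* list of plugged modules */
    uint64_t hotplugged_size; /* running total, updated on plug/unplug */
} Machine;

/* Old approach: traverse every module on each query. */
static uint64_t walk_size(const Machine *m)
{
    uint64_t total = 0;
    for (const Dimm *d = m->dimms; d; d = d->next) {
        total += d->size;
    }
    return total;
}

static void plug(Machine *m, Dimm *d)
{
    d->next = m->dimms;
    m->dimms = d;
    m->hotplugged_size += d->size;
}

static void unplug(Machine *m, Dimm *d)
{
    for (Dimm **p = &m->dimms; *p; p = &(*p)->next) {
        if (*p == d) {
            *p = d->next;
            m->hotplugged_size -= d->size;
            return;
        }
    }
}
```

The balloon path can then read `hotplugged_size` in constant time, and the two values stay equal as long as every plug and unplug goes through these helpers.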
2023-04-21docs: vhost-user: Add Xen specific memory mapping supportViresh Kumar1-0/+21
The current model of memory mapping at the back-end works fine where a standard call to mmap() (for the respective file descriptor) is enough before the front-end can start accessing the guest memory. There are other complex cases though where the back-end needs more information and simple mmap() isn't enough. For example Xen, a type-1 hypervisor, currently supports memory mapping via two different methods, foreign-mapping (via /dev/privcmd) and grant-dev (via /dev/gntdev). In both these cases, the back-end needs to call mmap() and ioctl(), with extra information like the Xen domain-id of the guest whose memory we are trying to map. Add a new protocol feature, 'VHOST_USER_PROTOCOL_F_XEN_MMAP', which lets the back-end know about the additional memory mapping requirements. When this feature is negotiated, the front-end will send the additional information within the memory regions themselves. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Message-Id: <6d0bd7f0e1aeec3ddb603ae4ff334c75c7d0d7b3.1678351495.git.viresh.kumar@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2023-04-21docs: vhost-user: Define memory region separatelyViresh Kumar1-21/+18
The same layout is defined twice, once in "single memory region description" and then in "memory regions description". Separate out details of memory region from these two and reuse the same definition later on. While at it, also rename "memory regions description" to "multiple memory regions description", to avoid potential confusion around similar names. And define single region before multiple ones. This is just a documentation optimization, the protocol remains the same. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Message-Id: <7c3718e5eb99178b22696682ae73aca6df1899c7.1678351495.git.viresh.kumar@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2023-04-21vhost: Drop unused eventfd_add|del hooksPeter Xu1-14/+0
These hooks were introduced in: 80a1ea3748 ("memory: move ioeventfd ops to MemoryListener", 2012-02-29) But they seem to be never used. Drop them. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20230306193209.516011-1-peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21meson_options.txt: Enable qom-cast-debug by default againThomas Huth2-2/+2
This switch had been disabled by default by accident in commit c55cf6ab03f. But we should enable it by default instead to avoid regressions in the QOM device hierarchy. Fixes: c55cf6ab03 ("configure, meson: move some default-disabled options to meson_options.txt") Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230417130037.236747-3-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-04-21vdpa: accept VIRTIO_NET_F_SPEED_DUPLEX in SVQEugenio Pérez1-1/+2
There is no reason to block it as it has nothing to do with the vrings. All the support of the feature comes via config space. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Suggested-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Message-Id: <20230307170018.260557-1-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21Add my old and new work email mapping and use work email to support biosbitsAni Sinha3-2/+3
Update mailmap to indicate ani@anisinha.ca and anisinha@redhat.com are one and the same person. Additionally update MAINTAINERS and bits documentation to use my work (redhat) email. Signed-off-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230320114233.90638-1-anisinha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21virtio: refresh vring region cache after updating a virtqueue sizeCarlos López5-1/+5
When a virtqueue size is changed by the guest via virtio_queue_set_num(), its region cache is not automatically updated. If the size was increased, this could lead to accessing the cache out of bounds. For example, in vring_get_used_event(): static inline uint16_t vring_get_used_event(VirtQueue *vq) { return vring_avail_ring(vq, vq->vring.num); } static inline uint16_t vring_avail_ring(VirtQueue *vq, int i) { VRingMemoryRegionCaches *caches = vring_get_region_caches(vq); hwaddr pa = offsetof(VRingAvail, ring[i]); if (!caches) { return 0; } return virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa); } vq->vring.num will be greater than caches->avail.len, which will trigger a failed assertion down the call path of virtio_lduw_phys_cached(). Fix this by calling virtio_init_region_cache() after virtio_queue_set_num() if we are not already calling virtio_queue_set_rings(). In the legacy path this is already done by virtio_queue_update_rings(). Signed-off-by: Carlos López <clopez@suse.de> Message-Id: <20230317002749.27379-1-clopez@suse.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
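The out-of-bounds condition can be modeled with a few lines of arithmetic. This is a simplified sketch, not the virtio code: `Queue`, `init_region_cache()` and `queue_set_num()` are hypothetical, and the layout assumed for the available ring is a 4-byte header followed by 2-byte entries, with the used_event field stored at entry index `num`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AVAIL_HDR 4 /* assumed: 16-bit flags + 16-bit idx */

/* Hypothetical queue model: ring size plus the length its cache covers. */
typedef struct {
    unsigned num;
    size_t avail_cache_len;
} Queue;

/* Size the cache for the current ring: header, num entries, used_event. */
static void init_region_cache(Queue *q)
{
    q->avail_cache_len = AVAIL_HDR + 2 * (size_t)q->num + 2;
}

static void queue_set_num(Queue *q, unsigned num, bool refresh_cache)
{
    q->num = num;
    if (refresh_cache) {
        init_region_cache(q); /* the step the fix adds */
    }
}

/* used_event lives at ring[num]; check that its 2 bytes are inside the cache. */
static bool used_event_in_bounds(const Queue *q)
{
    size_t offset = AVAIL_HDR + 2 * (size_t)q->num;
    return offset + 2 <= q->avail_cache_len;
}
```

Growing the ring without refreshing the cache leaves the offset of used_event past the cached length, which corresponds to the failed assertion described above.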
2023-04-19Update version for v8.0.0 releasePeter Maydell1-1/+1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-04-13Update version for v8.0.0-rc4 releasePeter Maydell1-1/+1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-04-13hw/xen: Fix double-free in xen_console store_con_info()David Woodhouse1-10/+3
Coverity spotted a double-free (CID 1508254); we g_string_free(path) and then for some reason immediately call free(path) too. We should just use g_autoptr() for it anyway, which simplifies the code a bit. Fixes: 7a8a749da7d3 ("hw/xen: Move xenstore_store_pv_console_info to xen_console.c") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-04-12migration: fix ram_state_pending_exact()Juan Quintela1-1/+2
I removed that bit on commit: commit c8df4a7aeffcb46020f610526eea621fa5b0cd47 Author: Juan Quintela <quintela@redhat.com> Date: Mon Oct 3 02:00:03 2022 +0200 migration: Split save_live_pending() into state_pending_* Fixes: c8df4a7aeffcb46020f610526eea621fa5b0cd47 Suggested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12migration/ram.c: Fix migration with compress enabledLukas Straub1-13/+11
Since ec6f3ab9, migration with compress enabled was broken, because the compress threads use a dummy QEMUFile which just acts as a buffer and that commit accidentally changed it to use the outgoing migration channel instead. Fix this by using the dummy file again in the compress threads. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12migration: Recover behavior of preempt channel creation for pre-7.2Peter Xu3-2/+24
In 8.0 devel window we reworked preempt channel creation, so that there'll be no race condition when the migration channel and preempt channel got established in the wrong order in commit 5655aab079. However no one noticed that the change will also be not compatible with older qemus, majorly 7.1/7.2 versions where preempt mode started to be supported. Leverage the same pre-7.2 flag introduced in the previous patch to recover the behavior hopefully before 8.0 releases, so we don't break migration when we migrate from 8.0 to older qemu binaries. Fixes: 5655aab079 ("migration: Postpone postcopy preempt channel to be after main") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12migration: Fix potential race on postcopy_qemufile_srcPeter Xu4-8/+57
The postcopy_qemufile_src object should be owned by one thread, either the main thread (e.g. at the beginning or at the end of migration), or the return path thread (during a preempt-enabled postcopy migration). If that's not the case, access to the object might be racy. postcopy_preempt_shutdown_file() can be potentially racy, because it's called at the end phase of migration on the main thread, at which point the return path thread hasn't yet been recycled; the recycle happens in await_return_path_close_on_source(), which is after this point. It means that logically it's possible that the main thread and the return path thread are both operating on the same qemufile, and I don't think qemufile is thread safe at all. postcopy_preempt_shutdown_file() used to be needed because that's where we send EOS to dest so that dest can safely shut down the preempt thread. To avoid the possible race, remove the only place where a race can happen. Instead we figure out another way to safely close the preempt thread on dest. The core idea during postcopy on deciding "when to stop" is that dest will send a postcopy SHUT message to src, telling src that all data is there. Hence to shut down the dest preempt thread it may be better to do it directly on the dest node. This patch proposes such a way: we change postcopy_prio_thread_created into PreemptThreadStatus, so that we kick the preempt thread on dest qemu by a sequence of: mis->preempt_thread_status = PREEMPT_THREAD_QUIT; qemu_file_shutdown(mis->postcopy_qemufile_dst); Here shutdown() is probably so far the easiest way to kick the preempt thread out of a blocked qemu_get_be64(). The thread then reads preempt_thread_status to make sure it's not a network failure but a willingness to quit the thread. We could have avoided that extra status and just relied on the migration status. The problem is that postcopy_ram_incoming_cleanup() is called early enough that we're still in POSTCOPY_ACTIVE no matter what. So just keep it simple and introduce the status. One flag x-preempt-pre-7-2 is added to keep the old pre-7.2 behaviors of postcopy preempt. Fixes: 9358982744 ("migration: Send requested page directly in rp-return thread") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12io: tls: Inherit QIO_CHANNEL_FEATURE_SHUTDOWN on server sidePeter Xu1-0/+3
TLS iochannel will inherit io_shutdown() from the master ioc, however we missed to do that on the server side. This will e.g. allow qemu_file_shutdown() to work on dest QEMU too for migration. Acked-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12block/nfs: do not poll within a coroutinePaolo Bonzini1-11/+13
Since the former nfs_get_allocated_file_size is now a coroutine function, it must suspend rather than poll. Switch BDRV_POLL_WHILE() to a qemu_coroutine_yield() loop and schedule nfs_co_generic_bh_cb() in place of the call to bdrv_wakeup(). Fixes: 82618d7bc341 ("block: Convert bdrv_get_allocated_file_size() to co_wrapper", 2023-02-01) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230412112606.80983-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-04-12hw/nvme: fix memory leak in nvme_dsmKlaus Jensen1-0/+3
The iocb (and the allocated memory to hold LBA ranges) leaks if reading the LBA ranges fails. Fix this by adding a free and an unref of the iocb. Reported-by: Coverity (CID 1508281) Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2023-04-12hw/nvme: fix memory leak in fdp ruhid parsingKlaus Jensen1-1/+2
Coverity reports a memory leak of memory when parsing ruhids at namespace initialization. Since this is just working memory, not needed beyond the scope of the functions, fix this by adding a g_autofree annotation. Reported-by: Coverity (CID 1507979) Fixes: 73064edfb864 ("hw/nvme: flexible data placement emulation") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2023-04-11block, block-backend: write some hot coroutine wrappers by handPaolo Bonzini4-4/+52
The introduction of the graph lock is causing blk_get_geometry, a hot function used in the I/O path, to create a coroutine. However, the only part that really needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors, which in turn only happens in the rare case of host CD-ROM devices. So, write by hand the three wrappers on the path from blk_co_get_geometry to bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors. Reported-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block-backend: ignore inserted state in blk_co_nb_sectorsPaolo Bonzini1-15/+8
All callers of blk_co_nb_sectors (and blk_nb_sectors) are able to handle a non-inserted CD-ROM as a zero-length file, they do not need to raise an error. Not using blk_co_is_available() aligns the function with blk_co_get_geometry(), which becomes a simple wrapper for blk_co_nb_sectors(). It will also make it possible to skip the creation of a coroutine in the (common) case where bs->bl.has_variable_length is false. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-8-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block-backend: inline bdrv_co_get_geometryPaolo Bonzini3-15/+6
bdrv_co_get_geometry is only used in blk_co_get_geometry. Inline it in there, to reduce the number of wrappers for bs->total_sectors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-7-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11migration/block: replace uses of blk_nb_sectors that do not check resultPaolo Bonzini1-3/+2
Uses of blk_nb_sectors must check whether the result is negative. Otherwise, underflow can happen. Fortunately, alloc_aio_bitmap() and bmds_aio_inflight() both have an alternative way to retrieve the number of sectors in the file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-6-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
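The rule stated here, that a possibly negative sector count must be checked before it enters size arithmetic, can be sketched as follows. The helper `chunks_needed()` is hypothetical and is not how the commit fixes the callers (they switch to a sector count they already have); it only shows why the check matters.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of chunks needed to cover 'nb_sectors'.
 * 'nb_sectors' follows the convention of returning a negative errno
 * value on failure, so it must be checked before any arithmetic.
 */
static int chunks_needed(int64_t nb_sectors, int64_t sectors_per_chunk,
                         uint64_t *out)
{
    if (nb_sectors < 0) {
        return (int)nb_sectors; /* propagate the error, do not compute */
    }
    *out = ((uint64_t)nb_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
    return 0;
}
```

Without the check, a negative value converted to an unsigned type becomes an enormous count, and any allocation or loop bound derived from it is wrong.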
2023-04-11block: remove has_variable_length from BlockDriverPaolo Bonzini5-10/+11
Fill in the field in BlockLimits directly for host devices, and copy it from there for the raw format. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-5-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block: refresh bs->total_sectors on reopenPaolo Bonzini1-0/+1
After reopening a BlockDriverState, it's possible that the size of the underlying file has changed. This for example is covered by test 171. Right now, this is handled by the raw driver's has_variable_length = true setting. Since this will be removed by the next patch, handle it on reopen instead, together with the existing bdrv_refresh_limits. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-4-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block: remove has_variable_length from filtersPaolo Bonzini4-4/+0
Filters automatically get has_variable_length from their underlying BlockDriverState. There is no need to mark them as variable-length in the BlockDriver. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-3-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block: move has_variable_length to BlockLimitsPaolo Bonzini3-1/+15
At the protocol level, has_variable_length only needs to be true in the very special case of host CD-ROM drives, so that they do not need an explicit monitor command to read the new size when a disc is loaded in the tray. However, at the format level has_variable_length has to be true for all raw blockdevs and for all filters, even though in practice the length depends on the underlying file and thus will not change except in the case of host CD-ROM drives. As a first step towards computing an accurate value of has_variable_length, add the value into the BlockLimits structure and initialize the field from the BlockDriver. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-2-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11hw/i2c/allwinner-i2c: Fix subclassing of TYPE_AW_I2C_SUN6IPeter Maydell1-3/+1
In commit 8461bfdca9c we added the TYPE_AW_I2C_SUN6I, which is a minor variant of the TYPE_AW_I2C device. However, we didn't quite get the class hierarchy right. We made the new TYPE_AW_I2C_SUN6I a subclass of TYPE_SYS_BUS_DEVICE, which means that you can't validly use a pointer to this object via the AW_I2C() cast macro, which insists on having something that is an instance of TYPE_AW_I2C or some subclass of that type. This only causes a problem if QOM cast macro debugging is enabled; that is supposed to be on by default, but a mistake in the meson conversion in commit c55cf6ab03f4c meant that it ended up disabled by default, and we didn't catch this bug. Fix the problem by arranging the classes in the same way we do for TYPE_PL011 and TYPE_PL011_LUMINARY in hw/char/pl011.c -- make the variant class be a subclass of the "normal" version of the device. This was reported in https://gitlab.com/qemu-project/qemu/-/issues/1586 but this fix alone isn't sufficient, as there is a separate cast-related issue in the CXL code in pci_expander_bridge.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Corey Minyard <cminyard@mvista.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
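Why the cast check fails for a sibling type but succeeds for a subclass can be shown with a toy type hierarchy. This is not QOM: `TypeInfoSketch`, `is_a()` and the type-name strings are hypothetical, and the check is reduced to walking parent pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type descriptor: a name and a link to the parent type. */
typedef struct TypeInfoSketch {
    const char *name;
    const struct TypeInfoSketch *parent;
} TypeInfoSketch;

static const TypeInfoSketch sysbus_type = { "sysbus-device", NULL };
static const TypeInfoSketch i2c_type = { "i2c", &sysbus_type };

/* Wrong: the variant is a sibling of the base device type. */
static const TypeInfoSketch variant_wrong = { "i2c-variant", &sysbus_type };
/* Fixed: the variant is a subclass of the base device type. */
static const TypeInfoSketch variant_fixed = { "i2c-variant", &i2c_type };

/* A checked cast to 'target' is valid only if 'target' is an ancestor. */
static bool is_a(const TypeInfoSketch *t, const TypeInfoSketch *target)
{
    for (; t; t = t->parent) {
        if (t == target) {
            return true;
        }
    }
    return false;
}
```

With cast debugging disabled, the invalid cast in the wrong arrangement goes unnoticed, which matches how the bug escaped detection.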
2023-04-11iotests: Regression test for vhdx log corruptionKevin Wolf2-0/+76
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230411115231.90398-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block/vhdx: fix dynamic VHDX BAT corruptionLukas Tschoke1-1/+1
The corruption occurs when a BAT entry aligned to 4096 bytes is changed. Specifically, the corruption occurs during the creation of the LOG Data Descriptor. The incorrect behavior involves copying 4088 bytes from the original 4096 bytes aligned offset to `tmp[8..4096]` and then copying the new value for the first BAT entry to the beginning `tmp[0..8]`. This results in all existing BAT entries inside the 4K region being incorrectly moved by 8 bytes and the last entry being lost. This bug did not cause noticeable corruption when only sequentially writing once to an empty dynamic VHDX (e.g. using `qemu-img convert -O vhdx -o subformat=dynamic ...`), but it still resulted in invalid values for the (unused) Sector Bitmap BAT entries. Importantly, this corruption would only become noticeable after the corrupted BAT is re-read from the file. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/727 Cc: qemu-stable@nongnu.org Signed-off-by: Lukas Tschoke <lukts330@gmail.com> Message-Id: <6cfb6d6b-adc5-7772-c8a5-6bae9a0ad668@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
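The 8-byte shift described above can be reproduced on in-memory buffers. The sketch below is a model of the copy pattern only, assuming a 4096-byte sector and 8-byte entries as in the commit message; `merge_buggy()` and `merge_fixed()` are hypothetical names and do not correspond to functions in the VHDX driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR 4096

/* Buggy merge: the old tail is read from the sector start, so every
 * existing entry moves forward by 'len' bytes and the last one is lost. */
static void merge_buggy(uint8_t *out, const uint8_t *old,
                        const uint8_t *new_data, size_t len)
{
    memcpy(out, new_data, len);
    memcpy(out + len, old, SECTOR - len);
}

/* Fixed merge: the old tail is read from offset 'len', so it stays in place. */
static void merge_fixed(uint8_t *out, const uint8_t *old,
                        const uint8_t *new_data, size_t len)
{
    memcpy(out, new_data, len);
    memcpy(out + len, old + len, SECTOR - len);
}
```

Updating only the first entry with the fixed merge leaves bytes 8 to 4095 untouched, while the buggy merge places the old first entry where the second one should be.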
2023-04-10tcg/ppc: Fix TCG_TARGET_CALL_{ARG,RET}_I128 for ppc32Richard Henderson1-3/+4
For both _CALL_SYSV and _CALL_DARWIN, return is by reference, not in 4 integer registers. For _CALL_SYSV, argument is also by reference. This error resulted in $ ./qemu-system-i386 -nographic qemu-system-i386: tcg/ppc/tcg-target.c.inc:185: \ tcg_target_call_oarg_reg: Assertion `slot >= 0 && slot <= 1' failed. Fixes: 5427a9a7604 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128") Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>