summary refs log tree commit diff stats
path: root/include (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* | hw/arm/xlnx-versal: xram: refactor creationLuc Michel2025-10-071-6/+0
| | | | | | | | | | | | | | | | | | | | | | Refactor the XRAM devices creation using the VersalMap structure. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-9-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: adma: refactor creationLuc Michel2025-10-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the ADMA creation using the VersalMap structure. Note that the connection to the CRL is removed for now and will be re-added by next commits. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-8-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: gem: refactor creationLuc Michel2025-10-071-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the GEM ethernet controllers creation using the VersalMap structure. Note that the connection to the CRL is removed for now and will be re-added by next commits. The FDT nodes are created in reverse order compared to the devices creation to keep backward compatibility with the previous generated FDTs. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-7-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: sdhci: refactor creationLuc Michel2025-10-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | Refactor the SDHCI controllers creation using the VersalMap structure. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-6-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: canfd: refactor creationLuc Michel2025-10-071-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the CAN controllers creation using the VersalMap structure. Note that the connection to the CRL is removed for now and will be re-added by next commits. The xlnx-versal-virt machine now dynamically creates the correct amount of CAN bus link properties based on the number of CAN controller advertised by the SoC. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-5-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: uart: refactor creationLuc Michel2025-10-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the UARTs creations. The VersalMap struct is now used to describe the SoC and its peripherals. For now it contains the two UARTs mapping information. The creation function now embeds the FDT creation logic as well. The devices are now created dynamically using qdev_new and (qdev|sysbus)_realize_and_unref. This will allow to rely entirely on the VersalMap structure to create the SoC and allow easy addition of new SoCs of the same family (like versal2 coming with next commits). Note that the connection to the CRL is removed for now and will be re-added by next commits. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-4-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: prepare for FDT creationLuc Michel2025-10-071-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following commits will move FDT creation logic from the xlnx-versal-virt machine to the xlnx-versal SoC itself. Prepare this by passing the FDT handle to the SoC before it is realized. For now the SoC only creates the two clock nodes. The ones from the xlnx-versal virt machine are renamed with a `old-' prefix and will be removed once they are not referenced anymore. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-3-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | hw/arm/xlnx-versal: split the xlnx-versal typeLuc Michel2025-10-072-1/+26
|/ | | | | | | | | | | | | | | | | Split the xlnx-versal device into two classes, a base, abstract class and the existing concrete one. Introduce a VersalVersion type that will be used across several device models when versal2 implementation is added. This is in preparation for versal2 implementation. Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20250926070806.292065-2-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Richard Henderson2025-10-0628-41/+403
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging virtio,pci,pc: features, fixes users can now control VM bit in smbios. vhost-user-device is now user-createable. intel_iommu now supports PRI virtio-net now supports GSO over UDP tunnel ghes now supports error injection amd iommu now supports dma remapping for vfio better error messages for virtio small fixes all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmji0s0PHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpuH4H/09h70IqAWZGHIWKGmmGGtdKOj3g54KuI0Ss # mGECEsHvvBexOy670Qy8jdgXfaW4UuNui8BiOnJnGsBX8Y0dy+/yZori3KhkXkaY # D57Ap9agkpHem7Vw0zgNsAF2bzDdlzTiQ6ns5oDnSq8yt82onCb5WGkWTGkPs/jL # Gf8Jv+Ddcpt5SU4/hHPYC8pUhl7z4xPOOyl0Qp1GG21Pxf5v4sGFcWuGGB7UEPSQ # MjZeoM0rSnLDtNg18sGwD5RPLQs13TbtgsVwijI79c3w3rcSpPNhGR5OWkdRCIYF # 8A0Nhq0Yfo0ogTht7yt1QNPf/ktJkuoBuGVirvpDaix2tCBECes= # =Zvq/ # -----END PGP SIGNATURE----- # gpg: Signature made Sun 05 Oct 2025 01:19:25 PM PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [unknown] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (75 commits) virtio: improve virtqueue mapping error messages pci: Fix wrong parameter passing to pci_device_get_iommu_bus_devfn() intel_iommu: Simplify caching mode check with VFIO device intel_iommu: Enable Enhanced Set Root Table Pointer Support (ESRTPS) vdpa-dev: add get_vhost() callback for vhost-vdpa device amd_iommu: HATDis/HATS=11 support intel-iommu: Move dma_translation to x86-iommu amd_iommu: Refactor amdvi_page_walk() to use common code for page walk amd_iommu: Do not assume passthrough translation when DTE[TV]=0 amd_iommu: Toggle address translation mode on devtab entry invalidation amd_iommu: Add dma-remap property to AMD vIOMMU device amd_iommu: Set all address spaces to use passthrough mode on reset amd_iommu: Toggle memory regions based on address translation mode amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL amd_iommu: Add replay callback amd_iommu: Unmap all address spaces under the AMD IOMMU on reset amd_iommu: Use iova_tree records to determine large page size on UNMAP amd_iommu: Sync shadow page tables on page invalidation amd_iommu: Add basic structure to support IOMMU notifier updates amd_iommu: Add a page walker to sync shadow page tables on invalidation ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
| * virtio: improve virtqueue mapping error messagesAlessandro Ratti2025-10-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve error reporting when virtqueue ring mapping fails by including a device identifier in the error message. Introduce a helper qdev_get_printable_name() in qdev-core, which returns either: - the device ID, if explicitly provided (e.g. -device ...,id=foo) - the QOM path from qdev_get_dev_path(dev) otherwise - "<unknown device>" as a fallback when no identifier is present This makes it easier to identify which device triggered the error in multi-device setups or when debugging complex guest configurations. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/230 Buglink: https://bugs.launchpad.net/qemu/+bug/1919021 Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alessandro Ratti <alessandro@0x65c.net> Message-Id: <20250924093138.559872-2-alessandro@0x65c.net> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * intel-iommu: Move dma_translation to x86-iommuJoao Martins2025-10-051-0/+1
| | | | | | | | | | | | | | | | | | | | To be later reused by AMD, now that it shares similar property. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250919213515.917111-22-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * virtio: unify virtio_notify_irqfd() and virtio_notify()Stefan Hajnoczi2025-10-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The difference between these two functions: - virtio_notify() uses the interrupt code path (MSI or classic IRQs) - virtio_notify_irqfd() uses guest notifiers (irqfds) virtio_notify() can only be called with the BQL held because the interrupt code path requires the BQL. Device models use virtio_notify_irqfd() from IOThreads since the BQL is not held. The two functions can be unified by pushing down the if (qemu_in_iothread()) check from virtio-blk and virtio-scsi into core virtio code. This is in preparation for the next commit that will add irqfd support to virtio_notify_config() and where it's unattractive to introduce another irqfd-only API for device model callers. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250922220149.498967-3-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * x86: ich9: fix default value of 'No Reboot' bit in GCSIgor Mammedov2025-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [2] initialized 'No Reboot' bit to 1 by default. And due to quirk it happened to work with linux iTCO_wdt driver (which clears it on module load). However spec [1] states: " R/W. This bit is set when the “No Reboot” strap (SPKR pin on ICH9) is sampled high on PWROK. " So it should be set only when '-global ICH9-LPC.noreboot=true' and cleared when it's false (which should be default). Fix it to behave according to spec and set 'No Reboot' bit only when '-global ICH9-LPC.noreboot=true'. 1) Intel I/O Controller Hub 9 (ICH9) Family Datasheet (rev: 004) 2) Fixes: 920557971b6 (ich9: add TCO interface emulation) Signed-off-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250922132600.562193-1-imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * intel_iommu: Add PRI operations supportCLEMENT MATHIEU--DRIF2025-10-051-0/+1
| | | | | | | | | | | | | | | | | | Implement the PRI callbacks in vtd_iommu_ops. Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901111630.1018573-6-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * pcie: Add a way to get the outstanding page request allocation (pri) from ↵CLEMENT MATHIEU--DRIF2025-10-051-0/+1
| | | | | | | | | | | | | | | | | | the config space. Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901111630.1018573-2-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * hw/virtio: rename vhost-user-device and make user creatableAlex Bennée2025-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We didn't make the device user creatable in the first place because we were worried users might get confused. Rename the device to make its nature as a test device even more explicit. While we are at it add a Kconfig variable so it can be skipped for those that want to thin out their build configuration even further. Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-ID: <20250820195632.1956795-1-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901105948.982583-1-alex.bennee@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * pcie_sriov: Fix broken MMIO accesses from SR-IOV VFsDamien Bergamini2025-10-051-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting with commit cab1398a60eb, SR-IOV VFs are realized as soon as pcie_sriov_pf_init() is called. Because pcie_sriov_pf_init() must be called before pcie_sriov_pf_init_vf_bar(), the VF BARs types won't be known when the VF realize function calls pcie_sriov_vf_register_bar(). This breaks the memory regions of the VFs (for instance with igbvf): $ lspci ... Region 0: Memory at 281a00000 (64-bit, prefetchable) [virtual] [size=16K] Region 3: Memory at 281a20000 (64-bit, prefetchable) [virtual] [size=16K] $ info mtree ... address-space: pci_bridge_pci_mem 0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci 0000000081a00000-0000000081a03fff (prio 1, i/o): igbvf-mmio 0000000081a20000-0000000081a23fff (prio 1, i/o): igbvf-msix and causes MMIO accesses to fail: Invalid write at addr 0x281A01520, size 4, region '(null)', reason: rejected Invalid read at addr 0x281A00C40, size 4, region '(null)', reason: rejected To fix this, VF BARs are now registered with pci_register_bar() which has a type parameter and pcie_sriov_vf_register_bar() is removed. Fixes: cab1398a60eb ("pcie_sriov: Reuse SR-IOV VF device instances") Signed-off-by: Damien Bergamini <damien.bergamini@eviden.com> Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901151314.1038020-1-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * hw/smbios: allow clearing the VM bit in SMBIOS table 0Daniil Tatianin2025-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is useful to be able to freeze a specific version of SeaBIOS to prevent guest visible changes between BIOS updates. This is currently not possible since the extension byte 2 provided by SeaBIOS does not set the VM bit, whereas QEMU sets it unconditionally. Allowing to clear it also seems useful if we want to hide the fact that the guest system is running inside a virtual machine. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250724195409.43499-1-d-tatianin@yandex-team.ru> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * qapi/acpi-hest: add an interface to do generic CPER error injectionMauro Carvalho Chehab2025-10-052-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a QMP command to be used for generic ACPI APEI hardware error injection (HEST) via GHESv2, and add support for it for ARM guests. Error injection uses ACPI_HEST_SRC_ID_QMP source ID to be platform independent. This is mapped at arch virt bindings, depending on the types supported by QEMU and by the BIOS. So, on ARM, this is supported via ACPI_GHES_NOTIFY_GPIO notification type. This patch was co-authored: - original ghes logic to inject a simple ARM record by Shiju Jose; - generic logic to handle block addresses by Jonathan Cameron; - generic GHESv2 error inject by Mauro Carvalho Chehab; Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-authored-by: Shiju Jose <shiju.jose@huawei.com> Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <81e2118b3c8b7e5da341817f277d61251655e0db.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * arm/virt: Wire up a GED error device for ACPI / GHESMauro Carvalho Chehab2025-10-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support to ARM virtualization to allow handling generic error ACPI Event via GED & error source device. It is aligned with Linux Kernel patch: https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/ Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <3237a76b1469d669436399495825348bf34122cd.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/generic_event_device: add an APEI error deviceMauro Carvalho Chehab2025-10-043-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds a generic error device to handle generic hardware error events as specified at ACPI 6.5 specification at 18.3.2.7.2: https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources using HID PNP0C33. The PNP0C33 device is used to report hardware errors to the guest via ACPI APEI Generic Hardware Error Source (GHES). Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <2790f664c849d53de0ce3049fa8c7950c1de1f86.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: add a notifier to notify when error data is readyMauro Carvalho Chehab2025-10-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Some error injection notify methods are async, like GPIO notify. Add a notifier to be used when the error record is ready to be sent to the guest OS. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <edf9c6e5b80dc57e3443893bf9e1eb25cb9d266b.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: don't hard-code the number of sources for HEST tableMauro Carvalho Chehab2025-10-041-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code is actually dependent on having just one error structure with a single source, as any change there would cause migration issues. As the number of sources should be arch-dependent, as it will depend on what kind of notifications will exist, and how many errors can be reported at the same time, change the logic to be more flexible, allowing the number of sources to be defined when building the HEST table by the caller. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <1698680848c11d6f26368426f1657e14faaf55c4.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: Use HEST table offsets when preparing GHES recordsMauro Carvalho Chehab2025-10-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two pointers that are needed during error injection: 1. The start address of the CPER block to be stored; 2. The address of the read ack. It is preferable to calculate them from the HEST table. This allows checking the source ID, the size of the table and the type of the HEST error block structures. Yet, keep the old code, as this is needed for migration purposes from older QEMU versions. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <d4344e8dbe66372e1e093d968eda2e8b0527ba48.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: add a firmware file with HEST addressMauro Carvalho Chehab2025-10-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | Store HEST table address at GPA, placing its the start of the table at hest_addr_le variable. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <c29aa5e6ab9b2d93dd5328481630c3b03da86261.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: prepare to change the way HEST offsets are calculatedMauro Carvalho Chehab2025-10-041-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new ags flag to change the way HEST offsets are calculated. Currently, offsets needed to store ACPI HEST offsets and read ack are calculated based on a previous knowledge from the logic which creates the HEST table. Such logic is not generic, not allowing to easily add more HEST entries nor replicates what OSPM does. As the next patches will be adding a more generic logic, add a new use_hest_addr, set to false, in preparation for such changes. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <f5de17bf04b27828e1a439ad396b4f7982eaf156.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi/ghes: Cleanup the code which gets ghes ged stateMauro Carvalho Chehab2025-10-041-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | Move the check logic into a common function and simplify the code which checks if GHES is enabled and was properly setup. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <2bbb1d3eb88b0a668114adef2f1c2a94deebba0e.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"Mauro Carvalho Chehab2025-10-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ghes_record_cper_errors() function was introduced to be used by other types of errors, as part of the error injection patch series. That's why it is not static. Make it non-static again to allow its usage outside ghes.c This reverts commit 611f3bdb20f7828b0813aa90d47d1275ef18329b. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <14f2a888cfbf922d5f2bf94d7612114f25107d59.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * net: implement UDP tunnel features offloadingPaolo Abeni2025-10-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When any host or guest GSO over UDP tunnel offload is enabled the virtio net header includes the additional tunnel-related fields, update the size accordingly. Push the GSO over UDP tunnel offloads all the way down to the tap device extending the newly introduced NetFeatures struct, and eventually enable the associated features. As per virtio specification, to convert features bit to offload bit, map the extended features into the reserved range. Finally, make the vhost backend aware of the exact header layout, to copy it correctly. The tunnel-related field are present if either the guest or the host negotiated any UDP tunnel related feature: add them to the kernel supported features list, to allow qemu transfer to the backend the needed information. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <093b4bc68368046bffbcab2202227632d6e4e83b.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * net: implement tunnel probingPaolo Abeni2025-10-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tap devices support GSO over UDP tunnel offload. Probe for such feature in a similar manner to other offloads. GSO over UDP tunnel needs to be enabled in addition to a "plain" offload (TSO or USO). No need to check separately for the outer header checksum offload: the kernel is going to support both of them or none. The new features are disabled by default to avoid compat issues, and could be enabled, after that hw_compat_10_1 will be added, together with the related compat entries. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <a987a8a7613cbf33bb2209c7c7f5889b512638a7.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * virtio-net: implement extended features supportPaolo Abeni2025-10-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the extended types and helpers to manipulate the virtio_net features. Note that offloads are still 64bits wide, as per specification, and extended offloads will be mapped into such range. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <bc5afdc5c1cb1a37238dd2b36004db3d46cbf211.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: implement extended features supportPaolo Abeni2025-10-041-3/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide extended version of the features manipulation helpers, and let the device initialization deal with the full features space, adjusting the relevant format strings accordingly. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <69c78c432e28e146a8874b2a7d00e9cbd111b1ba.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: add support for negotiating extended featuresPaolo Abeni2025-10-042-7/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to virtio infra, vhost core maintains the features status in the full extended format and allows the devices to implement extended version of the getter/setter. Note that 'protocol_features' are not extended: they are only used by vhost-user, and the latter device is not going to implement extended features soon. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <a0062c3b1847fb2baedd6cd8f6ef13b051d6beb2.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * virtio-pci: implement support for extended featuresPaolo Abeni2025-10-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the features configuration space to 128 bits. If the virtio device supports any extended features, allow the common read/write operation to access all of it, otherwise keep exposing only the lower 64 bits. On migration, save the 128 bit version of the features only if the upper bits are non zero. Relay on reset to clear all the feature space before load. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <c0b81601f65b41ca8310eba8f05e2dcf3702de89.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * virtio: add support for negotiating extended featuresPaolo Abeni2025-10-041-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The virtio specifications allows for a device features space up to 128 bits and more. Soon we are going to use some of the 'extended' bits features for the virtio net driver. Add support to allow extended features negotiation on a per devices basis. Devices willing to negotiated extended features need to implemented a new pair of features getter/setter, the core will conditionally use them instead of the basic one. Note that 'bad_features' don't need to be extended, as they are bound to the 64 bits limit. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <9bb29d70adc3f2b8c7756d4e3cd076cffee87826.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * virtio: introduce extended features typePaolo Abeni2025-10-042-3/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The virtio specifications allows for up to 128 bits for the device features. Soon we are going to use some of the 'extended' bits features (bit 64 and above) for the virtio net driver. Represent the virtio features bitmask with a fixed size array, and introduce a few helpers to help manipulate them. Most drivers will keep using only 64 bits features space: use union to allow them access the lower part of the extended space without any per driver change. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <6a9bbb5eb33830f20afbcb7e64d300af4126dd98.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * linux-headers: Update to Linux v6.17-rc1Paolo Abeni2025-10-047-3/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | Update headers to include the virtio GSO over UDP tunnel features Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <0b1f3c011f90583ab52aa4fef04df6db35cc4a69.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * net: bundle all offloads in a single structPaolo Abeni2025-10-041-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The set_offload() argument list is already pretty long and we are going to introduce soon a bunch of additional offloads. Replace the offload arguments with a single struct and update all the relevant call-sites. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <a9d4dd043b8c71b791e9ff05e17ef06072d9714e.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge tag 'staging-pull-request' of https://gitlab.com/peterx/qemu into stagingRichard Henderson2025-10-049-20/+77
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Migration/Memory Pull for 10.2 - PeterX's fix on tls warning for preempt channel when migratino completes - Arun's series to enhance error reporting for vTPM and migration framework - PeterX's patch to cleanup multifd send TLS BYE messages - Juraj's fix on postcopy start state transition when switchover failed - Yanfei's fix to migrate APIC before VFIO-PCI to avoid irq fallbacks - Dan's cleanup to simplify error reporting in qemu_fill_buffer() - PeterM's fix on address space leak when cpu hot plug / unplug - Steve's cpr-exec wholeset # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCaN/uIhIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wZ+mAEA1l2RS9sZS1W3vXQMCNb+Nu8Uo2p+e5Qj # Uu6J0WVV+XsBANtzGZk2UM/frqlABywW3/ozJ4qBvIPKo758Mr6/lqUH # =asUv # -----END PGP SIGNATURE----- # gpg: Signature made Fri 03 Oct 2025 08:39:14 AM PDT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown] # gpg: aka "Peter Xu <peterx@redhat.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'staging-pull-request' of https://gitlab.com/peterx/qemu: (45 commits) migration-test: test cpr-exec vfio: cpr-exec mode migration: cpr-exec docs migration: cpr-exec mode migration: cpr-exec save and load migration: cpr-exec-command parameter oslib: qemu_clear_cloexec migration: add cpr_walk_fd migration: multi-mode notifier migration: simplify error reporting after channel read physmem: Destroy all CPU AddressSpaces on unrealize memory: New AS helper to serialize destroy+free include/system/memory.h: Clarify address_space_destroy() behaviour migration: ensure APIC is loaded prior to VFIO PCI devices migration: Fix state transition in postcopy_start() error handling migration/multifd/tls: Cleanup BYE message processing on sender side migration: HMP: Adjust the order of output fields migration: Make migration_has_failed() work even for CANCELLING io/crypto: Move tls premature termination handling into QIO layer backends/tpm: Propagate vTPM error on migration failure ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
| * migration: cpr-exec modeSteve Sistare2025-10-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the cpr-exec migration mode. Usage: qemu-system-$arch -machine aux-ram-share=on ... migrate_set_parameter mode cpr-exec migrate_set_parameter cpr-exec-command \ <arg1> <arg2> ... -incoming <uri-1> \ migrate -d <uri-1> The migrate command stops the VM, saves state to uri-1, directly exec's a new version of QEMU on the same host, replacing the original process while retaining its PID, and loads state from uri-1. Guest RAM is preserved in place, albeit with new virtual addresses. The new QEMU process is started by exec'ing the command specified by the @cpr-exec-command parameter. The first word of the command is the binary, and the remaining words are its arguments. The command may be a direct invocation of new QEMU, or may be a non-QEMU command that exec's the new QEMU binary. This mode creates a second migration channel that is not visible to the user. At the start of migration, old QEMU saves CPR state to the second channel, and at the end of migration, it tells the main loop to call cpr_exec. New QEMU loads CPR state early, before objects are created. Because old QEMU terminates when new QEMU starts, one cannot stream data between the two, so uri-1 must be a type, such as a file, that accepts all data before old QEMU exits. Otherwise, old QEMU may quietly block writing to the channel. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine aux-ram-share=on' option, which allows anonymous memory to be transferred in place to the new process. The memfds are kept open across exec by clearing the close-on-exec flag, their values are saved in CPR state, and they are mmap'd in new QEMU. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/1759332851-370353-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: cpr-exec save and loadSteve Sistare2025-10-031-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | To preserve CPR state across exec, create a QEMUFile based on a memfd, and keep the memfd open across exec. Save the value of the memfd in an environment variable so post-exec QEMU can find it. These new functions are called in a subsequent patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1759332851-370353-6-git-send-email-steven.sistare@oracle.com [peterx: fix build for Windows] Signed-off-by: Peter Xu <peterx@redhat.com>
| * oslib: qemu_clear_cloexecSteve Sistare2025-10-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | Define qemu_clear_cloexec, analogous to qemu_set_cloexec. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1759332851-370353-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: add cpr_walk_fdSteve Sistare2025-10-031-0/+3
| | | | | | | | | | | | | | | | | | Add a helper to walk all CPR fd's and run a callback for each. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1759332851-370353-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: multi-mode notifierSteve Sistare2025-10-031-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | Allow a notifier to be added for multiple migration modes. To allow a notifier to appear on multiple per-node lists, use a generic list type. We can no longer use NotifierWithReturnList, because it shoe horns the notifier onto a single list. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1759332851-370353-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * physmem: Destroy all CPU AddressSpaces on unrealizePeter Maydell2025-10-032-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we unrealize a CPU object (which happens on vCPU hot-unplug), we should destroy all the AddressSpace objects we created via calls to cpu_address_space_init() when the CPU was realized. Commit 24bec42f3d6eae added a function to do this for a specific AddressSpace, but did not add any places where the function was called. Since we always want to destroy all the AddressSpaces on unrealize, regardless of the target architecture, we don't need to try to keep track of how many are still undestroyed, or make the target architecture code manually call a destroy function for each AS it created. Instead we can adjust the function to always completely destroy the whole cpu->ases array, and arrange for it to be called during CPU unrealize as part of the common code. Without this fix, AddressSanitizer will report a leak like this from a run where we hot-plugged and then hot-unplugged an x86 KVM vCPU: Direct leak of 416 byte(s) in 1 object(s) allocated from: #0 0x5b638565053d in calloc (/data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/qemu-system-x86_64+0x1ee153d) (BuildId: c1cd6022b195142106e1bffeca23498c2b752bca) #1 0x7c28083f77b1 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x637b1) (BuildId: 1eb6131419edb83b2178b682829a6913cf682d75) #2 0x5b6386999c7c in cpu_address_space_init /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../system/physmem.c:797:25 #3 0x5b638727f049 in kvm_cpu_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../target/i386/kvm/kvm-cpu.c:102:5 #4 0x5b6385745f40 in accel_cpu_common_realize /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../accel/accel-common.c:101:13 #5 0x5b638568fe3c in cpu_exec_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/cpu-common.c:232:10 #6 0x5b63874a2cd5 in x86_cpu_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../target/i386/cpu.c:9321:5 #7 0x5b6387a0469a in device_set_realized /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/qdev.c:494:13 #8 0x5b6387a27d9e in property_set_bool /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:2375:5 #9 0x5b6387a2090b in object_property_set /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:1450:5 #10 0x5b6387a35b05 in object_property_set_qobject /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/qom-qobject.c:28:10 #11 0x5b6387a21739 in object_property_set_bool /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:1520:15 #12 0x5b63879fe510 in qdev_realize /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/qdev.c:276:12 Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2517 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250929144228.1994037-4-peter.maydell@linaro.org Signed-off-by: Peter Xu <peterx@redhat.com>
| * memory: New AS helper to serialize destroy+freePeter Xu2025-10-031-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an AddressSpace has been created in its own allocated memory, cleaning it up requires first destroying the AS and then freeing the memory. Doing this doesn't work: address_space_destroy(as); g_free_rcu(as, rcu); because both address_space_destroy() and g_free_rcu() try to use the same 'rcu' node in the AddressSpace struct and the address_space_destroy hook gets overwritten. Provide a new address_space_destroy_free() function which will destroy the AS and then free the memory it uses, all in one RCU callback. (CC to stable because the next commit needs this function.) Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250929144228.1994037-3-peter.maydell@linaro.org Signed-off-by: Peter Xu <peterx@redhat.com>
| * include/system/memory.h: Clarify address_space_destroy() behaviourPeter Maydell2025-10-031-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | address_space_destroy() doesn't actually immediately destroy the AS; it queues it to be destroyed via RCU. This means you can't g_free() the memory the AS struct is in until that has happened. Clarify this in the documentation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250929144228.1994037-2-peter.maydell@linaro.org Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: ensure APIC is loaded prior to VFIO PCI devicesYanfei Xu2025-10-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The load procedure of VFIO PCI devices involves setting up IRT for each VFIO PCI devices. This requires determining whether an interrupt is single-destination interrupt to decide between Posted Interrupt(PI) or remapping mode for the IRTE. However, determining this may require accessing the VM's APIC registers. For example: ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs) ... kvm_arch_irq_bypass_add_producer kvm_x86_call(pi_update_irte) vmx_pi_update_irte kvm_intr_is_single_vcpu If the LAPIC has not been loaded yet, interrupts will use remapping mode. To prevent the fallback of interrupt mode, keep APIC is always loaded prior to VFIO PCI devices. Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com> Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20250818131127.1021648-1-yanfei.xu@bytedance.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * io/crypto: Move tls premature termination handling into QIO layerPeter Xu2025-10-031-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | QCryptoTLSSession allows TLS premature termination in two cases, one of the case is when the channel shutdown() is invoked on READ side. It's possible the shutdown() happened after the read thread blocked at gnutls_record_recv(). In this case, we should allow the premature termination to happen. The problem is by the time qcrypto_tls_session_read() was invoked, tioc->shutdown may not have been set, so this may instead be treated as an error if there is concurrent shutdown() calls. To allow the flag to reflect the latest status of tioc->shutdown, move the check upper into the QIOChannel level, so as to read the flag only after QEMU gets an GNUTLS_E_PREMATURE_TERMINATION. When at it, introduce qio_channel_tls_allow_premature_termination() helper to make the condition checks easier to read. When doing so, change the qatomic_load_acquire() to qatomic_read(): here we don't need any ordering of memory accesses, but reading a flag. qatomic_read() would suffice because it guarantees fetching from memory. Nothing else we should need to order on memory access. This patch will fix a qemu qtest warning when running the preempt tls test, reporting premature termination: QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --full -r /x86_64/migration/postcopy/preempt/tls/psk ... qemu-kvm: Cannot read from TLS channel: The TLS connection was non-properly terminated. ... In this specific case, the error was set by postcopy_preempt_thread, which normally will be concurrently shutdown()ed by the main thread. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250918203937.200833-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: Add error-parameterized function variants in VMSD structArun Menon2025-10-031-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - We need to have good error reporting in the callbacks in VMStateDescription struct. Specifically pre_save, pre_load and post_load callbacks. - It is not possible to change these functions everywhere in one patch, therefore, we introduce a duplicate set of callbacks with Error object passed to them. - So, in this commit, we implement 'errp' variants of these callbacks, introducing an explicit Error object parameter. - This is a functional step towards transitioning the entire codebase to the new error-parameterized functions. - Deliberately called in mutual exclusion from their counterparts, to prevent conflicts during the transition. - New impls should preferentally use 'errp' variants of these methods, and existing impls incrementally converted. The variants without 'errp' are intended to be removed once all usage is converted. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Arun Menon <armenon@redhat.com> Tested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-26-36f11a6fb9d3@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>