summary refs log tree commit diff stats
path: root/include/hw/vfio (follow)
Commit message (Collapse)AuthorAgeFilesLines
* hw/vfio: Use uint64_t for IOVA mapping size in vfio_container_dma_*mapPhilippe Mathieu-Daudé2025-10-022-6/+6
| | | | | | | | | | | | | | | | The 'ram_addr_t' type is described as: a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces. This doesn't represent well an IOVA mapping size. Simply use the uint64_t type. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250930123528.42878-5-philmd@linaro.org Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio: Avoid ram_addr_t in vfio_container_query_dirty_bitmap()Philippe Mathieu-Daudé2025-10-021-1/+2
| | | | | | | | | | | | | | | | | | The 'ram_addr_t' type is described as: a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces. vfio_container_query_dirty_bitmap() doesn't expect such QEMU intermediate address, but a guest physical addresses. Use the appropriate 'hwaddr' type, rename as @translated_addr for clarity. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250930123528.42878-4-philmd@linaro.org Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-device.h: fix include header guard nameMark Cave-Ayland2025-09-251-3/+3
| | | | | | | | | | The header guard was incorrectly called HW_VFIO_VFIO_COMMON_H instead of HW_VFIO_VFIO_DEVICE_H. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-29-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-container-base.h: rename file to vfio-container.hMark Cave-Ayland2025-09-253-5/+5
| | | | | | | | | | | | With the rename of VFIOContainerBase to VFIOContainer, the vfio-container-base.h header file containing the struct definition is misleading. Rename it from vfio-container-base.h to vfio-container.h accordingly, fixing up the name of the include guard at the same time. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-5-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-container.h: rename file to vfio-container-legacy.hMark Cave-Ayland2025-09-251-3/+3
| | | | | | | | | | | | With the rename of VFIOContainer to VFIOLegacyContainer, the vfio-container.h header file containing the struct definition is misleading. Rename it from vfio-container.h to vfio-container-legacy.h accordingly, fixing up the name of the include guard at the same time. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-4-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-container-base.h: rename VFIOContainerBase to VFIOContainerMark Cave-Ayland2025-09-254-45/+45
| | | | | | | | | | | Now that the VFIOContainer struct name is available, rename VFIOContainerBase to VFIOContainer to better indicate that it is the superclass of other VFIOFooContainer structs. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-3-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-container.h: rename VFIOContainer to VFIOLegacyContainerMark Cave-Ayland2025-09-252-8/+9
| | | | | | | | | | | | The VFIOContainer struct represents the legacy VFIO container even though the name suggests it may be the common superclass of all VFIO containers. Rename it to VFIOLegacyContainer to make this clearer, which is also a better match for its VFIO_IOMMU_LEGACY QOM type name. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-2-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/vfio-container.h: rename VFIOContainer bcontainer field to parent_objMark Cave-Ayland2025-09-081-1/+1
| | | | | | | | | | | Now that nothing accesses the bcontainer field directly, rename bcontainer to parent_obj as per our current coding guidelines. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250715093110.107317-8-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/vfio-container.h: update VFIOContainer declarationMark Cave-Ayland2025-09-081-2/+3
| | | | | | | | | | | | Update the VFIOContainer declaration so that it is closer to our coding guidelines: emove the explicit typedef (this is already handled by the OBJECT_DECLARE_TYPE() macro) and add a blank line after the parent object. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250715093110.107317-3-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/vfio-container-base.h: update VFIOContainerBase declarationMark Cave-Ayland2025-09-081-6/+7
| | | | | | | | | | | | | Update the VFIOContainerBase declaration to match our current coding guidelines: remove the explicit typedef (this is already handled by the OBJECT_DECLARE_TYPE() macro), add a blank line after the parent object, rename parent to parent_obj, and move the macro declaration next to the VFIOContainerBase struct declaration. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250715093110.107317-2-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Move vfio-region.h under hw/vfio/Cédric Le Goater2025-09-081-48/+0
| | | | | | | | | | Since the removal of vfio-platform, header file vfio-region.h no longer needs to be a public VFIO interface. Move it under hw/vfio. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250901064631.530723-9-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Remove 'vfio-platform'Cédric Le Goater2025-09-082-79/+1
| | | | | | | | | | | | | | The VFIO_PLATFORM device type has been deprecated in the QEMU 10.0 timeframe. All dependent devices have been removed. Now remove the core vfio platform framework. Rename VFIO_DEVICE_TYPE_PLATFORM enum to VFIO_DEVICE_TYPE_UNUSED to maintain the same index for the CCW and AP VFIO device types. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250901064631.530723-8-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Remove 'vfio-calxeda-xgmac' deviceCédric Le Goater2025-09-081-43/+0
| | | | | | | | | | The VFIO_XGMAC device type has been deprecated in the QEMU 10.0 timeframe. Remove it. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250901064631.530723-7-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Remove 'vfio-amd-xgbe' deviceCédric Le Goater2025-09-081-46/+0
| | | | | | | | | | The VFIO_AMD_XGBE device type has been deprecated in the QEMU 10.0 timeframe. The AMD "Seattle" device is not supported anymore. Remove it. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250901064631.530723-6-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: preserve pending interruptsSteve Sistare2025-08-091-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | cpr-transfer may lose a VFIO interrupt because the KVM instance is destroyed and recreated. If an interrupt arrives in the middle, it is dropped. To fix, stop pending new interrupts during cpr save, and pick up the pieces. In more detail: Stop the VCPUs. Call kvm_irqchip_remove_irqfd_notifier_gsi --> KVM_IRQFD to deassign the irqfd gsi that routes interrupts directly to the VCPU and KVM. After this call, interrupts fall back to the kernel vfio_msihandler, which writes to QEMU's kvm_interrupt eventfd. CPR already preserves that eventfd. When the route is re-established in new QEMU, the kernel tests the eventfd and injects an interrupt to KVM if necessary. Deassign INTx in a similar manner. For both MSI and INTx, remove the eventfd handler so old QEMU does not consume an event. If an interrupt was already pended to KVM prior to the completion of kvm_irqchip_remove_irqfd_notifier_gsi, it will be recovered by the subsequent call to cpu_synchronize_all_states, which pulls KVM interrupt state to userland prior to saving it in vmstate. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1752689169-233452-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Max in-flight VFIO device state buffers size limitMaciej S. Szmigiero2025-07-151-0/+1
| | | | | | | | | | | | | | | | | Allow capping the maximum total size of in-flight VFIO device state buffers queued at the destination, otherwise a malicious QEMU source could theoretically cause the target QEMU to allocate unlimited amounts of memory for buffers-in-flight. Since this is not expected to be a realistic threat in most of VFIO live migration use cases and the right value depends on the particular setup disable this limit by default by setting it to UINT64_MAX. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/4f7cad490988288f58e36b162d7a888ed7e7fd17.1752589295.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Add x-migration-load-config-after-iter VFIO propertyMaciej S. Szmigiero2025-07-151-0/+1
| | | | | | | | | | | | | | | | This property allows configuring whether to start the config load only after all iterables were loaded, during non-iterables loading phase. Such interlocking is required for ARM64 due to this platform VFIO dependency on interrupt controller being loaded first. The property defaults to AUTO, which means ON for ARM, OFF for other platforms. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/0e03c60dbc91f9a9ba2516929574df605b7dfcb4.1752589295.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: delete old cpr registerSteve Sistare2025-07-031-4/+0
| | | | | | | | | | | vfio_cpr_[un]register_container is no longer used since they were subsumed by container type-specific registration. Delete them. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-21-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: cpr stateSteve Sistare2025-07-031-0/+3
| | | | | | | | | | | | VFIO iommufd devices will need access to ioas_id, devid, and hwpt_id in new QEMU at realize time, so add them to CPR state. Define CprVFIODevice as the object which holds the state and is serialized to the vmstate file. Define accessors to copy state between VFIODevice and CprVFIODevice. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* migration: vfio cpr state hookSteve Sistare2025-07-031-0/+1
| | | | | | | | | | | | | | | Define a list of vfio devices in CPR state, in a subsection so that older QEMU can be live updated to this version. However, new QEMU will not be live updateable to old QEMU. This is acceptable because CPR is not yet commonly used, and updates to older versions are unusual. The contents of each device object will be defined by the vfio subsystem in a subsequent patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-14-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: register container for cprSteve Sistare2025-07-031-0/+12
| | | | | | | | | | | | | | Register a vfio iommufd container and device for CPR, replacing the generic CPR register call with a more specific iommufd register call. Add a blocker if the kernel does not support IOMMU_IOAS_CHANGE_PROCESS. This is mostly boiler plate. The fields to to saved and restored are added in subsequent patches. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-13-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: device name blockerSteve Sistare2025-07-031-0/+1
| | | | | | | | | If an invariant device name cannot be created, block CPR. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-12-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: add vfio_device_free_nameSteve Sistare2025-07-031-0/+1
| | | | | | | | | | | | Define vfio_device_free_name to free the name created by vfio_device_get_name. A subsequent patch will do more there. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-11-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: use IOMMU_IOAS_MAP_FILESteve Sistare2025-07-031-0/+15
| | | | | | | | | | | | Use IOMMU_IOAS_MAP_FILE when the mapped region is backed by a file. Such a mapping can be preserved without modification during CPR, because it depends on the file's address space, which does not change, rather than on the process's address space, which does change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* migration: close kvm after cprSteve Sistare2025-07-032-0/+4
| | | | | | | | | | | | | | cpr-transfer breaks vfio network connectivity to and from the guest, and the host system log shows: irq bypass consumer (token 00000000a03c32e5) registration fails: -16 which is EBUSY. This occurs because KVM descriptors are still open in the old QEMU process. Close them. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-pci: preserve MSISteve Sistare2025-07-031-0/+8
| | | | | | | | | | | | | Save the MSI message area as part of vfio-pci vmstate, and preserve the interrupt and notifier eventfd's. migrate_incoming loads the MSI data, then the vfio-pci post_load handler finds the eventfds in CPR state, rebuilds vector data structures, and attaches the interrupts to the new KVM instance. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: Fix vfio_container_post_load()Zhenzhong Duan2025-07-031-3/+4
| | | | | | | | | | | | | | | When there are multiple VFIO containers, vioc->dma_map is restored multiple times, this made only first container work and remaining containers using vioc->dma_map restored by first container. Fix it by save and restore vioc->dma_map locally. saved_dma_map in VFIOContainerCPR becomes useless and is removed. Fixes: 7e9f21411302 ("vfio/container: restore DMA vaddr") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250627063332.5173-3-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-user: connect vfio proxy to remote serverJohn Levon2025-06-261-0/+2
| | | | | | | | | | | | | | | | | Introduce the vfio-user "proxy": this is the client code responsible for sending and receiving vfio-user messages across the control socket. The new files hw/vfio-user/proxy.[ch] contain some basic plumbing for managing the proxy; initialize the proxy during realization of the VFIOUserPCIDevice instance. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-3-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-user: add vfio-user class and containerJohn Levon2025-06-261-0/+1
| | | | | | | | | | | | | | | | | | | Introduce basic plumbing for vfio-user with CONFIG_VFIO_USER. We introduce VFIOUserContainer in hw/vfio-user/container.c, which is a container type for the "IOMMU" type "vfio-iommu-user", and share some common container code from hw/vfio/container.c. Add hw/vfio-user/pci.c for instantiating VFIOUserPCIDevice objects, sharing some common code from hw/vfio/pci.c. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add documentation for posted write argumentJohn Levon2025-06-261-0/+1
| | | | | | | Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250616101314.3189793-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add vfio_device_get_region_fd()John Levon2025-06-261-0/+12
| | | | | | | | | | This keeps the existence of ->region_fds private to hw/vfio/device.c. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250616101337.3190027-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: improve VFIODeviceIOOps docsJohn Levon2025-06-111-9/+43
| | | | | | | | | Explicitly describe every parameter rather than summarizing. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250611104753.1199796-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-pci: skip reset during cprSteve Sistare2025-06-111-0/+2
| | | | | | | | | | | Do not reset a vfio-pci device during CPR, and do not complain if the kernel's PCI config space changes for non-emulated bits between the vmstate save and load, which can happen due to ongoing interrupt activity. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-12-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: recover from unmap-all-vaddr failureSteve Sistare2025-06-112-0/+13
| | | | | | | | | | | | | | If there are multiple containers and unmap-all fails for some container, we need to remap vaddr for the other containers for which unmap-all succeeded. Recover by walking all address ranges of all containers to restore the vaddr for each. Do so by invoking the vfio listener callback, and passing a new "remap" flag that tells it to restore a mapping without re-allocating new userland data structures. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: mdev cpr blockerSteve Sistare2025-06-112-0/+5
| | | | | | | | | | | | | During CPR, after VFIO_DMA_UNMAP_FLAG_VADDR, the vaddr is temporarily invalid, so mediated devices cannot be supported. Add a blocker for them. This restriction will not apply to iommufd containers when CPR is added for them in a future patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-8-git-send-email-steven.sistare@oracle.com [ clg: Fixed context change in VFIODevice ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: restore DMA vaddrSteve Sistare2025-06-111-0/+3
| | | | | | | | | | | | | | | | | | In new QEMU, do not register the memory listener at device creation time. Register it later, in the container post_load handler, after all vmstate that may affect regions and mapping boundaries has been loaded. The post_load registration will cause the listener to invoke its callback on each flat section, and the calls will match the mappings remembered by the kernel. The listener calls a special dma_map handler that passes the new VA of each section to the kernel using VFIO_DMA_MAP_FLAG_VADDR. Restore the normal handler at the end. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: preserve descriptorsSteve Sistare2025-06-111-0/+6
| | | | | | | | | | | | | | | | | At vfio creation time, save the value of vfio container, group, and device descriptors in CPR state. On qemu restart, vfio_realize() finds and uses the saved descriptors. During reuse, device and iommu state is already configured, so operations in vfio_realize that would modify the configuration, such as vfio ioctl's, are skipped. The result is that vfio_realize constructs qemu data structures that reflect the current state of the device. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: register container for cprSteve Sistare2025-06-112-0/+17
| | | | | | | | | | | | | | Register a legacy container for cpr-transfer, replacing the generic CPR register call with a more specific legacy container register call. Add a blocker if the kernel does not support VFIO_UPDATE_VADDR or VFIO_UNMAP_ALL. This is mostly boiler plate. The fields to to saved and restored are added in subsequent patches. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: mark posted writes in region write callbacksJohn Levon2025-06-112-2/+3
| | | | | | | | | | For vfio-user, the region write implementation needs to know if the write is posted; add the necessary plumbing to support this. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-5-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add per-region fd supportJohn Levon2025-06-111-2/+5
| | | | | | | | | | | For vfio-user, each region has its own fd rather than sharing vbasedev's. Add the necessary plumbing to support this, and use the correct fd in vfio_region_mmap(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-4-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: move vfio-cpr.hSteve Sistare2025-06-051-0/+18
| | | | | | | | | | | Move vfio-cpr.h to include/hw/vfio, because it will need to be included by other files there. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1748546679-154091-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: vfio_find_ram_discard_listenerSteve Sistare2025-06-051-0/+3
| | | | | | | | | | | Define vfio_find_ram_discard_listener as a subroutine so additional calls to it may be added in a subsequent patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1748546679-154091-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: pass MemoryRegion to DMA operationsJohn Levon2025-06-051-4/+5
| | | | | | | | | | | | | | | | Pass through the MemoryRegion to DMA operation handlers of vfio containers. The vfio-user container will need this later, to translate the vaddr into an offset for the dma map vfio-user message; CPR will also will need this. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250521215534.2688540-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add more VFIOIOMMUClass docsJohn Levon2025-06-051-3/+72
| | | | | | | | | Add some additional doc comments for these class methods. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250520162530.2194548-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: pass listener_begin/commit callbacksJohn Levon2025-05-091-0/+2
| | | | | | | | | | | The vfio-user container will later need to hook into these callbacks; set up vfio to use them, and optionally pass them through to the container. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-15-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add read/write to device IO ops vectorJohn Levon2025-05-091-0/+18
| | | | | | | | | Now we have the region info cache, add ->region_read/write device I/O operations instead of explicit pread()/pwrite() system calls. Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-13-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add region info cacheJohn Levon2025-05-091-0/+1
| | | | | | | | | | | | | | | | | | | | Instead of requesting region information on demand with VFIO_DEVICE_GET_REGION_INFO, maintain a cache: this will become necessary for performance for vfio-user, where this call becomes a message over the control socket, so is of higher overhead than the traditional path. We will also need it to generalize region accesses, as that means we can't use ->config_offset for configuration space accesses, but must look up the region offset (if relevant) each time. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-12-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add device IO ops vectorJohn Levon2025-05-091-0/+38
| | | | | | | | | | | | | | | | For vfio-user, device operations such as IRQ handling and region read/writes are implemented in userspace over the control socket, not ioctl() to the vfio kernel driver; add an ops vector to generalize this, and implement vfio_device_io_ops_ioctl for interacting with the kernel vfio driver. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-11-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add unmap_all flag to DMA unmap callbackJohn Levon2025-05-091-2/+13
| | | | | | | | | We'll use this parameter shortly; this just adds the plumbing. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-9-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add strread/writeerror()John Levon2025-05-091-0/+14
| | | | | | | | | | Add simple helpers to correctly report failures from read/write routines using the return -errno style. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-7-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>