path: root/hw/vfio (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* vfio/pci: pass vector to virq functionsSteve Sistare2025-06-111-6/+7
| | | | | | | | | | | Pass the vector number to vfio_connect_kvm_msi_virq and vfio_remove_kvm_msi_virq, so it can be passed to their subroutines in a subsequent patch. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: vfio_notifier_initSteve Sistare2025-06-111-15/+25
| | | | | | | | | | | Move event_notifier_init calls to a helper vfio_notifier_init. This version is trivial, but it will be expanded to support CPR in subsequent patches. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-14-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: vfio_pci_vector_initSteve Sistare2025-06-111-7/+17
| | | | | | | | | Extract a subroutine vfio_pci_vector_init. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-13-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-pci: skip reset during cprSteve Sistare2025-06-112-0/+38
| | | | | | | | | | | Do not reset a vfio-pci device during CPR, and do not complain if the kernel's PCI config space changes for non-emulated bits between the vmstate save and load, which can happen due to ongoing interrupt activity. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-12-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* pci: skip reset during cprSteve Sistare2025-06-111-0/+7
| | | | | | | | | Do not reset a vfio-pci device during CPR. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749576403-25355-1-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: recover from unmap-all-vaddr failureSteve Sistare2025-06-112-1/+109
| | | | | | | | | | | | | | If there are multiple containers and unmap-all fails for some container, we need to remap vaddr for the other containers for which unmap-all succeeded. Recover by walking all address ranges of all containers to restore the vaddr for each. Do so by invoking the vfio listener callback, and passing a new "remap" flag that tells it to restore a mapping without re-allocating new userland data structures. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
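The recovery described in this commit amounts to a rollback loop. A minimal sketch follows, assuming hypothetical stand-ins (`ToyContainer`, `toy_unmap_all_vaddr()`, `toy_remap_vaddr()`) rather than QEMU's real container API. As a simplification, it restores only the containers whose unmap-all already succeeded, whereas the commit walks all address ranges through the vfio listener callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VFIO container; not QEMU's real type. */
typedef struct ToyContainer {
    bool vaddr_valid;   /* true while the kernel holds a usable vaddr */
    bool fail_unmap;    /* knob for the example: force unmap-all to fail */
} ToyContainer;

/* Models invalidating the vaddr of every mapping in one container. */
static int toy_unmap_all_vaddr(ToyContainer *c)
{
    if (c->fail_unmap) {
        return -1;
    }
    c->vaddr_valid = false;
    return 0;
}

/* Models restoring the vaddr for a container that lost it. */
static void toy_remap_vaddr(ToyContainer *c)
{
    c->vaddr_valid = true;
}

/*
 * Invalidate the vaddr in every container. If one container fails,
 * restore the containers that already succeeded, so that all of them
 * end up usable again, and report the failure.
 */
static int toy_unmap_all_containers(ToyContainer *cs, size_t n)
{
    assert(cs != NULL);

    for (size_t i = 0; i < n; i++) {
        if (toy_unmap_all_vaddr(&cs[i]) < 0) {
            for (size_t j = 0; j < i; j++) {
                toy_remap_vaddr(&cs[j]);
            }
            return -1;
        }
    }
    return 0;
}
```

The point of the design is that a partial failure leaves no container in the "vaddr invalid" state, so the old QEMU can keep running.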
* vfio/container: mdev cpr blockerSteve Sistare2025-06-111-0/+8
| | | | | | | | | | | | | During CPR, after VFIO_DMA_UNMAP_FLAG_VADDR, the vaddr is temporarily invalid, so mediated devices cannot be supported. Add a blocker for them. This restriction will not apply to iommufd containers when CPR is added for them in a future patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-8-git-send-email-steven.sistare@oracle.com [ clg: Fixed context change in VFIODevice ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: restore DMA vaddrSteve Sistare2025-06-112-2/+70
| | | | | | | | | | | | | | | | | | In new QEMU, do not register the memory listener at device creation time. Register it later, in the container post_load handler, after all vmstate that may affect regions and mapping boundaries has been loaded. The post_load registration will cause the listener to invoke its callback on each flat section, and the calls will match the mappings remembered by the kernel. The listener calls a special dma_map handler that passes the new VA of each section to the kernel using VFIO_DMA_MAP_FLAG_VADDR. Restore the normal handler at the end. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
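The vaddr hand-off between old and new QEMU can be modeled on a single mapping. This is a toy model under stated assumptions: `ToyDmaMapping`, `toy_discard_vaddr()` and `toy_update_vaddr()` are hypothetical names, and no real ioctl is issued. It only illustrates that the IOVA range stays mapped while the vaddr is replaced, and that the update must name the same range the kernel remembers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one kernel DMA mapping; not the kernel's struct. */
typedef struct ToyDmaMapping {
    uint64_t iova;
    uint64_t size;
    uint64_t vaddr;
    bool vaddr_valid;
} ToyDmaMapping;

/* Old QEMU: drop the vaddr, but keep the IOVA range mapped. */
static void toy_discard_vaddr(ToyDmaMapping *m)
{
    assert(m != NULL);
    m->vaddr_valid = false;
}

/*
 * New QEMU: supply a new vaddr for an existing IOVA range.
 * Returns 0 on success, or -1 if the range does not match the
 * mapping that is already remembered.
 */
static int toy_update_vaddr(ToyDmaMapping *m, uint64_t iova, uint64_t size,
                            uint64_t new_vaddr)
{
    assert(m != NULL);
    if (m->iova != iova || m->size != size) {
        return -1;
    }
    m->vaddr = new_vaddr;
    m->vaddr_valid = true;
    return 0;
}
```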
* vfio/container: discard old DMA vaddrSteve Sistare2025-06-111-0/+29
    In the container pre_save handler, discard the virtual addresses in DMA
    mappings with VFIO_DMA_UNMAP_FLAG_VADDR, because guest RAM will be
    remapped at a different VA in new QEMU. DMA to already-mapped pages
    continues.

    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Link: https://lore.kernel.org/qemu-devel/1749569991-25171-6-git-send-email-steven.sistare@oracle.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: preserve descriptorsSteve Sistare2025-06-112-15/+94
    At vfio creation time, save the value of vfio container, group, and
    device descriptors in CPR state. On QEMU restart, vfio_realize() finds
    and uses the saved descriptors.

    During reuse, device and iommu state is already configured, so
    operations in vfio_realize that would modify the configuration, such as
    vfio ioctls, are skipped. The result is that vfio_realize constructs
    QEMU data structures that reflect the current state of the device.

    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Link: https://lore.kernel.org/qemu-devel/1749569991-25171-5-git-send-email-steven.sistare@oracle.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: register container for cprSteve Sistare2025-06-114-7/+74
    Register a legacy container for cpr-transfer, replacing the generic CPR
    register call with a more specific legacy container register call. Add
    a blocker if the kernel does not support VFIO_UPDATE_VADDR or
    VFIO_UNMAP_ALL.

    This is mostly boilerplate. The fields to be saved and restored are
    added in subsequent patches.

    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Link: https://lore.kernel.org/qemu-devel/1749569991-25171-4-git-send-email-steven.sistare@oracle.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: mark posted writes in region write callbacksJohn Levon2025-06-113-3/+8
| | | | | | | | | | For vfio-user, the region write implementation needs to know if the write is posted; add the necessary plumbing to support this. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-5-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add per-region fd supportJohn Levon2025-06-112-6/+32
| | | | | | | | | | | For vfio-user, each region has its own fd rather than sharing vbasedev's. Add the necessary plumbing to support this, and use the correct fd in vfio_region_mmap(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-4-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: export PCI helpers needed for vfio-userJohn Levon2025-06-113-27/+38
| | | | | | | | | | | | The vfio-user code will need to re-use various parts of the vfio PCI code. Export them in hw/vfio/pci.h, and rename them to the vfio_pci_* namespace. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio/ap: Storing event information for an AP configuration change eventRorie Reyes2025-06-111-0/+40
| | | | | | | | | | | These functions can be invoked by the function that handles interception of the CHSC SEI instruction for requests indicating the accessibility of one or more adjunct processors has changed. Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com> Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-4-rreyes@linux.ibm.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio/ap: store object indicating AP config changed in a queueRorie Reyes2025-06-111-0/+17
| | | | | | | | | | | | Creates an object indicating that an AP configuration change event has been received and stores it in a queue. These objects will later be used to store event information for an AP configuration change when the CHSC instruction is intercepted. Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com> Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-3-rreyes@linux.ibm.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio/ap: notification handler for AP config changed eventRorie Reyes2025-06-111-0/+31
    Register an event notifier handler to process AP configuration change
    events by queuing the event and generating a CRW to let the guest know
    its AP configuration has changed.

    Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com>
    Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
    Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-2-rreyes@linux.ibm.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Fix instance_size of VFIO_PCI_BASEZhenzhong Duan2025-06-111-2/+1
    Currently the final instance_size of VFIO_PCI_BASE is
    sizeof(PCIDevice). It should be sizeof(VFIOPCIDevice). VFIO_PCI uses
    the same structure as its base class VFIO_PCI_BASE, so there is no need
    to set its instance_size explicitly.

    This isn't catastrophic only because VFIO_PCI_BASE is an abstract
    class.

    Fixes: d4e392d0a99b ("vfio: add vfio-pci-base class")
    Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Reviewed-by: John Levon <john.levon@nutanix.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Yi Liu <yi.l.liu@intel.com>
    Link: https://lore.kernel.org/qemu-devel/20250611024228.423666-1-zhenzhong.duan@intel.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
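Why the base class must carry the size of the full device structure can be shown with a toy type registry. This sketch does not use QOM itself: `ToyTypeInfo`, `toy_instance_size()` and the `Toy*Device` structs are hypothetical, and the rule that a zero `instance_size` inherits the parent's value is an assumption modeled on how the commit message describes the subclass behaving.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device structs: the derived one embeds its parent first. */
typedef struct ToyPCIDevice {
    int config[4];
} ToyPCIDevice;

typedef struct ToyVFIOPCIDevice {
    ToyPCIDevice parent;
    int vfio_state[8];
} ToyVFIOPCIDevice;

/* Hypothetical type descriptor; 0 means "inherit the parent's size". */
typedef struct ToyTypeInfo {
    const char *name;
    const struct ToyTypeInfo *parent;
    size_t instance_size;
} ToyTypeInfo;

/* Resolve the effective allocation size by walking up the parents. */
static size_t toy_instance_size(const ToyTypeInfo *ti)
{
    assert(ti != NULL);
    while (ti->instance_size == 0 && ti->parent != NULL) {
        ti = ti->parent;
    }
    return ti->instance_size;
}

static const ToyTypeInfo toy_pci = {
    .name = "toy-pci",
    .parent = NULL,
    .instance_size = sizeof(ToyPCIDevice),
};

/* The base class declares the full struct size once... */
static const ToyTypeInfo toy_vfio_pci_base = {
    .name = "toy-vfio-pci-base",
    .parent = &toy_pci,
    .instance_size = sizeof(ToyVFIOPCIDevice),
};

/* ...so the concrete subclass can leave it unset and inherit it. */
static const ToyTypeInfo toy_vfio_pci = {
    .name = "toy-vfio-pci",
    .parent = &toy_vfio_pci_base,
    .instance_size = 0,
};
```

If the base class declared only `sizeof(ToyPCIDevice)`, a subclass that inherited the size would be allocated too small for its `vfio_state` fields.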
* vfio/container: Fix vfio_listener_commit()Zhenzhong Duan2025-06-111-1/+1
| | | | | | | | | | | It's wrong to call into listener_begin callback in vfio_listener_commit(). Currently this impacts vfio-user. Fixes: d9b7d8b6993b ("vfio/container: pass listener_begin/commit callbacks") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250609115433.401775-1-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
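The class of bug fixed here is easy to picture with a pair of optional callbacks. The sketch below is an illustration only: `ToyContainerOps`, `toy_listener_begin()` and `toy_listener_commit()` are hypothetical names, not the QEMU functions, and the counting callbacks exist just to make the dispatch observable.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyContainer ToyContainer;

/* Hypothetical ops table; both callbacks are optional. */
typedef struct ToyContainerOps {
    void (*listener_begin)(ToyContainer *c);
    void (*listener_commit)(ToyContainer *c);
} ToyContainerOps;

struct ToyContainer {
    const ToyContainerOps *ops;
    int begins;     /* how many times the begin callback ran */
    int commits;    /* how many times the commit callback ran */
};

/* Example callbacks that only count their invocations. */
static void toy_count_begin(ToyContainer *c)
{
    c->begins++;
}

static void toy_count_commit(ToyContainer *c)
{
    c->commits++;
}

static void toy_listener_begin(ToyContainer *c)
{
    assert(c != NULL && c->ops != NULL);
    if (c->ops->listener_begin) {
        c->ops->listener_begin(c);
    }
}

/* The commit path must test and call the commit callback, not begin. */
static void toy_listener_commit(ToyContainer *c)
{
    assert(c != NULL && c->ops != NULL);
    if (c->ops->listener_commit) {
        c->ops->listener_commit(c);
    }
}
```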
* vfio: move vfio-cpr.hSteve Sistare2025-06-054-18/+3
| | | | | | | | | | | Move vfio-cpr.h to include/hw/vfio, because it will need to be included by other files there. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1748546679-154091-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: vfio_find_ram_discard_listenerSteve Sistare2025-06-051-13/+22
| | | | | | | | | | | Define vfio_find_ram_discard_listener as a subroutine so additional calls to it may be added in a subsequent patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1748546679-154091-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Save vendor specific device infoZhenzhong Duan2025-06-051-5/+3
    Some device information returned by ioctl(IOMMU_GET_HW_INFO) is vendor
    specific. Save it as raw data in a union supporting different vendors,
    so that a vendor IOMMU can query the raw data, with its fixed format,
    for capabilities directly.

    Because IOMMU_GET_HW_INFO is only supported on Linux, declare those
    capability-related structures under CONFIG_LINUX.

    Suggested-by: Eric Auger <eric.auger@redhat.com>
    Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
    Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
    Reviewed-by: Eric Auger <eric.auger@redhat.com>
    Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-5-zhenzhong.duan@intel.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Implement [at|de]tach_hwpt handlersZhenzhong Duan2025-06-051-0/+22
| | | | | | | | | | | | | Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU utilizes them to attach to or detach from hwpt on host side. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-4-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFDZhenzhong Duan2025-06-051-0/+6
    Enhance the HostIOMMUDeviceIOMMUFD object with 3 new members specific
    to the iommufd BE, plus 2 new class functions.

    The new members are the IOMMUFD handle, devid and hwpt_id. The IOMMUFD
    handle and devid are used to allocate/free ioas and hwpt. hwpt_id is
    used to re-attach an IOMMUFD-backed device to its default hwpt created
    by the VFIO sub-system, i.e., when vIOMMU is disabled by the guest.
    These properties are initialized in hiod::realize() after attachment.

    The 2 new class functions are [at|de]tach_hwpt(). They are used to
    attach/detach hwpt. VFIO and VDPA can have different implementations,
    so the implementation will be in a sub-class instead of
    HostIOMMUDeviceIOMMUFD, e.g., in HostIOMMUDeviceIOMMUFDVFIO.

    Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap
    the two functions.

    Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Eric Auger <eric.auger@redhat.com>
    Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-3-zhenzhong.duan@intel.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: pass MemoryRegion to DMA operationsJohn Levon2025-06-054-7/+9
    Pass through the MemoryRegion to DMA operation handlers of vfio
    containers. The vfio-user container will need this later, to translate
    the vaddr into an offset for the dma map vfio-user message; CPR will
    also need this.

    Originally-by: John Johnson <john.g.johnson@oracle.com>
    Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
    Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
    Signed-off-by: John Levon <john.levon@nutanix.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
    Link: https://lore.kernel.org/qemu-devel/20250521215534.2688540-1-john.levon@nutanix.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: return mr from vfio_get_xlat_addrSteve Sistare2025-06-051-11/+22
    Modify memory_get_xlat_addr and vfio_get_xlat_addr to return the memory
    region that the translated address is found in. This will be needed by
    CPR in a subsequent patch to map blocks using IOMMU_IOAS_MAP_FILE.

    Also return the xlat offset, so we can simplify the interface by
    removing the out parameters that can be trivially derived from mr and
    xlat.

    Lastly, rename the functions to memory_translate_iotlb() and
    vfio_translate_iotlb().

    Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: John Levon <john.levon@nutanix.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Link: https://lore.kernel.org/qemu-devel/1747661203-136490-1-git-send-email-steven.sistare@oracle.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Fix incorrect error propagation in vfio_pci_igd_opregion_detect()Tomita Moeko2025-06-051-12/+10
| | | | | | | | | | | | | | | | | | | | | | In vfio_pci_igd_opregion_detect(), errp will be set when the device does not have OpRegion or is hotplugged. This errp will be propagated to pci_qdev_realize(), which interprets it as failure, causing unexpected termination on devices without OpRegion like SR-IOV VFs or discrete GPUs. Fix it by not setting errp in vfio_pci_igd_opregion_detect(). This patch also checks if the device has OpRegion before hotplug status to prevent unwanted warning messages on non-IGD devices. Fixes: c0273e77f2d7 ("vfio/igd: Detect IGD device by OpRegion") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2968 Reported-by: Edmund Raile <edmund.raile@protonmail.com> Link: https://lore.kernel.org/qemu-devel/30044d14-17ec-46e3-b9c3-63d27a5bde27@gmail.com Tested-by: Edmund Raile <edmund.raile@protonmail.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250522151636.20001-1-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Add comment emphasizing no movement of hiod->realize() callZhenzhong Duan2025-06-051-0/+4
    The nested IOMMU support needs the device and hwpt IDs, which are
    generated only after attachment. Hiod encapsulates this information in
    realize() and passes it to vIOMMU.

    Suggested-by: Cédric Le Goater <clg@redhat.com>
    Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
    Reviewed-by: Cédric Le Goater <clg@redhat.com>
    Link: https://lore.kernel.org/qemu-devel/20250521110301.3313877-1-zhenzhong.duan@intel.com
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: refactor out IRQ signalling setupJohn Levon2025-06-051-15/+20
| | | | | | | | | | This makes for a slightly more readable vfio_msix_vector_do_use() implementation, and we will rely on this shortly. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-5-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: move config space read into vfio_pci_config_setup()John Levon2025-06-051-15/+16
| | | | | | | | | | | Small cleanup that reduces duplicate code for vfio-user and reduces the size of vfio_realize(); while we're here, correct that name to vfio_pci_realize(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-4-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: move more cleanup into vfio_pci_put_device()John Levon2025-06-051-11/+12
| | | | | | | | | | All of the cleanup can be done in the same place, and vfio-user will want to do the same. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-3-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: OpRegion not found fix error typoEdmund Raile2025-06-051-1/+1
| | | | | | | | Signed-off-by: Edmund Raile <edmund.raile@protonmail.com> Reviewed-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/MFFbQoTpea_CK5ELq8oJ-a3Q57wo7ywQlrIqDvdIDKhUuCm59VUz2QzvdojO5r_wb_7SHifU0Kym3loj4eASPhdzYpjtiMCTePzyg1zrroo=@protonmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* system/runstate: add VM state change cb with return valueHaoqian He2025-05-141-1/+1
    This patch adds a new VM state change callback type,
    `VMChangeStateHandlerWithRet`, which has a return value, to
    `VMChangeStateEntry`.

    Thus, we can register a VM state change callback with a return value
    for a device. Note that `VMChangeStateHandler` and
    `VMChangeStateHandlerWithRet` are mutually exclusive and cannot be
    provided at the same time.

    This patch is a prerequisite for 'vhost-user: return failure if backend
    crashes when live migration', which makes live migration aware of the
    loss of connection with the vhost-user backend and aborts the live
    migration.

    Virtio devices will use VMChangeStateHandlerWithRet.

    Signed-off-by: Haoqian He <haoqian.he@smartx.com>
    Message-Id: <20250416024729.3289157-2-haoqian.he@smartx.com>
    Tested-by: Lei Yang <leiyang@redhat.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge tag 'pull-vfio-20250509' of https://github.com/legoater/qemu into stagingStefan Hajnoczi2025-05-0912-366/+677
|\
|     vfio queue:

|     * Preparatory changes for the introduction of CPR support
|     * Automatic enablement of OpRegion for IGD device passthrough
|     * Linux headers update
|     * Preparatory changes for the introduction of vfio-user

|     # -----BEGIN PGP SIGNATURE-----
|     #
|     # iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmgd/0kACgkQUaNDx8/7
|     # 7KHRmRAArw1PXMCmoVBBeLcZ8BZPGjBZHtsvRzwS1yhVnNQadlpDlq4wd9HrfDFK
|     # BTr7//Ag2Q1dKgibesh0A8hSjorXHUGQCmdkcCuGGTFnEwC86q5jCH1lUxgI0cs5
|     # 3bVwc43zhXGoKqmo07g4f2UFbjDYHe89LgWz2c7TFFGz7Tda/LCOdhnmXlXcIwz+
|     # v1ocutXd7VbDWvUzN7uZbf0SIH3Zj3p96dwmpLDtdzdliDA0JidNvS27+Z5gtvWe
|     # O+1NW9MDzNfd6zLXCxL3GLeT61WZCe1dRCHEPX4cBo+DhnrifsC25DtJwYlDFvi2
|     # NMFfGzdKcEVSpeDp7WeM6MJgCZsGHC7ytmAKOKgN2M2kFSj3SI3sTFNlE1rzUhe6
|     # yjjCa59HzNLIi7L7xYCrVtCLGC/VXOp9kh67Sjs7FY7v778QUEdiudFBdBki7Bwh
|     # bpRhdFJgCLHuKc6XrM7hsMnsRyM28MywyfHDo3M/pRSFNKfeImW6zSMXnyncZztK
|     # W8e8OIz2DBMfH8pIu8hPw9Gsm5VAAs4aVmVFNa0CLl0oBko0Ew2YXcA5pTK5gGqv
|     # x24uc/BhbLcfFUtK0OnP4N/B4rcoADebPV2u4eWoUK3aF5u4+7BY235bFuoTj+sb
|     # 55DPDyWm5cmkX58Tdq46tD39dbD1hlUYkcydPbANH51wYx/lPpc=
|     # =OqYP
|     # -----END PGP SIGNATURE-----
|     # gpg: Signature made Fri 09 May 2025 09:12:41 EDT
|     # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
|     # gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
|     # gpg: aka "Cédric Le Goater <clg@kaod.org>" [full]
|     # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1

|     * tag 'pull-vfio-20250509' of https://github.com/legoater/qemu: (28 commits)
|       vfio/container: pass listener_begin/commit callbacks
|       vfio: add vfio-pci-base class
|       vfio: add read/write to device IO ops vector
|       vfio: add region info cache
|       vfio: add device IO ops vector
|       vfio: implement unmap all for DMA unmap callbacks
|       vfio: add unmap_all flag to DMA unmap callback
|       vfio: add vfio_pci_config_space_read/write()
|       vfio: add strread/writeerror()
|       vfio: consistently handle return value for helpers
|       vfio: add vfio_device_get_irq_info() helper
|       vfio: add vfio_attach_device_by_iommu_type()
|       vfio: add vfio_device_unprepare()
|       vfio: add vfio_device_prepare()
|       linux-headers: Update to Linux v6.15-rc3
|       linux-header: update-linux-header script changes
|       vfio/igd: Remove generation limitation for IGD passthrough
|       vfio/igd: Only emulate GGC register when x-igd-gms is set
|       vfio/igd: Allow overriding GMS with 0xf0 to 0xfe on Gen9+
|       vfio/igd: Enable OpRegion by default
|       ...

|     Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
| * vfio/container: pass listener_begin/commit callbacksJohn Levon2025-05-091-0/+28
| | | | | | | | | | | | | | | | | | | | | | The vfio-user container will later need to hook into these callbacks; set up vfio to use them, and optionally pass them through to the container. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-15-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add vfio-pci-base classJohn Levon2025-05-093-24/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split out parts of TYPE_VFIO_PCI into a base TYPE_VFIO_PCI_BASE, although we have not yet introduced another subclass, so all the properties have remained in TYPE_VFIO_PCI. Note that currently there is no need for additional data for TYPE_VFIO_PCI, so it shares the same C struct type as TYPE_VFIO_PCI_BASE, VFIOPCIDevice. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-14-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add read/write to device IO ops vectorJohn Levon2025-05-093-20/+59
| | | | | | | | | | | | | | | | | | Now we have the region info cache, add ->region_read/write device I/O operations instead of explicit pread()/pwrite() system calls. Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-13-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add region info cacheJohn Levon2025-05-095-18/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of requesting region information on demand with VFIO_DEVICE_GET_REGION_INFO, maintain a cache: this will become necessary for performance for vfio-user, where this call becomes a message over the control socket, so is of higher overhead than the traditional path. We will also need it to generalize region accesses, as that means we can't use ->config_offset for configuration space accesses, but must look up the region offset (if relevant) each time. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-12-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
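The caching idea in this commit can be sketched as a lazily filled per-index table. Everything named here is hypothetical (`ToyDevice`, `ToyRegionInfo`, `toy_fetch_region_info()`, `toy_get_region_info()`); the fetch function merely stands in for the expensive query, whether that is an ioctl or a message over a control socket, and its returned values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_NUM_REGIONS 4

/* Hypothetical region description; fields chosen for illustration. */
typedef struct ToyRegionInfo {
    uint32_t index;
    uint64_t size;
    uint64_t offset;
} ToyRegionInfo;

typedef struct ToyDevice {
    ToyRegionInfo *cache[TOY_NUM_REGIONS];  /* NULL until first lookup */
    int fetches;                            /* counts expensive queries */
} ToyDevice;

/* Stand-in for the expensive query; the values are invented. */
static ToyRegionInfo *toy_fetch_region_info(ToyDevice *d, uint32_t index)
{
    ToyRegionInfo *info = calloc(1, sizeof(*info));

    if (info == NULL) {
        return NULL;
    }
    info->index = index;
    info->size = 0x1000;
    info->offset = (uint64_t)index << 40;
    d->fetches++;
    return info;
}

/* Return cached info, querying the backend only on the first request. */
static const ToyRegionInfo *toy_get_region_info(ToyDevice *d, uint32_t index)
{
    assert(d != NULL);
    if (index >= TOY_NUM_REGIONS) {
        return NULL;
    }
    if (d->cache[index] == NULL) {
        d->cache[index] = toy_fetch_region_info(d, index);
    }
    return d->cache[index];
}

/* Release every cached entry, e.g. when the device goes away. */
static void toy_free_region_cache(ToyDevice *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_NUM_REGIONS; i++) {
        free(d->cache[i]);
        d->cache[i] = NULL;
    }
}
```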
| * vfio: add device IO ops vectorJohn Levon2025-05-094-27/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For vfio-user, device operations such as IRQ handling and region read/writes are implemented in userspace over the control socket, not ioctl() to the vfio kernel driver; add an ops vector to generalize this, and implement vfio_device_io_ops_ioctl for interacting with the kernel vfio driver. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-11-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: implement unmap all for DMA unmap callbacksJohn Levon2025-05-093-24/+51
| | | | | | | | | | | | | | | | | | Handle unmap_all in the DMA unmap handlers rather than in the caller. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-10-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add unmap_all flag to DMA unmap callbackJohn Levon2025-05-094-9/+17
| | | | | | | | | | | | | | | | | | We'll use this parameter shortly; this just adds the plumbing. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-9-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add vfio_pci_config_space_read/write()John Levon2025-05-091-43/+80
| | | | | | | | | | | | | | | | | | | | Add these helpers that access config space and return an -errno style return. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-8-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: consistently handle return value for helpersJohn Levon2025-05-091-13/+20
| | | | | | | | | | | | | | | | | | | | | | Various bits of code that call vfio device APIs should consistently use the "return -errno" approach for passing errors back, rather than presuming errno is (still) set correctly. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-6-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
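The "return -errno" convention mentioned here can be illustrated with a small wrapper around read(2). `toy_read_full()` is a hypothetical helper, not a QEMU function, and mapping a short read to `-EIO` is this sketch's own choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly `size` bytes from `fd`.
 * Returns 0 on success or a negative errno value on failure, so the
 * caller never has to trust that errno is still intact later on.
 */
static int toy_read_full(int fd, void *buf, size_t size)
{
    ssize_t ret;

    assert(buf != NULL);
    ret = read(fd, buf, size);
    if (ret < 0) {
        return -errno;      /* capture errno right at the failure site */
    }
    if ((size_t)ret != size) {
        return -EIO;        /* short read: errno was not set by read() */
    }
    return 0;
}
```

A caller can then write `ret = toy_read_full(...); if (ret < 0) { report(strerror(-ret)); }` without worrying that an intervening call has overwritten errno.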
| * vfio: add vfio_device_get_irq_info() helperJohn Levon2025-05-095-33/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper similar to vfio_device_get_region_info() and use it everywhere. Replace a couple of needless allocations with stack variables. As a side-effect, this fixes a minor error reporting issue in the call from vfio_msix_early_setup(). Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-5-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add vfio_attach_device_by_iommu_type()John Levon2025-05-091-7/+15
| | | | | | | | | | | | | | | | | | | | Allow attachment by explicitly passing a TYPE_VFIO_IOMMU_* string; vfio-user will use this later. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-4-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add vfio_device_unprepare()John Levon2025-05-093-6/+11
| | | | | | | | | | | | | | | | | | Add a helper that's the inverse of vfio_device_prepare(). Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-3-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio: add vfio_device_prepare()John Levon2025-05-093-20/+17
| | | | | | | | | | | | | | | | | | | | Commonize some initialization code shared by the legacy and iommufd vfio implementations. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio/igd: Remove generation limitation for IGD passthroughTomita Moeko2025-05-091-37/+21
|     Starting from Intel Core Ultra Series (Meteor Lake), Data Stolen
|     Memory has become part of LMEMBAR (MMIO BAR2) [1][2], meaning that
|     BDSM and GGC register quirks are no longer needed on these platforms.

|     To support Meteor/Arrow/Lunar Lake and future IGD devices, remove the
|     generation limitation in IGD passthrough, and apply BDSM and GGC
|     quirks only to known Gen6-12 devices.

|     [1] https://edc.intel.com/content/www/us/en/design/publications/14th-generation-core-processors-cfg-and-mem-registers/d2-f0-processor-graphics-registers/
|     [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/i915/gem/i915_gem_stolen.c?h=v6.14#n142

|     Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
|     Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
|     Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|     Tested-by: Alex Williamson <alex.williamson@redhat.com>
|     Link: https://lore.kernel.org/qemu-devel/20250505170305.23622-10-tomitamoeko@gmail.com
|     Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio/igd: Only emulate GGC register when x-igd-gms is setTomita Moeko2025-05-091-23/+26
|     x-igd-gms is used for overriding the DSM region size in the GGC
|     register in both config space and MMIO BAR0. By default, the host
|     value is used, so there is no need to emulate the register in the
|     default case.

|     Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
|     Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
|     Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|     Tested-by: Alex Williamson <alex.williamson@redhat.com>
|     Link: https://lore.kernel.org/qemu-devel/20250505170305.23622-9-tomitamoeko@gmail.com
|     Signed-off-by: Cédric Le Goater <clg@redhat.com>
| * vfio/igd: Allow overriding GMS with 0xf0 to 0xfe on Gen9+Tomita Moeko2025-05-091-18/+41
|     On Gen9 and later IGD devices, GMS values 0xf0 to 0xfe represent 4MB
|     to 60MB of pre-allocated memory in 4MB increments. Allow users to
|     override GMS with these values.

|     Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
|     Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
|     Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
|     Tested-by: Alex Williamson <alex.williamson@redhat.com>
|     Link: https://lore.kernel.org/qemu-devel/20250505170305.23622-8-tomitamoeko@gmail.com
|     Signed-off-by: Cédric Le Goater <clg@redhat.com>
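The GMS encoding stated in this commit (0xf0 to 0xfe covering 4MB to 60MB in 4MB steps) is plain arithmetic. The helpers below are hypothetical names for illustration, not QEMU code, and they deliberately cover only that range; other GMS codes are outside what the commit message describes, so the decoder returns 0 for them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a GMS code in the 0xf0..0xfe range to a size in MB.
 * Codes outside that range are not covered by this sketch and yield 0.
 */
static uint32_t toy_gms_to_mb(uint8_t gms)
{
    if (gms < 0xf0 || gms > 0xfe) {
        return 0;
    }
    return (uint32_t)(gms - 0xf0 + 1) * 4;
}

/* Encode a size of 4..60 MB (a multiple of 4) as a GMS code. */
static uint8_t toy_mb_to_gms(uint32_t mb)
{
    assert(mb >= 4 && mb <= 60 && mb % 4 == 0);
    return (uint8_t)(0xf0 + mb / 4 - 1);
}
```

The 15 codes from 0xf0 through 0xfe map one-to-one onto the 15 sizes 4, 8, ..., 60, which is why the two helpers are exact inverses over that range.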