summary refs log tree commit diff stats
path: root/hw/vfio/iommufd.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* hw/vfio: Use uint64_t for IOVA mapping size in vfio_container_dma_*mapPhilippe Mathieu-Daudé2025-10-021-3/+3
| | | | | | | | | | | | | | | | The 'ram_addr_t' type is described as: a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces. This doesn't represent well an IOVA mapping size. Simply use the uint64_t type. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250930123528.42878-5-philmd@linaro.org Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd.c: use QOM casts where appropriateMark Cave-Ayland2025-09-251-20/+14
| | | | | | | | | | Use QOM casts to convert between VFIOIOMMUFDContainer and VFIOContainer instead of accessing bcontainer directly. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-8-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/hw/vfio/vfio-container-base.h: rename VFIOContainerBase to VFIOContainerMark Cave-Ayland2025-09-251-9/+9
| | | | | | | | | | | Now that the VFIOContainer struct name is available, rename VFIOContainerBase to VFIOContainer to better indicate that it is the superclass of other VFIOFooContainer structs. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-3-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce helper vfio_pci_from_vfio_device()Zhenzhong Duan2025-09-081-2/+2
| | | | | | | | | | | | Introduce helper vfio_pci_from_vfio_device() to transform from VFIODevice to VFIOPCIDevice, also to hide low level VFIO_DEVICE_TYPE_PCI type check. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250822064101.123526-5-zhenzhong.duan@intel.com [ clg: Added documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: reconstruct hwptSteve Sistare2025-07-031-8/+22
| | | | | | | | | Skip allocation of, and attachment to, hwpt_id. Recover it from CPR state. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-18-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: reconstruct deviceSteve Sistare2025-07-031-2/+28
| | | | | | | | | | | | | | | | | | | Reconstruct userland device state after CPR. During vfio_realize, skip all ioctls that configure the device, as it was already configured in old QEMU. Skip bind, and use the devid from CPR state. Skip allocation of, and attachment to, ioas_id. Recover ioas_id from CPR state, and use it to find a matching container, if any, before creating a new one. This reconstruction is not complete. hwpt_id is handled in a subsequent patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-17-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: cpr stateSteve Sistare2025-07-031-0/+2
| | | | | | | | | | | | VFIO iommufd devices will need access to ioas_id, devid, and hwpt_id in new QEMU at realize time, so add them to CPR state. Define CprVFIODevice as the object which holds the state and is serialized to the vmstate file. Define accessors to copy state between VFIODevice and CprVFIODevice. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: register container for cprSteve Sistare2025-07-031-2/+4
| | | | | | | | | | | | | | Register a vfio iommufd container and device for CPR, replacing the generic CPR register call with a more specific iommufd register call. Add a blocker if the kernel does not support IOMMU_IOAS_CHANGE_PROCESS. This is mostly boiler plate. The fields to to saved and restored are added in subsequent patches. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-13-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: use IOMMU_IOAS_MAP_FILESteve Sistare2025-07-031-0/+13
| | | | | | | | | | | | Use IOMMU_IOAS_MAP_FILE when the mapped region is backed by a file. Such a mapping can be preserved without modification during CPR, because it depends on the file's address space, which does not change, rather than on the process's address space, which does change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: move vfio-cpr.hSteve Sistare2025-06-051-1/+1
| | | | | | | | | | | Move vfio-cpr.h to include/hw/vfio, because it will need to be included by other files there. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1748546679-154091-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Save vendor specific device infoZhenzhong Duan2025-06-051-5/+3
| | | | | | | | | | | | | | | | | | Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor specific. Save them as raw data in a union supporting different vendors, then vendor IOMMU can query the raw data with its fixed format for capability directly. Because IOMMU_GET_HW_INFO is only supported in linux, so declare those capability related structures with CONFIG_LINUX. Suggested-by: Eric Auger <eric.auger@redhat.com> Suggested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-5-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Implement [at|de]tach_hwpt handlersZhenzhong Duan2025-06-051-0/+22
| | | | | | | | | | | | | Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU utilizes them to attach to or detach from hwpt on host side. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-4-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFDZhenzhong Duan2025-06-051-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific to the iommufd BE + 2 new class functions. IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle and devid are used to allocate/free ioas and hwpt. hwpt_id is used to re-attach IOMMUFD backed device to its default VFIO sub-system created hwpt, i.e., when vIOMMU is disabled by guest. These properties are initialized in hiod::realize() after attachment. 2 new class functions are [at|de]tach_hwpt(). They are used to attach/detach hwpt. VFIO and VDPA can have different implementions, so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD, e.g., in HostIOMMUDeviceIOMMUFDVFIO. Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the two functions. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-3-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/container: pass MemoryRegion to DMA operationsJohn Levon2025-06-051-1/+2
| | | | | | | | | | | | | | | | Pass through the MemoryRegion to DMA operation handlers of vfio containers. The vfio-user container will need this later, to translate the vaddr into an offset for the dma map vfio-user message; CPR will also will need this. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250521215534.2688540-1-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Add comment emphasizing no movement of hiod->realize() callZhenzhong Duan2025-06-051-0/+4
| | | | | | | | | | | | The nested IOMMU support needs device and hwpt id which are generated only after attachment. Hiod encapsulates these information in realize() and passes to vIOMMU. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250521110301.3313877-1-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: implement unmap all for DMA unmap callbacksJohn Levon2025-05-091-1/+14
| | | | | | | | | Handle unmap_all in the DMA unmap handlers rather than in the caller. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-10-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add unmap_all flag to DMA unmap callbackJohn Levon2025-05-091-1/+5
| | | | | | | | | We'll use this parameter shortly; this just adds the plumbing. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-9-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add vfio_device_unprepare()John Levon2025-05-091-3/+1
| | | | | | | | | Add a helper that's the inverse of vfio_device_prepare(). Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-3-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add vfio_device_prepare()John Levon2025-05-091-8/+1
| | | | | | | | | | Commonize some initialization code shared by the legacy and iommufd vfio implementations. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* Merge tag 'single-binary-20250425' of https://github.com/philmd/qemu into ↵Stefan Hajnoczi2025-04-271-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging Various patches loosely related to single binary work: - Replace cpu_list() definition by CPUClass::list_cpus() callback - Remove few MO_TE definitions on Hexagon / X86 targets - Remove target_ulong uses in ARMMMUFaultInfo and ARM CPUWatchpoint - Remove DEVICE_HOST_ENDIAN definition - Evaluate TARGET_BIG_ENDIAN at compile time and use target_needs_bswap() more - Rename target_words_bigendian() as target_big_endian() - Convert target_name() and target_cpu_type() to TargetInfo API - Constify QOM TypeInfo class_data/interfaces fields - Get default_cpu_type calling machine_class_default_cpu_type() - Correct various uses of GLibCompareDataFunc prototype - Simplify ARM/Aarch64 gdb_get_core_xml_file() handling a bit - Move device tree files in their own pc-bios/dtb/ subdir - Correctly check strchrnul() symbol availability on macOS SDK - Move target-agnostic methods out of cpu-target.c and accel-target.c - Unmap canceled USB XHCI packet - Use deposit/extract API in designware model - Fix MIPS16e translation - Few missing header fixes # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmgLqb8ACgkQ4+MsLN6t # wN6nCQ//cmv1M+NsndhO5TAK8T1eUSXKlTZh932uro6ZgxKwN4p+j1Qo7bq3O9gu # qUMHNbcfQl8sHSytiXBoxCjLMCXC3u38iyz75WGXuPay06rs4wqmahqxL4tyno3l # 1RviFts9xlLn+tJqqrAR6+pRdALld0TY+yXUjXgr4aK5pIRpLz9U/sIEoh7qbA5U # x0MTaceDG3A91OYo0TgrNbcMe1b9GqQZ+a4tbaP+oE37wbiKdyQ68LjrEbV08Y1O # qrFF4oxquV31QJcUiuII1W7hC6psGrMsUA1f1qDu7QvmybAZWNZNsR9T66X9jH5J # wXMShJmmXwxugohmuPPFnDshzJy90aFL6Jy2shrfqcG2v0W66ARY1ZnbJLCcfczt # 073bnE2dnOVhd/ny37RrIJNJLLmYM0yFDeKuYtNNAzpK9fpA7Q2PI8QiqNacQ3Pa # TdEYrGlMk7OeNck8xJmJMY5rATthi1D4dIBv3rjQbUolQvPJe2Y9or0R2WL1jK5v # hhr6DY01iSPES3CravmUs/aB1HRMPi/nX45OmFR6frAB7xqWMreh81heBVuoTTK8 # PuXtRQgRMRKwDeTxlc6p+zba4mIEYG8rqJtPFRgViNCJ1KsgSIowup3BNU05YuFn # NoPoRayMDVMgejVgJin3Mg2DCYvt/+MBmO4IoggWlFsXj59uUgA= # =DXnZ # -----END PGP SIGNATURE----- # gpg: Signature made Fri 25 Apr 2025 11:26:55 EDT # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full] # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'single-binary-20250425' of https://github.com/philmd/qemu: (58 commits) qemu: Convert target_name() to TargetInfo API accel: Move target-agnostic code from accel-target.c -> accel-common.c accel: Make AccelCPUClass structure target-agnostic accel: Include missing 'qemu/accel.h' header in accel-internal.h accel: Implement accel_init_ops_interfaces() for both system/user mode cpus: Move target-agnostic methods out of cpu-target.c cpus: Replace CPU_RESOLVING_TYPE -> target_cpu_type() qemu: Introduce target_cpu_type() qapi: Rename TargetInfo structure as QemuTargetInfo hw/microblaze: Evaluate TARGET_BIG_ENDIAN at compile time hw/mips: Evaluate TARGET_BIG_ENDIAN at compile time target/xtensa: Evaluate TARGET_BIG_ENDIAN at compile time target/mips: Check CPU endianness at runtime using env_is_bigendian() accel/kvm: Use target_needs_bswap() linux-user/elfload: Use target_needs_bswap() target/hexagon: Include missing 'accel/tcg/getpc.h' accel/tcg: Correct list of included headers in tcg-stub.c system/kvm: make functions accessible from common code meson: Use osdep_prefix for strchrnul() meson: Share common C source prefixes ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
| * qom: Have class_init() take a const data argumentPhilippe Mathieu-Daudé2025-04-251-2/+2
| | | | | | | | | | | | | | | | | | | | Mechanical change using gsed, then style manually adapted to pass checkpatch.pl script. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-4-philmd@linaro.org>
* | vfio: Register/unregister container for CPR only once for each containerZhenzhong Duan2025-04-251-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | vfio_cpr_register_container and vfio_cpr_unregister_container are container scoped function. Calling them for each device attaching/detaching would corrupt CPR reboot notifier list, i.e., when two VFIO devices are attached to same container and have same notifier registered twice. Fixes: d9fa4223b30a ("vfio: register container for cpr") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250424063355.3855174-1-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Remove hiod_typename propertyZhenzhong Duan2025-04-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | Because we handle host IOMMU device creation in each container backend, we know which type name to use, so hiod_typename property is useless now, just remove it. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250423072824.3647952-6-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Cleanup host IOMMU device creationZhenzhong Duan2025-04-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | realize() is now moved after attachment, do the same for hiod creation. Introduce a new function vfio_device_hiod_create_and_realize() to do them all in one go. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250423072824.3647952-5-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio/iommufd: Move realize() after attachmentZhenzhong Duan2025-04-251-11/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously device attaching depends on realize() getting host IOMMU capabilities to check dirty tracking support. Now we have a separate call to ioctl(IOMMU_GET_HW_INFO) to get host IOMMU capabilities and check that for dirty tracking support, there is no dependency any more, move realize() call after attachment succeed. Suggested-by: Cédric Le Goater <clg@redhat.com> Suggested-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250423072824.3647952-3-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio/iommufd: Make a separate call to get IOMMU capabilitiesZhenzhong Duan2025-04-251-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we depend on .realize() calling iommufd_backend_get_device_info() to get IOMMU capabilities and check for dirty page tracking support. By make a extra separate call, this dependency is removed. This happens only during device attach, it's not a hot path. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250423072824.3647952-2-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Rename vfio-common.h to vfio-device.hCédric Le Goater2025-04-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | "hw/vfio/vfio-common.h" has been emptied of most of its declarations by the previous changes and the only declarations left are related to VFIODevice. Rename it to "hw/vfio/vfio-device.h" and make the necessary adjustments. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-36-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Introduce vfio_listener_un/register() routinesCédric Le Goater2025-04-251-7/+2
| | | | | | | | | | | | | | | | | | | | This hides the MemoryListener implementation and makes the code common to both IOMMU backends, legacy and IOMMUFD. Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-35-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Introduce new files for VFIO MemoryListenerCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | File "common.c" has been emptied of most of its definitions by the previous changes and the only definitions left are related to the VFIO MemoryListener handlers. Rename it to "listener.c" and introduce its associated "vfio-listener.h" header file for the declarations. Cleanup a little the includes while at it. Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-33-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Introduce new files for CPR definitions and declarationsCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Gather all CPR related declarations into "vfio-cpr.h" to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". These were introduced in commit d9fa4223b30a ("vfio: register container for cpr"). Order file list in meson.build while at it. Cc: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-22-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Move vfio_kvm_device_add/del_fd() to helpers.cCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | vfio_kvm_device_add/del_fd() are low level routines. Move them with the other helpers. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-18-clg@redhat.com Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-19-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Move Host IOMMU type declarations into their respective filesCédric Le Goater2025-04-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | These definitions don't have any use outside of their respective submodules. There is no need to expose them externally. Keep them private. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-15-clg@redhat.com Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-16-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Move VFIOAddressSpace helpers into container-base.cCédric Le Goater2025-04-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VFIOAddressSpace is a common object used by VFIOContainerBase which is declared in "hw/vfio/vfio-container-base.h". Move the VFIOAddressSpace related services into "container-base.c". While at it, rename : vfio_get_address_space -> vfio_address_space_get vfio_put_address_space -> vfio_address_space_put to better reflect the namespace these routines belong to. Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-15-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* | vfio: Introduce a new header file for VFIOIOMMUFD declarationsCédric Le Goater2025-04-251-0/+1
|/ | | | | | | | | | | | | | | | | Gather all VFIOIOMMUFD related declarations introduced by commits 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend") and 5b1e96e65403 ("vfio/iommufd: Introduce auto domain creation") into "vfio-iommufd.h". This to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Yi Liu <yi.l.liu@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-10-clg@redhat.com Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-11-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio: Compile iommufd.c oncePhilippe Mathieu-Daudé2025-03-111-1/+0
| | | | | | | | | | | | | | Removing unused "exec/ram_addr.h" header allow to compile iommufd.c once for all targets. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20250308230917.18907-6-philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250311085743.21724-8-philmd@linaro.org Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/iommufd: Fix SIGSEV in iommufd_cdev_attach()Zhenzhong Duan2025-02-111-2/+3
| | | | | | | | | | | | | | | When iommufd_cdev_ram_block_discard_disable() fails for whatever reason, errp should be set or else SIGSEV is triggered in vfio_realize() when error_prepend() is called. By this chance, use the same error message for both legacy and iommufd backend. Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20250116102307.260849-1-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-2/+2
| | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap supportJoao Martins2024-07-231-0/+28
| | | | | | | | | | | | | | | ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI that fetches the bitmap that tells what was dirty in an IOVA range. A single bitmap is allocated and used across all the hwpts sharing an IOAS which is then used in log_sync() to set Qemu global bitmaps. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
* vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking supportJoao Martins2024-07-231-0/+32
| | | | | | | | | | | ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that enables or disables dirty page tracking. The ioctl is used if the hwpt has been created with dirty tracking supported domain (stored in hwpt::flags) and it is called on the whole list of iommu domains. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* vfio/iommufd: Probe and request hwpt dirty tracking capabilityJoao Martins2024-07-231-0/+26
| | | | | | | | | | | | | | | | | | | | | In preparation to using the dirty tracking UAPI, probe whether the IOMMU supports dirty tracking. This is done via the data stored in hiod::caps::hw_caps initialized from GET_HW_INFO. Qemu doesn't know if VF dirty tracking is supported when allocating hardware pagetable in iommufd_cdev_autodomains_get(). This is because VFIODevice migration state hasn't been initialized *yet* hence it can't pick between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports dirty tracking it always creates HWPTs with IOMMU_HWPT_ALLOC_DIRTY_TRACKING even if later on VFIOMigration decides to use VF dirty tracking instead. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in iommufd_cdev_autodomains_get() - Added warning for heterogeneous dirty page tracking support in iommufd_cdev_autodomains_get() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
* vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during ↵Joao Martins2024-07-231-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | attach_device() Move the HostIOMMUDevice::realize() to be invoked during the attach of the device before we allocate IOMMUFD hardware pagetable objects (HWPT). This allows the use of the hw_caps obtained by IOMMU_GET_HW_INFO that essentially tell if the IOMMU behind the device supports dirty tracking. Note: The HostIOMMUDevice data from legacy backend is static and doesn't need any information from the (type1-iommu) backend to be initialized. In contrast however, the IOMMUFD HostIOMMUDevice data requires the iommufd FD to be connected and having a devid to be able to successfully GET_HW_INFO. This means vfio_device_hiod_realize() is called in different places within the backend .attach_device() implementation. Suggested-by: Cédric Le Goater <clg@redhat.cm> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> [ clg: Fixed error handling in iommufd_cdev_attach() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCapsJoao Martins2024-07-231-0/+1
| | | | | | | | | | | | | | Store the value of @caps returned by iommufd_backend_get_device_info() in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value is whether device IOMMU supports dirty tracking (IOMMU_HW_CAP_DIRTY_TRACKING). This is in preparation for HostIOMMUDevice::realize() being called early during attach_device(). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* vfio/{iommufd,container}: Remove caps::aw_bitsJoao Martins2024-07-231-1/+0
| | | | | | | | | | | | | | Remove caps::aw_bits which requires the bcontainer::iova_ranges being initialized after device is actually attached. Instead defer that to .get_cap() and call vfio_device_get_aw_bits() directly. This is in preparation for HostIOMMUDevice::realize() being called early during attach_device(). Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* vfio/iommufd: Introduce auto domain creationJoao Martins2024-07-231-0/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's generally two modes of operation for IOMMUFD: 1) The simple user API which intends to perform relatively simple things with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches to VFIO and mainly performs IOAS_MAP and UNMAP. 2) The native IOMMUFD API where you have fine grained control of the IOMMU domain and model it accordingly. This is where most new feature are being steered to. For dirty tracking 2) is required, as it needs to ensure that the stage-2/parent IOMMU domain will only attach devices that support dirty tracking (so far it is all homogeneous in x86, likely not the case for smmuv3). Such invariant on dirty tracking provides a useful guarantee to VMMs that will refuse incompatible device attachments for IOMMU domains. Dirty tracking insurance is enforced via HWPT_ALLOC, which is responsible for creating an IOMMU domain. This is contrast to the 'simple API' where the IOMMU domain is created by IOMMUFD automatically when it attaches to VFIO (usually referred as autodomains) but it has the needed handling for mdevs. To support dirty tracking with the advanced IOMMUFD API, it needs similar logic, where IOMMU domains are created and devices attached to compatible domains. Essentially mimicking kernel iommufd_device_auto_get_domain(). With mdevs given there's no IOMMU domain it falls back to IOAS attach. The auto domain logic allows different IOMMU domains to be created when DMA dirty tracking is not desired (and VF can provide it), and others where it is. Here it is not used in this way given how VFIODevice migration state is initialized after the device attachment. But such mixed mode of IOMMU dirty tracking + device dirty tracking is an improvement that can be added on. Keep the 'all of nothing' of type1 approach that we have been using so far between container vs device dirty tracking. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> [ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()Joao Martins2024-07-231-4/+4
| | | | | | | | | | | | | In preparation to implement auto domains have the attach function return the errno it got during domain attach instead of a bool. -EINVAL is tracked to track domain incompatibilities, and decide whether to create a new IOMMU domain. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
* backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW ↵Joao Martins2024-07-231-1/+3
| | | | | | | | | | | | | capabilities The helper will be able to fetch vendor agnostic IOMMU capabilities supported both by hardware and software. Right now it is only iommu dirty tracking. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
* HostIOMMUDevice: Introduce get_page_size_mask() callbackEric Auger2024-07-091-0/+11
| | | | | | | | | | This callback will be used to retrieve the page size mask supported along a given Host IOMMU device. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
* HostIOMMUDevice : remove Error handle from get_iova_ranges callbackEric Auger2024-07-091-1/+1
| | | | | | | | | The error handle argument is not used anywhere. let's remove it. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
* vfio-container-base: Introduce vfio_container_get_iova_ranges() helperEric Auger2024-07-091-7/+1
| | | | | | | | | | | | | | | | | | Introduce vfio_container_get_iova_ranges() to retrieve the usable IOVA regions of the base container and use it in the Host IOMMU device implementations of get_iova_ranges() callback. We also fix a UAF bug as the list was shallow copied while g_list_free_full() was used both on the single call site, in virtio_iommu_set_iommu_device() but also in vfio_container_instance_finalize(). Instead use g_list_copy_deep. Fixes: cf2647a76e ("virtio-iommu: Compute host reserved regions") Signed-off-by: Eric Auger <eric.auger@redhat.com> Suggested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
* vfio/container: Move vfio_container_destroy() to an instance_finalize() handlerCédric Le Goater2024-06-241-1/+0
| | | | | | | | | | | | | vfio_container_destroy() clears the resources allocated VFIOContainerBase object. Now that VFIOContainerBase is a QOM object, add an instance_finalize() handler to do the cleanup. It will be called through object_unref(). Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>