summary refs log tree commit diff stats
path: root/hw/vfio/pci.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
* hw/vfio/types.h: rename TYPE_VFIO_PCI_BASE to TYPE_VFIO_PCI_DEVICEMark Cave-Ayland2025-09-251-1/+1
| | | | | | | | | This brings the QOM type name in line with the underlying VFIOPCIDevice structure. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250925113159.1760317-17-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci.h: rename VFIOPCIDevice pdev field to parent_objMark Cave-Ayland2025-09-081-1/+1
| | | | | | | | | | | | Now that nothing accesses the pdev field directly, rename pdev to parent_obj as per our current coding guidelines. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250715093110.107317-23-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci.h: update VFIOPCIDevice declarationMark Cave-Ayland2025-09-081-0/+1
| | | | | | | | | | | Update the VFIOPCIDevice declaration so that it is closer to our coding guidelines: add a blank line after the parent object. Signed-off-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250715093110.107317-15-mark.caveayland@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce helper vfio_pci_from_vfio_device()Zhenzhong Duan2025-09-081-0/+12
| | | | | | | | | | | | Introduce helper vfio_pci_from_vfio_device() to transform from VFIODevice to VFIOPCIDevice, also to hide low level VFIO_DEVICE_TYPE_PCI type check. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250822064101.123526-5-zhenzhong.duan@intel.com [ clg: Added documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Enable quirks when IGD is not the primary displayTomita Moeko2025-09-081-0/+5
| | | | | | | | | | | | | | Since linux 6.15, commit 41112160ca87 ("vfio/pci: match IGD devices in display controller class"), IGD related regions are also exposed when IGD is not primary display (device class is Display controller). Allow IGD quirks to be enabled in this configuration so that guests can have display output on IGD when it is not the primary display. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250813160510.23553-1-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: preserve pending interruptsSteve Sistare2025-08-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | cpr-transfer may lose a VFIO interrupt because the KVM instance is destroyed and recreated. If an interrupt arrives in the middle, it is dropped. To fix, stop pending new interrupts during cpr save, and pick up the pieces. In more detail: Stop the VCPUs. Call kvm_irqchip_remove_irqfd_notifier_gsi --> KVM_IRQFD to deassign the irqfd gsi that routes interrupts directly to the VCPU and KVM. After this call, interrupts fall back to the kernel vfio_msihandler, which writes to QEMU's kvm_interrupt eventfd. CPR already preserves that eventfd. When the route is re-established in new QEMU, the kernel tests the eventfd and injects an interrupt to KVM if necessary. Deassign INTx in a similar manner. For both MSI and INTx, remove the eventfd handler so old QEMU does not consume an event. If an interrupt was already pended to KVM prior to the completion of kvm_irqchip_remove_irqfd_notifier_gsi, it will be recovered by the subsequent call to cpu_synchronize_all_states, which pulls KVM interrupt state to userland prior to saving it in vmstate. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1752689169-233452-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: augment set_handlerSteve Sistare2025-08-091-1/+2
| | | | | | | | | | Extend vfio_pci_msi_set_handler() so it can set or clear the handler. Add a similar accessor for INTx. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1752689169-233452-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Fix VGA regions are not exposed in legacy modeTomita Moeko2025-07-281-0/+1
| | | | | | | | | | | | | | | | In commit a59d06305fff ("vfio/pci: Introduce x-pci-class-code option"), pci_register_vga() has been moved ouside of vfio_populate_vga(). As a result, IGD VGA ranges are no longer properly exposed to guest. To fix this, call pci_register_vga() after vfio_populate_vga() legacy mode. A wrapper function vfio_pci_config_register_vga() is introduced to handle it. Fixes: a59d06305fff ("vfio/pci: Introduce x-pci-class-code option") Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250723160906.44941-3-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: fix sub-page bar after cprSteve Sistare2025-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Regions for sub-page BARs are normally mapped here, in response to the guest writing to PCI config space: vfio_pci_write_config() pci_default_write_config() pci_update_mappings() memory_region_add_subregion() vfio_sub_page_bar_update_mapping() ... vfio_dma_map() However, after CPR, the guest does not reconfigure the device and the code path above is not taken. To fix, in vfio_cpr_pci_post_load, call vfio_sub_page_bar_update_mapping for each sub-page BAR with a valid address. Fixes: 7e9f21411302 ("vfio/container: restore DMA vaddr") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1752520890-223356-1-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* Merge tag 'display-20250718-pull-request' of https://gitlab.com/kraxel/qemu ↵Stefan Hajnoczi2025-07-211-9/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging Load ramfb vgabios on x86 only. # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmh6o80ACgkQTLbY7tPo # cTjxPBAAktTXxFK6loSMSWC1ul8RCl/4F7G84J4eT+Ui8/KIG8do5KcebTnXb9zo # keOG7n9HPk4fROWiAFgGnuBfw41DWmLDS34iuENrG3X26TQgSSgBveuwas67Pzqu # HpaFSxjh7BRLlkUWaNoll57cDM3kKLmx+Onw6m/7kbcVXAsy1N4wxfCT1faUU7ID # R1ggULG1WhB8q+YtQjac6EfOpdHe1BTBGLuxSwE3mNkce9ZP7C8uxZTCR5PXggZi # IXzJzGpFRDCHqrilWksiE62yF20Kem4ZcpO/GgLWmF+X+DYBDEWcajihvF20TGUL # n6dyT7MBxuvqFy0OtBPHNcnq2PZzOIKyxyMvBg9402xeD6goNbFKloAYeae4C9u0 # QuqQUpb8D3lVagVu55N5XfpdMHR0P8yefPAjaFL4o3rf2JSjyI6MRX/+2eA7aXcX # xiwHSx3iavEeNQNsPZsS3JhH5bKy/zkWRiBd+msGVAYMZGzhdEtLg/w8yUd6dQ5p # /3Y3F4fL6T6QSwhsiihcbdPtjhfVCP09MYK/P4cIFbWOzjfbndt1/UIXHQ54s8Jo # PShcE7QH7ttT2gK5nFPG5yeTqF70kKpSyhwF2pukf2fAgcU+0SNoj2zZNtHAvKeh # 8EHqAy8m1J4AlQeO5nT9tJj/v1CM0q6cljzIfV8hWWgM/hL/vLc= # =76m5 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 18 Jul 2025 15:43:09 EDT # gpg: using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * tag 'display-20250718-pull-request' of https://gitlab.com/kraxel/qemu: hw/i386: Add the ramfb romfile compatibility vfio: Move the TYPE_* to hw/vfio/types.h ramfb: Add property to control if load the romfile Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Conflicts: hw/core/machine.c Context conflict because the vfio-pci "x-migration-load-config-after-iter" was added recently.
| * vfio: Move the TYPE_* to hw/vfio/types.hShaoqin Huang2025-07-181-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the TYPE_* to a new file hw/vfio/types.h because the TYPE_VFIO_PCI will be used in later patch, but directly include the hw/vfio/pci.h can cause some compilation error when cross build the windows version. The hw/vfio/types.h can be included to mitigate that problem. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Message-ID: <20250717100941.2230408-3-shahuang@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
| * ramfb: Add property to control if load the romfileShaoqin Huang2025-07-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the ramfb device loads the vgabios-ramfb.bin unconditionally, but only the x86 need the vgabios-ramfb.bin, this can cause that when use the release package on arm64 it can't find the vgabios-ramfb.bin. Because only seabios will use the vgabios-ramfb.bin, load the rom logic is x86-specific. For other !x86 platforms, the edk2 ships an EFI driver for ramfb, so they don't need to load the romfile. So add a new property use-legacy-x86-rom in both ramfb and vfio_pci device, because the vfio display also use the ramfb_setup() to load the vgabios-ramfb.bin file. After have this property, the machine type can set the compatibility to not load the vgabios-ramfb.bin if the arch doesn't need it. For now the default value is true but it will be turned off by default in subsequent patch when compats get properly handled. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Message-ID: <20250717100941.2230408-2-shahuang@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* | vfio/pci: Introduce x-pci-class-code optionTomita Moeko2025-07-151-4/+2
|/ | | | | | | | | | | | | | | | | | | | | Introduce x-pci-class-code option to allow users to override PCI class code of a device, similar to the existing x-pci-vendor-id option. Only the lower 24 bits of this option are used, though a uint32 is used here for determining whether the value is valid and set by user. Additionally, to ensure VGA ranges are only exposed on VGA devices, pci_register_vga() is now called in vfio_pci_config_setup(), after the class code override is completed. This is mainly intended for IGD devices that expose themselves either as VGA controller (primary display) or Display controller (non-primary display). The UEFI GOP driver depends on the device reporting a VGA controller class code (0x030000). Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250708145211.6179-1-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-pci: preserve MSISteve Sistare2025-07-031-0/+2
| | | | | | | | | | | | | Save the MSI message area as part of vfio-pci vmstate, and preserve the interrupt and notifier eventfd's. migrate_incoming loads the MSI data, then the vfio-pci post_load handler finds the eventfds in CPR state, rebuilds vector data structures, and attaches the interrupts to the new KVM instance. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio-user: forward MSI-X PBA BAR accesses to serverJohn Levon2025-06-261-0/+1
| | | | | | | | | | | | | | For vfio-user, the server holds the pending IRQ state; set up an I/O region for the MSI-X PBA so we can ask the server for this state on a PBA read. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-11-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: export MSI functionsSteve Sistare2025-06-111-0/+8
| | | | | | | | | | Export various MSI functions, renamed with a vfio_pci prefix, for use by CPR in subsequent patches. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-18-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: export PCI helpers needed for vfio-userJohn Levon2025-06-111-0/+11
| | | | | | | | | | | | The vfio-user code will need to re-use various parts of the vfio PCI code. Export them in hw/vfio/pci.h, and rename them to the vfio_pci_* namespace. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: add vfio-pci-base classJohn Levon2025-05-091-1/+9
| | | | | | | | | | | | | | | | | | Split out parts of TYPE_VFIO_PCI into a base TYPE_VFIO_PCI_BASE, although we have not yet introduced another subclass, so all the properties have remained in TYPE_VFIO_PCI. Note that currently there is no need for additional data for TYPE_VFIO_PCI, so it shares the same C struct type as TYPE_VFIO_PCI_BASE, VFIOPCIDevice. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-14-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Rename vfio-common.h to vfio-device.hCédric Le Goater2025-04-251-1/+1
| | | | | | | | | | | "hw/vfio/vfio-common.h" has been emptied of most of its declarations by the previous changes and the only declarations left are related to VFIODevice. Rename it to "hw/vfio/vfio-device.h" and make the necessary adjustments. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-36-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce new files for VFIORegion definitions and declarationsCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | | | | | | | Gather all VFIORegion related declarations and definitions into their own files to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". They were introduced for 'vfio-platform' support in commits db0da029a185 ("vfio: Generalize region support") and a664477db8da ("hw/vfio/pci: Introduce VFIORegion"). To be noted that the 'vfio-platform' devices have been deprecated and will be removed in QEMU 10.2. Until then, make the declarations available externally for 'sysbus-fdt.c'. Cc: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-12-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce a new header file for VFIOdisplay declarationsCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | Gather all VFIOdisplay related declarations into "vfio-display.h" to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-8-clg@redhat.com Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-9-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include/system: Move exec/memory.h to system/memory.hRichard Henderson2025-04-231-1/+1
| | | | | | | | | | | | Convert the existing includes with sed -i ,exec/memory.h,system/memory.h,g Move the include within cpu-all.h into a !CONFIG_USER_ONLY block. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* vfio/igd: Introduce x-igd-lpc option for LPC bridge ID quirkTomita Moeko2025-03-111-0/+3
| | | | | | | | | | | | | | | The LPC bridge/Host bridge IDs quirk is also not dependent on legacy mode. Recent Windows driver no longer depends on these IDs, as well as Linux i915 driver, while UEFI GOP seems still needs them. Make it an option to allow users enabling and disabling it as needed. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-10-tomitamoeko@gmail.com [ clg: - Fixed spelling in vfio_probe_igd_config_quirk() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Handle x-igd-opregion option in config quirkTomita Moeko2025-03-111-2/+0
| | | | | | | | | | | | | | Both enable OpRegion option (x-igd-opregion) and legacy mode require setting up OpRegion copy for IGD devices. As the config quirk no longer depends on legacy mode, we can now handle x-igd-opregion option there instead of in vfio_realize. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-9-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Decouple common quirks from legacy modeTomita Moeko2025-03-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far, IGD-specific quirks all require enabling legacy mode, which is toggled by assigning IGD to 00:02.0. However, some quirks, like the BDSM and GGC register quirks, should be applied to all supported IGD devices. A new config option, x-igd-legacy-mode=[on|off|auto], is introduced to control the legacy mode only quirks. The default value is "auto", which keeps current behavior that enables legacy mode implicitly and continues on error when all following conditions are met. * Machine type is i440fx * IGD device is at guest BDF 00:02.0 If any one of the conditions above is not met, the default behavior is equivalent to "off", QEMU will fail immediately if any error occurs. Users can also use "on" to force enabling legacy mode. It checks if all the conditions above are met and set up legacy mode. QEMU will also fail immediately on error in this case. Additionally, the hotplug check in legacy mode is removed as hotplugging IGD device is never supported, and it will be checked when enabling the OpRegion quirk. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-8-tomitamoeko@gmail.com [ clg: - Changed warn_report() by info_report() in vfio_probe_igd_config_quirk() as suggested by Alex W. - Fixed spelling in vfio_probe_igd_config_quirk () ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Refactor vfio_probe_igd_bar4_quirk into pci config quirkTomita Moeko2025-03-111-1/+1
| | | | | | | | | | | | | | | | | | The actual IO BAR4 write quirk in vfio_probe_igd_bar4_quirk was removed in previous change, leaving the function not matching its name, so move it into the newly introduced vfio_config_quirk_setup. There is no functional change in this commit. For now, to align with current legacy mode behavior, it returns and proceeds on error. Later it will fail on error after decoupling the quirks from legacy mode. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-7-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Add placeholder for device-specific config space quirksTomita Moeko2025-03-111-0/+1
| | | | | | | | | | | | | | IGD devices require device-specific quirk to be applied to their PCI config space. Currently, it is put in the BAR4 quirk that does nothing to BAR4 itself. Add a placeholder for PCI config space quirks to hold that quirk later. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-6-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/igd: Consolidate OpRegion initialization into a single functionTomita Moeko2025-03-111-3/+1
| | | | | | | | | | | | | | | | | | | | | Both x-igd-opregion option and legacy mode require identical steps to set up OpRegion for IGD devices. Consolidate these steps into a single vfio_pci_igd_setup_opregion function. The function call in pci.c is wrapped with ifdef temporarily to prevent build error for non-x86 archs, it will be removed after we decouple it from legacy mode. Additionally, move vfio_pci_igd_opregion_init to igd.c to prevent it from being compiled in non-x86 builds. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-4-tomitamoeko@gmail.com [ clg: Fixed spelling in vfio_pci_igd_setup_opregion() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Delete local pm_capAlex Williamson2025-03-061-1/+0
| | | | | | | | | | | | This is now redundant to PCIDevice.pm_cap. Cc: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-4-alex.williamson@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-1/+1
| | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* vfio/igd: add new bar0 quirk to emulate BDSM mirrorCorvin Köhne2024-09-171-0/+1
| | | | | | | | | | | | | | | | | | | The BDSM register is mirrored into MMIO space at least for gen 11 and later devices. Unfortunately, the Windows driver reads the register value from MMIO space instead of PCI config space for those devices [1]. Therefore, we either have to keep a 1:1 mapping for the host and guest address or we have to emulate the MMIO register too. Using the igd in legacy mode is already hard due to it's many constraints. Keeping a 1:1 mapping may not work in all cases and makes it even harder to use. An MMIO emulation has to trap the whole MMIO page. This makes accesses to this page slower compared to using second level address translation. Nevertheless, it doesn't have any constraints and I haven't noticed any performance degradation yet making it a better solution. [1] https://github.com/projectacrn/acrn-hypervisor/blob/5c351bee0f6ae46250eefc07f44b4a31e770f3cf/devicemodel/hw/pci/passthrough.c#L650-L653 Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
* vfio/pci-quirks: Make vfio_add_*_cap() return boolZhenzhong Duan2024-05-221-1/+1
| | | | | | | | | | | | | | | This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Include below functions: vfio_add_virt_caps() vfio_add_nv_gpudirect_cap() vfio_add_vmd_shadow_cap() Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return boolZhenzhong Duan2024-05-221-3/+3
| | | | | | | | | | This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Make vfio_populate_vga() return boolZhenzhong Duan2024-05-221-1/+1
| | | | | | | | | | This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/display: Make vfio_display_*() return boolZhenzhong Duan2024-05-221-1/+1
| | | | | | | | | | This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: migration: Skip config space check for Vendor Specific Information ↵Vinayak Kale2024-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in VSC during restore/load In case of migration, during restore operation, qemu checks config space of the pci device with the config space in the migration stream captured during save operation. In case of config space data mismatch, restore operation is failed. config space check is done in function get_pci_config_device(). By default VSC (vendor-specific-capability) in config space is checked. Due to qemu's config space check for VSC, live migration is broken across NVIDIA vGPU devices in situation where source and destination host driver is different. In this situation, Vendor Specific Information in VSC varies on the destination to ensure vGPU feature capabilities exposed to the guest driver are compatible with destination host. If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with volatile Vendor Specific Info in VSC then qemu should exempt config space check for Vendor Specific Info. It is vendor driver's responsibility to ensure that VSC is consistent across migration. Here consistency could mean that VSC format should be same on source and destination, however actual Vendor Specific Info may not be byte-to-byte identical. This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI device by clearing pdev->cmask[] offsets. Config space check is still enforced for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips config space check for that offset. VSC check is skipped for machine types >= 9.1. The check would be enforced on older machine types (<= 9.0). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Vinayak Kale <vkale@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Introduce a vfio pci hot reset interfaceZhenzhong Duan2023-12-191-0/+3
| | | | | | | | | | | | | | | | | | | | Legacy vfio pci and iommufd cdev have different process to hot reset vfio device, expand current code to abstract out pci_hot_reset callback for legacy vfio, this same interface will also be used by iommufd cdev vfio device. Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it into container.c. vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so they could be called in legacy and iommufd pci_hot_reset callback. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_infoZhenzhong Duan2023-12-191-0/+3
| | | | | | | | | | | | | This helper will be used by both legacy and iommufd backends. No functional changes intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio: add ramfb migration supportMarc-André Lureau2023-10-181-0/+3
| | | | | | | | | | | | | Add a "VFIODisplay" subsection whenever "x-ramfb-migrate" is turned on. Turn it off by default on machines <= 8.1 for compatibility reasons. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> [ clg: - checkpatch fixes - improved warn_report() in vfio_realize() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: detect the support of dynamic MSI-X allocationJing Liu2023-10-051-0/+1
| | | | | | | | | | | | | | | Kernel provides the guidance of dynamic MSI-X allocation support of passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to guide user space. Fetch the flags from host to determine if dynamic MSI-X allocation is supported. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* spapr: Remove support for NVIDIA V100 GPU with NVLink2Cédric Le Goater2023-09-181-2/+0
| | | | | | | | | | | | | | | | | | | NVLink2 support was removed from the PPC PowerNV platform and VFIO in Linux 5.13 with commits : 562d1e207d32 ("powerpc/powernv: remove the nvlink support") b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2") This was 2.5 years ago. Do the same in QEMU with a revert of commit ec132efaa81f ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some adjustements are required on the NUMA part. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230918091717.149950-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* vfio/pci: Enable AtomicOps completers on root portsAlex Williamson2023-07-101-0/+1
| | | | | | | | | | | | | | | | Dynamically enable Atomic Ops completer support around realize/exit of vfio-pci devices reporting host support for these accesses and adhering to a minimal configuration standard. While the Atomic Ops completer bits in the root port device capabilities2 register are read-only, the PCIe spec does allow RO bits to change to reflect hardware state. We take advantage of that here around the realize and exit functions of the vfio-pci device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Robin Voetter <robin@streamhpc.com> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/pci: add support for VF tokenMinwoo Im2023-05-091-0/+1
| | | | | | | | | | | | | | | | | | | VF token was introduced [1] to kernel vfio-pci along with SR-IOV support [2]. This patch adds support VF token among PF and VF(s). To passthu PCIe VF to a VM, kernel >= v5.7 needs this. It can be configured with UUID like: -device vfio-pci,host=DDDD:BB:DD:F,vf-token=<uuid>,... [1] https://lore.kernel.org/linux-pci/158396393244.5601.10297430724964025753.stgit@gimli.home/ [2] https://lore.kernel.org/linux-pci/158396044753.5601.14804870681174789709.stgit@gimli.home/ Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20230320073522epcms2p48f682ecdb73e0ae1a4850ad0712fd780@epcms2p4 Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* include/hw/pci: Split pci_device.h off pci.hMarkus Armbruster2023-01-081-1/+1
| | | | | | | | | | | | | | | | | | | PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vfio: defer to commit kvm irq routing when enable msi/msixLongpeng(Mike)2022-05-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In migration resume phase, all unmasked msix vectors need to be setup when loading the VF state. However, the setup operation would take longer if the VM has more VFs and each VF has more unmasked vectors. The hot spot is kvm_irqchip_commit_routes, it'll scan and update all irqfds that are already assigned each invocation, so more vectors means need more time to process them. vfio_pci_load_config vfio_msix_enable msix_set_vector_notifiers for (vector = 0; vector < dev->msix_entries_nr; vector++) { vfio_msix_vector_do_use vfio_add_kvm_msi_virq kvm_irqchip_commit_routes <-- expensive } We can reduce the cost by only committing once outside the loop. The routes are cached in kvm_state, we commit them first and then bind irqfd for each vector. The test VM has 128 vcpus and 8 VF (each one has 65 vectors), we measure the cost of the vfio_msix_enable for each VF, and we can see 90+% costs can be reduce. VF Count of irqfds[*] Original With this patch 1st 65 8 2 2nd 130 15 2 3rd 195 22 2 4th 260 24 3 5th 325 36 2 6th 390 44 3 7th 455 51 3 8th 520 58 4 Total 258ms 21ms [*] Count of irqfds How many irqfds that already assigned and need to process in this round. The optimization can be applied to msi type too. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-6-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* hw/vfio/pci-quirks: Replace the word 'blacklist'Philippe Mathieu-Daudé2021-03-161-1/+1
| | | | | | | | | | | | | | | Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [*] and replace the word "blacklist" appropriately. [*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-9-philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* vfio: Make vfio-pci device migration capableKirti Wankhede2020-11-011-1/+0
| | | | | | | | | | | | | | | If the device is not a failover primary device, call vfio_migration_probe() and vfio_migration_finalize() to enable migration support for those devices that support it respectively to tear it down again. Removed migration blocker from VFIO PCI device specific structure and use migration blocker from generic structure of VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* Use OBJECT_DECLARE_SIMPLE_TYPE when possibleEduardo Habkost2020-09-181-3/+1
| | | | | | | | | | | | | This converts existing DECLARE_INSTANCE_CHECKER usage to OBJECT_DECLARE_SIMPLE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareSimpleType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20200916182519.415636-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* vfio: Rename PCI_VFIO to VFIO_PCIEduardo Habkost2020-09-091-1/+1
| | | | | | | | | | Make the type checking macro name consistent with the TYPE_* constant. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20200902224311.1321159-56-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* Use DECLARE_*CHECKER* macrosEduardo Habkost2020-09-091-1/+2
| | | | | | | | | | | | | | | Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>