summary refs log tree commit diff stats
path: root/hw/pci/pcie.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* pcie: Add a way to get the outstanding page request allocation (pri) from ↵CLEMENT MATHIEU--DRIF2025-10-051-0/+8
| | | | | | | | | the config space. Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901111630.1018573-2-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Helper functions to check to check if PRI is enabledCLEMENT MATHIEU--DRIF2025-06-011-0/+9
| | | | | | | | | | pri_enabled can be used to check whether the capability is present and enabled on a PCIe device Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Message-Id: <20250520071823.764266-6-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Add a helper to declare the PRI capability for a pcie deviceCLEMENT MATHIEU--DRIF2025-06-011-0/+26
| | | | | | | Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Message-Id: <20250520071823.764266-5-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Helper function to check if ATS is enabledCLEMENT MATHIEU--DRIF2025-06-011-0/+9
| | | | | | | | | | | ats_enabled checks whether the capability is present or not. If so, we read the configuration space to get the status of the feature (enabled or not). Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Message-Id: <20250520071823.764266-4-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Helper functions to check if PASID is enabledCLEMENT MATHIEU--DRIF2025-06-011-0/+9
| | | | | | | | | | | pasid_enabled checks whether the capability is present or not. If so, we read the configuration space to get the status of the feature (enabled or not). Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Message-Id: <20250520071823.764266-3-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Add helper to declare PASID capability for a pcie deviceCLEMENT MATHIEU--DRIF2025-06-011-0/+25
| | | | | | | Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Message-Id: <20250520071823.764266-2-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: ensure valid link status bits for downstream portsSebastian Ott2025-01-151-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | PCI hotplug for downstream endpoints on arm fails because Linux' PCIe hotplug driver doesn't like the QEMU provided LNKSTA: pcieport 0000:08:01.0: pciehp: Slot(2): Card present pcieport 0000:08:01.0: pciehp: Slot(2): Link Up pcieport 0000:08:01.0: pciehp: Slot(2): Cannot train link: status 0x2000 There's 2 cases where LNKSTA isn't setup properly: * the downstream device has no express capability * max link width of the bridge is 0 Move the sanity checks added via 88c869198aa63 ("pci: Sanity test minimum downstream LNKSTA") outside of the branch to make sure downstream ports always have a valid LNKSTA. Signed-off-by: Sebastian Ott <sebott@redhat.com> Tested-by: Zhenyu Zhang <zhenyzha@redhat.com> Message-Id: <20241203121928.14861-1-sebott@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: enable Extended tag field supportMarcin Juszkiewicz2024-11-041-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | >From what I read PCI has 32 transactions, PCI Express devices can handle 256 with Extended tag enabled (spec mentions also larger values but I lack PCIe knowledge). QEMU leaves 'Extended tag field' with 0 as value: Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, IntMsgNum 0 DevCap: MaxPayload 128 bytes, PhantFunc 0 ExtTag- RBE+ FLReset- TEE-IO- SBSA ACS has test 824 which checks for PCIe device capabilities. BSA specification [1] (SBSA is on top of BSA) in section F.3.2 lists expected values for Device Capabilities Register: Device Capabilities Register Requirement Role based error reporting RCEC and RCiEP: Hardwired to 1 Endpoint L0s acceptable latency RCEC and RCiEP: Hardwired to 0 L1 acceptable latency RCEC and RCiEP: Hardwired to 0 Captured slot power limit scale RCEC and RCiEP: Hardwired to 0 Captured slot power limit value RCEC and RCiEP: Hardwired to 0 Max payload size value must be compliant with PCIe spec Phantom functions RCEC and RCiEP: Recommendation is to hardwire this bit to 0. Extended tag field Hardwired to 1 1. https://developer.arm.com/documentation/den0094/c/ This change enables Extended tag field. All versioned platforms should have it disabled for older versions (tested with Arm/virt). Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> Message-Id: <20241023113820.486017-1-marcin.juszkiewicz@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/pcie: Provide a utility function for control of EP / SW USP linkJonathan Cameron2024-11-041-0/+18
| | | | | | | | | | | | | Whilst similar to existing PCIESlot link configuration a few registers need to be set differently so that the downstream device presents a 'configured' state that is then used to 'train' the upstream port on the link. Basically that means setting the status register to reflect it succeeding in training up to target settings. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240916173518.1843023-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/pcie: Factor out PCI Express link register filling common to EP.Jonathan Cameron2024-11-041-39/+48
| | | | | | | | | | | Whilst not all link related registers are common between RP / Switch DSP and EP / Switch USP many of them are. Factor that group out to save on duplication when adding EP / Swtich USP configurability. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240916173518.1843023-4-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Peter Maydell2024-03-131-0/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging virtio,pc,pci: features, cleanups, fixes more memslots support in libvhost-user support PCIe Gen5/Gen6 link speeds in pcie more traces in vdpa network simulation devices support in vdpa SMBIOS type 9 descriptor implementation Bump max_cpus to 4096 vcpus in q35 aw-bits and granule options in VIRTIO-IOMMU Support report NUMA nodes for device memory using GI in acpi Beginning of shutdown event support in pvpanic fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/ # 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt # +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ # uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr # 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G # 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg= # =C0UU # -----END PGP SIGNATURE----- # gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits) docs/specs/pvpanic: document shutdown event hw/cxl: Fix missing reserved data in CXL Device DVSEC hmat acpi: Fix out of bounds access due to missing use of indirection hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory. qemu-options.hx: Document the virtio-iommu-pci aw-bits option hw/arm/virt: Set virtio-iommu aw-bits default value to 48 hw/i386/q35: Set virtio-iommu aw-bits default value to 39 virtio-iommu: Add an option to define the input range width virtio-iommu: Trace domain range limits as unsigned int qemu-options.hx: Document the virtio-iommu-pci granule option virtio-iommu: Change the default granule to the host page size virtio-iommu: Add a granule property hw/i386/acpi-build: Add support for SRAT Generic Initiator structures hw/acpi: Implement the SRAT GI affinity structure qom: new object to associate device to NUMA node hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it hw/i386/pc: Set "normal" boot device order in pc_basic_device_init() hw/i386/pc: Avoid one use of the current_machine global hw/i386/pc: Remove "rtc_state" link again Revert "hw/i386/pc: Confine system flash handling to pc_sysfw" ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/core/machine.c
| * pcie: Support PCIe Gen5/Gen6 link speedsLukas Stockner2024-03-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the PCIe link speed option so that slots can be configured as supporting 32GT/s (Gen5) or 64GT/s (Gen5) speeds. This is as simple as setting the appropriate bit in LnkCap2 and the appropriate value in LnkCap and LnkCtl2. Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com> Message-Id: <20240215012326.3272366-1-lstockner@genesiscloud.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | hw/pci: add some convenient trace-events for pcie and shpc hotplugVladimir Sementsov-Ogievskiy2024-03-111-0/+56
|/ | | | | | | | | Add trace-events that may help to debug problems with hotplugging. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240301154146.761531-2-vsementsov@yandex-team.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* pcie: Specify 0 for ARI next function numbersAkihiko Odaki2023-07-101-1/+1
| | | | | | | | | | | | | | | | | | | | | The current implementers of ARI are all SR-IOV devices. The ARI next function number field is undefined for VF according to PCI Express Base Specification Revision 5.0 Version 1.0 section 9.3.7.7. The PF still requires some defined value so end the linked list formed with the field by specifying 0 as required for any ARI implementation according to section 7.8.7.2. For migration, the field will keep having 1 as its value on the old QEMU machine versions. Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt") Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV") Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-3-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Use common ARI next function numberAkihiko Odaki2023-07-101-1/+3
| | | | | | | | | | | Currently the only implementers of ARI is SR-IOV devices, and they behave similar. Share the ARI next function number. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-2-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Add hotplug detect state register to cmaskLeonardo Bras2023-07-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When trying to migrate a machine type pc-q35-6.0 or lower, with this cmdline options, -device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \ -device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1 the following bug happens after all ram pages were sent: qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19 qemu-kvm: Failed to load PCIDevice:config qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port' qemu-kvm: load of migration failed: Invalid argument This happens on pc-q35-6.0 or lower because of: { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" } In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(), which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI hotplug for the guest. After a while the guest will deal with this hotplug and qemu will clear the above bit. Then, during migration, get_pci_config_device() will compare the configs of both the freshly created device and the one that is being received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit and cause the bug to reproduce. To avoid this fake incompatibility, there are tree fields in PCIDevice that can help: - wmask: Used to implement R/W bytes, and - w1cmask: Used to implement RW1C(Write 1 to Clear) bytes - cmask: Used to enable config checks on load. According to PCI Express® Base Specification Revision 5.0 Version 1.0, table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is listed as RO (read-only), so it only makes sense to make use of the cmask field. So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on get_pci_config_device() does not abort the migration. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819 Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20230706045546.593605-3-leobras@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
* pcie: Add a PCIe capability version helperAlex Williamson2023-07-101-0/+7
| | | | | | | | | | Report the PCIe capability version for a device Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* pcie: set power indicator to off on reset by defaultVladimir Sementsov-Ogievskiy2023-03-021-0/+1
| | | | | | | | | | It should not be zero, the only valid values are ON, OFF and BLINK. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-13-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: introduce pcie_sltctl_powered_off() helperVladimir Sementsov-Ogievskiy2023-03-021-6/+10
| | | | | | | | | | | | | | | | | In pcie_cap_slot_write_config() we check for PCI_EXP_SLTCTL_PWR_OFF in a bad form. We should distinguish PCI_EXP_SLTCTL_PWR which is a "mask" and PCI_EXP_SLTCTL_PWR_OFF which is value for that mask. Better code is in pcie_cap_slot_unplug_request_cb() and in pcie_cap_update_power(). Let's use same pattern everywhere. To simplify things add also a helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-12-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: pcie_cap_slot_enable_power() use correct helperVladimir Sementsov-Ogievskiy2023-03-021-2/+2
| | | | | | | | | | | | | | | | | | | *_by_mask() helpers shouldn't be used here (and that's the only one). *_by_mask() helpers do shift their value argument, but in pcie.c code we use values that are already shifted appropriately. Happily, PCI_EXP_SLTCTL_PWR_ON is zero, so shift doesn't matter. But if we apply same helper for PCI_EXP_SLTCTL_PWR_OFF constant it will do wrong thing. So, let's use instead pci_word_test_and_clear_mask() which is already used in the file to clear PCI_EXP_SLTCTL_PWR_OFF bit in pcie_cap_slot_init() and pcie_cap_slot_reset(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-11-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie_regs: drop duplicated indicator value macrosVladimir Sementsov-Ogievskiy2023-03-021-6/+7
| | | | | | | | | | | | | | | We already have indicator values in include/standard-headers/linux/pci_regs.h , no reason to reinvent them in include/hw/pci/pcie_regs.h. (and we already have usage of PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in hw/pci/pcie.c, so let's be consistent) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-9-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: pcie_cap_slot_write_config(): use correct macroVladimir Sementsov-Ogievskiy2023-03-021-2/+2
| | | | | | | | | | | | | | PCI_EXP_SLTCTL_PIC_OFF is a value, and PCI_EXP_SLTCTL_PIC is a mask. Happily PCI_EXP_SLTCTL_PIC_OFF is a maximum value for this mask and is equal to the mask itself. Still the code looks like a bug. Let's make it more reader-friendly. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-8-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: acpi hotplug: rename x-native-hotplug to x-do-not-expose-native-hotplug-capIgor Mammedov2023-01-281-3/+3
| | | | | | | | | | | | | | | | | | | | When ACPI PCI hotplug for Q35 was introduced (6.1), it was implemented by hiding HPC capability on PCIE slot. That however led to a number of regressions and to fix it, it was decided to keep HPC cap exposed in ACPI PCI hotplug case and force guest in ACPI PCI hotplug mode by other means [1]. That reduced meaning of x-native-hotplug to a compat knob [2] for broken 6.1 machine type. Rename property to match its current purpose. 1) 211afe5c69 (hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC) 2) c318bef762 (hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type) Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230112140312.3096331-10-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Don't try triggering a LSI when not definedFrederic Barrat2022-04-201-2/+3
| | | | | | | | | | | | | | | | | This patch skips [de]asserting a LSI interrupt if the device doesn't have any LSI defined. Doing so would trigger an assert in pci_irq_handler(). The PCIE root port implementation in qemu requests a LSI (INTA), but a subclass may want to change that behavior since it's a valid configuration. For example on the POWER8/POWER9/POWER10 systems, the root bridge doesn't request any LSI. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220408131303.147840-2-fbarrat@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* acpi: pcihp: pcie: set power on cap on parent slotIgor Mammedov2022-03-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on creation a PCIDevice has power turned on at the end of pci_qdev_realize() however later on if PCIe slot isn't populated with any children it's power is turned off. It's fine if native hotplug is used as plug callback will power slot on among other things. However when ACPI hotplug is enabled it replaces native PCIe plug callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and as result slot stays powered off. It works fine as ACPI hotplug on guest side takes care of enumerating/initializing hotplugged device. But when later guest is migrated, call chain introduced by] commit d5daff7d312 (pcie: implement slot power control for pcie root ports) pcie_cap_slot_post_load() -> pcie_cap_update_power() -> pcie_set_power_device() -> pci_set_power() -> pci_update_mappings() will disable earlier initialized BARs for the hotplugged device in powered off slot due to commit 23786d13441 (pci: implement power state) which disables BARs if power is off. Fix it by setting PCI_EXP_SLTCTL_PCC to PCI_EXP_SLTCTL_PWR_ON on slot (root port/downstream port) at the time a device hotplugged into it. As result PCI_EXP_SLTCTL_PWR_ON is migrated to target and above call chain keeps device plugged into it powered on. Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") Fixes: 23786d13441 ("pci: implement power state") Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Suggested-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220301151200.3507298-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Add support for Single Root I/O Virtualization (SR/IOV)Knut Omang2022-03-061-0/+5
| | | | | | | | | | | This patch provides the building blocks for creating an SR/IOV PCIe Extended Capability header and register/unregister SR/IOV Virtual Functions. Signed-off-by: Knut Omang <knuto@ifi.uio.no> Message-Id: <20220217174504.1051716-2-lukasz.maniak@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Fix bad overflow check in hw/pci/pcie.cDaniella Lee2021-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Orginal qemu commit hash:14d02cfbe4adaeebe7cb833a8cc71191352cf03b In function pcie_add_capability, an assert contains the "offset < offset + size" expression. Both variable offset and variable size are uint16_t, the comparison is always true due to type promotion. The next expression may be the same. It might be like this: Thread 1 "qemu-system-x86" hit Breakpoint 1, pcie_add_capability ( dev=0x555557ce5f10, cap_id=1, cap_ver=2 '\002', offset=256, size=72) at ../hw/pci/pcie.c:930 930 { (gdb) n 931 assert(offset >= PCI_CONFIG_SPACE_SIZE); (gdb) n 932 assert(offset < offset + size); (gdb) p offset $1 = 256 (gdb) p offset < offset + size $2 = 1 (gdb) set offset=65533 (gdb) p offset < offset + size $3 = 1 (gdb) p offset < (uint16_t)(offset + size) $4 = 0 Signed-off-by: Daniella Lee <daniellalee111@gmail.com> Message-Id: <20211126061324.47331-1-daniellalee111@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: expire pending deleteGerd Hoffmann2021-11-151-0/+2
| | | | | | | | | | | | | | Add an expire time for pending delete, once the time is over allow pressing the attention button again. This makes pcie hotplug behave more like acpi hotplug, where one can try sending an 'device_del' monitor command again in case the guest didn't respond to the first attempt. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-7-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: fast unplug when slot power is offGerd Hoffmann2021-11-151-0/+10
| | | | | | | | | | | | | In case the slot is powered off (and the power indicator turned off too) we can unplug right away, without round-trip to the guest. Also clear pending attention button press, there is nothing to care about any more. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-6-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: factor out pcie_cap_slot_unplug()Gerd Hoffmann2021-11-151-12/+20
| | | | | | | | | No functional change. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-5-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: add power indicator blink checkGerd Hoffmann2021-11-151-0/+7
| | | | | | | | | | Refuse to push the attention button in case the guest is busy with some hotplug operation (as indicated by the power indicator blinking). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-4-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: implement slot power control for pcie root portsGerd Hoffmann2021-11-151-0/+28
| | | | | | | | | | | | | | With this patch hot-plugged pci devices will only be visible to the guest if the guests hotplug driver has enabled slot power. This should fix the hot-plug race which one can hit when hot-plugging a pci device at boot, while the guest is in the middle of the pci bus scan. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-3-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: Export pci_for_each_device_under_bus*()Peter Xu2021-11-011-3/+1
| | | | | | | | | | | | | | They're actually more commonly used than the helper without _under_bus, because most callers do have the pci bus on hand. After exporting we can switch a lot of the call sites to use these two helpers. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20211028043129.38871-3-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
* hw/pci/pcie: Do not set HPC flag if acpihp is usedJulia Suvorova2021-07-161-1/+7
| | | | | | | | | | | | | | | | | | Instead of changing the hot-plug type in _OSC register, do not set the 'Hot-Plug Capable' flag. This way guest will choose ACPI hot-plug if it is preferred and leave the option to use SHPC with pcie-pci-bridge. The ability to control hot-plug for each downstream port is retained, while 'hotplug=off' on the port means all hot-plug types are disabled. Signed-off-by: Julia Suvorova <jusual@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210713004205.775386-4-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-pci: compat page aligned ATSJason Wang2021-04-061-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | Commit 4c70875372b8 ("pci: advertise a page aligned ATS") advertises the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU drivers since 5.2. But it forgot the compat the capability which breaks the migration from old machine type: (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0 This patch introduces a new parameter "x-ats-page-aligned" for virtio-pci device and turns it on for machine type which is newer than 5.1. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: qemu-stable@nongnu.org Fixes: 4c70875372b8 ("pci: advertise a page aligned ATS") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20210406040330.11306-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: don't set link state active if the slot is emptyLaurent Vivier2021-02-231-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the pcie slot is initialized, by default PCI_EXP_LNKSTA_DLLLA (Data Link Layer Link Active) is set in PCI_EXP_LNKSTA (Link Status) without checking if the slot is empty or not. This is confusing for the kernel because as it sees the link is up it tries to read the vendor ID and fails: (From https://bugzilla.kernel.org/show_bug.cgi?id=211691) [ 1.661105] pcieport 0000:00:02.2: pciehp: Slot Capabilities : 0x0002007b [ 1.661115] pcieport 0000:00:02.2: pciehp: Slot Status : 0x0010 [ 1.661123] pcieport 0000:00:02.2: pciehp: Slot Control : 0x07c0 [ 1.661138] pcieport 0000:00:02.2: pciehp: Slot #0 AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+ Interlock+ NoCompl- IbPresDis- LLActRep+ [ 1.662581] pcieport 0000:00:02.2: pciehp: pciehp_get_power_status: SLOTCTRL 6c value read 7c0 [ 1.662597] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662703] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 1.662706] pcieport 0000:00:02.2: pciehp: pcie_enable_notification: SLOTCTRL 6c write cmd 1031 [ 1.662730] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662748] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662750] pcieport 0000:00:02.2: pciehp: Slot(0-2): Link Up [ 2.896132] pcieport 0000:00:02.2: pciehp: pciehp_check_link_status: lnk_status = 2204 [ 2.896135] pcieport 0000:00:02.2: pciehp: Slot(0-2): No device found [ 2.896900] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 2.896903] pcieport 0000:00:02.2: pciehp: pciehp_power_off_slot: SLOTCTRL 6c write cmd 400 [ 3.656901] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0009 from Slot Status This is really a problem with virtio-net failover that hotplugs a VFIO card during the boot process. The kernel can shutdown the slot while QEMU is hotplugging it, and this likely ends by an automatic unplug of the card. At the end of the boot sequence the card has disappeared. To fix that, don't set the "Link Active" state in the init function, but rely on the plug function to do it, as the mechanism has already been introduced by 2f2b18f60bf1. Fixes: 2f2b18f60bf1 ("pcie: set link state inactive/active after hot unplug/plug") Cc: zhengxiang9@huawei.com Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths") Cc: alex.williamson@redhat.com Fixes: b2101eae63ea ("pcie: Set the "link active" in the link status register") Cc: benh@kernel.crashing.org Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20210212135250.2738750-5-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: advertise a page aligned ATSJason Wang2020-10-301-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | After Linux kernel commit 61363c1474b1 ("iommu/vt-d: Enable ATS only if the device uses page aligned address."), ATS will be only enabled if device advertises a page aligned request. Unfortunately, vhost-net is the only user and we don't advertise the aligned request capability in the past since both vhost IOTLB and address_space_get_iotlb_entry() can support non page aligned request. Though it's not clear that if the above kernel commit makes sense. Let's advertise a page aligned ATS here to make vhost device IOTLB work with Intel IOMMU again. Note that in the future we may extend pcie_ats_init() to accept parameters like queue depth and page alignment. Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20200909081731.24688-1-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qdev: Drop qbus_set_hotplug_handler() parameter @errpMarkus Armbruster2020-07-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qbus_set_hotplug_handler() is a simple wrapper around object_property_set_link(). object_property_set_link() fails when the property doesn't exist, is not settable, or its .check() method fails. These are all programming errors here, so passing &error_abort to qbus_set_hotplug_handler() is appropriate. Most of its callers do. Exceptions: * pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL, i.e. they ignore errors. * spapr_machine_init() passes &error_fatal. * s390_pcihost_realize(), virtio_serial_device_realize(), s390_pcihost_plug() pass the error to their callers. The latter two keep going after the error, which looks wrong. Drop the @errp parameter, and instead pass &error_abort to object_property_set_link(). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
* qdev: Convert to qdev_unrealize() with CoccinelleMarkus Armbruster2020-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | For readability, and consistency with qbus_realize(). Coccinelle script: @ depends on !(file in "hw/core/qdev.c")@ typedef DeviceState; DeviceState *dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(dev); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(DEVICE(dev)); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-8-armbru@redhat.com>
* hw/pci/pcie: Move hot plug capability check to pre_plug callbackJulia Suvorova2020-06-091-8/+11
| | | | | | | | | | | | | | | | | | | | Check for hot plug capability earlier to avoid removing devices attached during the initialization process. Run qemu with an unattached drive: -drive file=$FILE,if=none,id=drive0 \ -device pcie-root-port,id=rp0,slot=3,bus=pcie.0,hotplug=off Hotplug a block device: device_add virtio-blk-pci,id=blk0,drive=drive0,bus=rp0 If hotplug fails on plug_cb, drive0 will be deleted. Fixes: 0501e1aa1d32a6 ("hw/pci/pcie: Forbid hot-plug if it's disabled on the slot") Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200604125947.881210-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qdev: Unrealize must not failMarkus Armbruster2020-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
* hw/pci/pcie: Replace PCI_DEVICE() casts with existing variableJulia Suvorova2020-05-041-3/+3
| | | | | | | | | | A little cleanup is possible because of hotplug_pdev introduction. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-3-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
* hw/pci/pcie: Forbid hot-plug if it's disabled on the slotJulia Suvorova2020-05-041-0/+19
| | | | | | | | | | | | | Raise an error when trying to hot-plug/unplug a device through QMP to a device with disabled hot-plug capability. This makes the device behaviour more consistent and provides an explanation of the failure in the case of asynchronous unplug. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-2-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
* pcie_root_port: Add hotplug disabling optionJulia Suvorova2020-03-081-4/+7
| | | | | | | | | | | | | | | | | | | | | | Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt manage it and restrict unplug for the whole machine. This is going to prevent user-initiated unplug in guests (Windows mostly). Hotplug is enabled by default. Usage: -device pcie-root-port,hotplug=off,... If you want to disable hot-unplug on some downstream ports of one switch, disable hot-unplug on PCIe Root Port connected to the upstream port as well as on the selected downstream ports. Discussion related: https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg00530.html Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200226174607.205941-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
* pci: mark device having guest unplug request pendingJens Freimann2019-10-291-0/+3
| | | | | | | | | | Set pending_deleted_event in DeviceState for failover primary devices that were successfully unplugged by the Guest OS. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-5-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: mark devices partially unpluggedJens Freimann2019-10-291-0/+3
| | | | | | | | | | | Only the guest unplug request was triggered. This is needed for the failover feature. In case of a failed migration we need to plug the device back to the guest. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-4-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: minor cleanups for slot control/statusMichael S. Tsirkin2019-07-011-6/+11
| | | | | | | | | | Rename function arguments to make intent clearer. Better documentation for slot control logic. Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* pcie: work around for racy guest initMichael S. Tsirkin2019-07-011-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During boot, linux guests tend to clear all bits in pcie slot status register which is used for hotplug. If they clear bits that weren't set this is racy and will lose events: not a big problem for manual hotplug on bare-metal, but a problem for us. For example, the following is broken ATM: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 \ -monitor stdio disk.qcow2 (qemu)device_del balloon (qemu)cont Balloon isn't deleted as it should. As a work-around, detect this attempt to clear slot status and revert status to what it was before the write. Note: in theory this can be detected as a duplicate button press which cancels the previous press. Does not seem to happen in practice as guests seem to only have this bug during init. Note2: the right thing to do is probably to fix Linux to read status before clearing it, and act on the bits that are set. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>
* pcie: check that slt ctrl changed before deletingMichael S. Tsirkin2019-07-011-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | During boot, linux would sometimes overwrites control of a powered off slot before powering it on. Unfortunately QEMU interprets that as a power off request and ejects the device. For example: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -monitor stdio disk.qcow2 (qemu)device_add virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 (qemu)cont Balloon is deleted during guest boot. To fix, save control beforehand and check that power or led state actually change before ejecting. Note: this is more a hack than a solution, ideally we'd find a better way to detect ejects, or move away from ejects completely and instead monitor whether it's safe to delete device due to e.g. its power state. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>
* pcie: don't skip multi-mask eventsMichael S. Tsirkin2019-07-011-1/+1
| | | | | | | | | | | | If we are trying to set multiple bits at once, testing that just one of them is already set gives a false positive. As a result we won't interrupt guest if e.g. presence detection change and attention button press are both set. This happens with multi-function device removal. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>