summary refs log tree commit diff stats
path: root/hw (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
| * tpm-passthrough: don't save guessed cancel_path in optionsMarc-André Lureau2017-12-141-3/+1
| | | | | | | | | | | | | | | | | | | | The value is later unneeded, and may leak if the free visitor doesn't consider it since has_cancel_path is false. And for consistency with "path" it shouldn't be returned in get_tpm_options(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-be: ask model to the TPM interfaceMarc-André Lureau2017-12-141-2/+1
| | | | | | | | | | | | | | | | | | No need to store the mode in the backend, or to let the frontend set it itself. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-be: report error instead of front-endMarc-André Lureau2017-12-141-3/+1
| | | | | | | | | | | | | | | | | | Backend can give more accurate error description, and lift out the job from the frontend. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-be: call request_completed() out of threadMarc-André Lureau2017-12-143-27/+12
| | | | | | | | | | | | | | | | | | | | | | Lift from the backend implementation the responsability to call the request_completed() callback outside of thread context. This also simplify frontend/interface work, as they no longer need to care whether the callback is called from a different thread. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-tis: no longer expose TPMStateMarc-André Lureau2017-12-141-2/+2
| | | | | | | | | | | | | | | | Now that there is an interface instead. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-backend: store TPMIf interface, improve backend_init()Marc-André Lureau2017-12-143-5/+5
| | | | | | | | | | | | | | | | | | | | Store the TPM interface, the actual object may be different from TPMState. Keep a reference on the interface, and check the backend wasn't already initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm: move TpmIf in include/sysemu/tpm.hMarc-André Lureau2017-12-141-21/+1
| | | | | | | | | | | | | | | | | | This is a better location than hw/tpm, since we are going to use the interface from outside hw/tpm. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
| * tpm-tis: remove unused locty_numberMarc-André Lureau2017-12-141-1/+0
| | | | | | | | | | | | | | | | This field slipped in commit 5086bf9784. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
* | Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20171215-v2' into ↵Peter Maydell2017-12-159-184/+288
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging s390x changes for 2.12: - Lots of tcg improvements: ccw hotplug is now working and we can run a Linux kernel built for z12 under tcg - zPCI improvements to get virtio-pci working - get rid of the cssid restrictions for virtual and non-virtual channel devices - we now support 8TB+ systems - 2.12 compat machine - fixes and cleanups # gpg: Signature made Fri 15 Dec 2017 10:57:01 GMT # gpg: using RSA key 0xDECF6B93C6F02FAF # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" # gpg: aka "Cornelia Huck <cohuck@kernel.org>" # gpg: aka "Cornelia Huck <cohuck@redhat.com>" # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF * remotes/cohuck/tags/s390x-20171215-v2: (46 commits) s390-ccw-virtio: allow for systems larger that 7.999TB s390x: change the QEMU cpu model to a stripped down z12 s390x/tcg: we already implement the Set-Program-Parameter facility s390x/tcg: implement extract-CPU-time facility s390x/tcg: Implement SIGNAL ADAPTER instruction s390x/tcg: Implement STORE CHANNEL PATH STATUS s390x/tcg: wire up SET CHANNEL MONITOR s390x/tcg: wire up SET ADDRESS LIMIT s390x/tcg: implement Interlocked-Access Facility 2 s390x/tcg: ASI/ASGI/ALSI/ALSGI are atomic with Interlocked-acccess facility 1 s390x/tcg: wire up STORE CHANNEL REPORT WORD s390x/tcg: indicate value of TODPR in STCKE s390x/tcg: implement SET CLOCK PROGRAMMABLE FIELD s390x/tcg: fix and cleanup mcck injection s390x/kvm: factor out build_channel_report_mcic() into cpu.h s390x/css: attach css bridge s390x: deprecate s390-squash-mcss machine prop s390x/css: unrestrict cssids s390x/pci: search for subregion inside the BARs s390x/pci: move the memory region write from pcistg ... # Conflicts: # include/hw/compat.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | s390-ccw-virtio: allow for systems larger that 7.999TBChristian Borntraeger2017-12-151-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM does not allow memory regions > KVM_MEM_MAX_NR_PAGES, basically limiting the memory per slot to 8TB-4k. As memory slots on s390/kvm must be a multiple of 1MB we need start a new memory region if we cross 8TB-1M. With that (and optimistic overcommitment in the kernel) I was able to start a 24TB guest on a 1TB system. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20171211122146.162430-1-borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> [CH: 1UL -> 1ULL in KVM_MEM_MAX_NR_PAGES; build fix on 32 bit hosts] Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x: change the QEMU cpu model to a stripped down z12David Hildenbrand2017-12-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are good enough to boot upstream Linux kernels / Fedora 26/27. That should be sufficient for now. As the QEMU CPU model is migration safe, let's add compatibility code. Generate the feature list to reduce the chance of messing things up in the future. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171208165529.14124-1-david@redhat.com> [CH: squashed 's390x/cpumodel: make qemu cpu model play with "none" machine' (20171213132407.5227-1-david@redhat.com) and 's390x/tcg: don't include z13 features in the qemu model' (20171213171512.17601-1-david@redhat.com) into patch] Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/css: attach css bridgeCornelia Huck2017-12-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Logically, the css bridge should be attached to the machine. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Tested-by: Bjoern Walk <bwalk@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x: deprecate s390-squash-mcss machine propHalil Pasic2017-12-141-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the cssids unrestricted (commit "s390x/css: unrestrict cssids") the s390-squash-mcss machine property should not be used. Actually Libvirt never supported this, so the expectation is that removing it should be pretty painless. But let's play nice and deprecate it first. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Message-Id: <20171206144438.28908-3-pasic@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/css: unrestrict cssidsHalil Pasic2017-12-146-28/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default css 0xfe is currently restricted to virtual subchannel devices. The hope when the decision was made was, that non-virtual subchannel devices will come around when guest can exploit multiple channel subsystems. Since the guests generally don't do, the pain of the partitioned (cssid) namespace outweighs the gain. Let us remove the corresponding restrictions (virtual devices can be put only in 0xfe and non-virtual devices in any css except the 0xfe -- while s390-squash-mcss then remaps everything to cssid 0). At the same time, change our schema for generating css bus ids to put both virtual and non-virtual devices into the default css (spilling over into other css images, if needed). The intention is to deprecate s390-squash-mcss. With this change devices without a specified devno won't end up hidden to guests not supporting multiple channel subsystems, unless this can not be avoided (default css full). Let us also advertise the changes to the management software (so it can tell are cssids unrestricted or restricted). The adverse effect of getting rid of the restriction on migration should not be too severe. Vfio-ccw devices are not live-migratable yet, and for virtual devices using the extra freedom would only make sense with the aforementioned guest support in place. The auto-generated bus ids are affected by both changes. We hope to not encounter any auto-generated bus ids in production as Libvirt is always explicit about the bus id. Since 8ed179c937 ("s390x/css: catch section mismatch on load", 2017-05-18) the worst that can happen because the same device ended up having a different bus id is a cleanly failed migration. I find it hard to reason about the impact of changed auto-generated bus ids on migration for command line users as I don't know which rules is such an user supposed to follow. Another pain-point is down- or upgrade of QEMU for command line users. The old way and the new way of doing vfio-ccw are mutually incompatible. Libvirt is only going to support the new way, so for libvirt users, the possible problems at QEMU downgrade are the following. If a domain contains virtual devices placed into a css different than 0xfe the domain will refuse to start with a QEMU not having this patch. Putting devices into a css different that 0xfe however won't make much sense in the near future (guest support). Libvirt will refuse to do vfio-ccw with a QEMU not having this patch. This is business as usual. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20171206144438.28908-2-pasic@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: search for subregion inside the BARsPierre Morel2017-12-141-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When dispatching memory access to PCI BAR region, we must look for possible subregions, used by the PCI device to map different memory areas inside the same PCI BAR. Since the data offset we received is calculated starting at the region start address we need to adjust the offset for the subregion. The data offset inside the subregion is calculated by substracting the subregion's starting address from the data offset in the region. The access to the MSIX region is now handled in a generic way, we do not need the specific trap_msix() function anymore. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Message-Id: <1512046530-17773-8-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: move the memory region write from pcistgPierre Morel2017-12-141-10/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's move the memory region write from pcistg into a dedicated function. This allows us to prepare a later patch searching for subregions inside of the memory region. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1512046530-17773-7-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: move the memory region read from pcilgPierre Morel2017-12-141-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's move the memory region read from pcilg into a dedicated function. This allows us to prepare a later patch. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1512046530-17773-6-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: rework PCI STORE BLOCKPierre Morel2017-12-143-25/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enhance the fault detection. Fixup the precedence to check the destination path existance before checking for the source accessibility. Add the maxstbl entry to both the Query PCI Function Group response and the PCIBusDevice structure. Initialize the maxstbl to 128 per default until we get the actual data from the hardware. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Message-Id: <1512046530-17773-5-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: rework PCI LOADPierre Morel2017-12-141-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Enhance the fault detection, correction of the fault reporting. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Message-Id: <1512046530-17773-4-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: rework PCI STOREPierre Morel2017-12-142-16/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | Enhance the fault detection, correction of the fault reporting. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Message-Id: <1512046530-17773-3-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: factor out endianess conversionPierre Morel2017-12-141-26/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two places where the same endianness conversion is done. Let's factor this out into a static function. Note that the conversion must always be done for data in a register: The S390 BE guest converted date to le before issuing the instruction. After interception in a BE host: ZPCI VFIO using pwrite must make the conversion back for the BE kernel. Kernel will do BE to le translation when loading the register for the real instruction. After interception in a le host: TCG stores a BE register in le, swapping bytes. But since the data in the register was already le it is now BE ZPCI VFIO must convert it to le before writing to the PCI memory. In both cases ZPCI VFIO must swap the bytes from the register. Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Message-Id: <1512046530-17773-2-git-send-email-pmorel@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x: handle exceptions during s390_cpu_virt_mem_rw() correctly (TCG)David Hildenbrand2017-12-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | s390_cpu_virt_mem_rw() must always return, so callers can react on an exception (e.g. see ioinst_handle_stcrw()). However, for TCG we always have to exit the cpu loop (and restore the cpu state before that) if we injected a program interrupt. So let's introduce and use s390_cpu_virt_mem_handle_exc() in code that is not purely KVM. Directly pass the retaddr we already have available in these functions. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171130162744.25442-8-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/pci: pass the retaddr to all PCI instructionsDavid Hildenbrand2017-12-142-47/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we wire up TCG, we will need the retaddr to correctly inject program interrupts. As we want to get rid of the function program_interrupt(), convert PCI code too. For KVM, we can simply use RA_IGNORED. Convert program_interrupt() to s390_program_interrupt() directly, making use of the passed address. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171130162744.25442-6-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x/tcg: rip out dead tpi codeDavid Hildenbrand2017-12-141-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is broken and not even wired up. We'll add a new handler soon, but that will live somewhere else. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171130162744.25442-4-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | s390x: introduce 2.12 compat machineCornelia Huck2017-12-141-1/+16
| |/ | | | | | | | | | | Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
* | Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.12-20171215' ↵Peter Maydell2017-12-1517-279/+532
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue 2017-12-15 First pull request for qemu-2.12. This has quite a bit of stuff accumulated while 2.11 was finalizing. Highlights are: * Some preliminary work towards implementing the "XIVE" POWER9 interrupt controller * Some fixes for problems during reboot with MTTCG * A substantial TCG performance improvement via tcg_get_lookup_and_goto_ptr * Numerous assorted cleanups and bugfixes that weren't urgent enough for 2.11 # gpg: Signature made Fri 15 Dec 2017 03:14:12 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.12-20171215: (24 commits) spapr: don't initialize PATB entry if max-cpu-compat < power9 spapr: Assume msi_nonbroken spapr: Rename machine init functions for clarity target/ppc: introduce the PPC_BIT() macro spapr_events: drop bogus cell from "interrupt-ranges" property spapr: fix LSI interrupt specifiers in the device tree spapr: replace numa_get_node() with lookup in pc-dimm list spapr: introduce a spapr_qirq() helper spapr: introduce a spapr_irq_set_lsi() helper spapr: move the IRQ allocation routines under the machine ppc/xics: assign of the CPU 'intc' pointer under the core ppc/xics: introduce an icp_create() helper spapr/rtas: do not reset the MSR in stop-self command spapr/rtas: fix reboot of a a SMP TCG guest spapr/rtas: disable the decrementer interrupt when a CPU is unplugged e500: fix pci host bridge class/type openpic: debug w/ info_report() pcc: define the Power-saving mode Exit Cause Enable bits in PowerPCCPUClass nvram: add AT24Cx i2c eeprom e500: name openpic and pci host bridge ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | spapr: don't initialize PATB entry if max-cpu-compat < power9Laurent Vivier2017-12-151-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if KVM is enabled and KVM capabilities MMU radix is available, the partition table entry (patb_entry) for the radix mode is initialized by default in ppc_spapr_reset(). It's a problem if we want to migrate the guest to a POWER8 host while the kernel is not started to set the value to the one expected for a POWER8 CPU. The "-machine max-cpu-compat=power8" should allow to migrate a POWER9 KVM host to a POWER8 KVM host, but because patb_entry is set, the destination QEMU tries to enable radix mode on the POWER8 host. This fails and cancels the migration: Process table config unsupported by the host error while loading state for instance 0x0 of device 'spapr' load of migration failed: Invalid argument This patch doesn't set the PATB entry if the user provides a CPU compatibility mode that doesn't support radix mode. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: Assume msi_nonbrokenDavid Gibson2017-12-151-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We conditionally adjust part of the guest device tree based on the global msi_nonbroken flag. However, the main machine type code initializes msi_nonbroken to true and there's nothing that would set it to false again. So replace the test with an assert(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| * | spapr: Rename machine init functions for clarityDavid Gibson2017-12-151-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Machine objects have two init functions - the generic QOM level instance_init which should only do static object initialization, and the Machine specific MachineClass::init which does the actual construction of the machine. In spapr the functions implementing these two have names - ppc_machine_initfn() and ppc_spapr_init() - which don't correspond closely to either of those. To prevent people (read, me) from confusing which is which, rename them spapr_instance_init() and spapr_machine_init() to make it clearer which is which. While we're there rename ppc_spapr_reset() to spapr_machine_reset() to match. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
| * | spapr_events: drop bogus cell from "interrupt-ranges" propertyGreg Kurz2017-12-151-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to LoPAPR 1.1 B.6.12, the "/event-sources" node has an "interrupt- ranges" property, the format of which is described in B.6.9.1.2 as follows: “interrupt-ranges” Standard property name that defines the interrupt number(s) and range(s) handled by this unit. prop-encoded-array: List of (int-number, range) specifications. Int-number is encoded as with encode-int. Range is encoded as with encode-int. The first entry in this list shall contain the int-number associated with the first “reg” property entry. The int-num-ber is the value representing the interrupt source as would appear in the PowerPC External Interrupt Architecture XISR. The range shall be the number of sequential interrupt numbers which this unit can generate. There's no such thing as a cell count at the end of the array, like the one introduced by commit ffbb1705a33d in QEMU 2.8. It doesn't seem it had any impact on existing guests and I couldn't find any related workaround in linux. So, let's just drop the bogus lines. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: fix LSI interrupt specifiers in the device treeGreg Kurz2017-12-153-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LoPAPR 1.1 B.6.9.1.2 describes the "#interrupt-cells" property of the PowerPC External Interrupt Source Controller node as follows: “#interrupt-cells” Standard property name to define the number of cells in an interrupt- specifier within an interrupt domain. prop-encoded-array: An integer, encoded as with encode-int, that denotes the number of cells required to represent an interrupt specifier in its child nodes. The value of this property for the PowerPC External Interrupt option shall be 2. Thus all interrupt specifiers (as used in the standard “interrupts” property) shall consist of two cells, each containing an integer encoded as with encode-int. The first integer represents the interrupt number the second integer is the trigger code: 0 for edge triggered, 1 for level triggered. This patch fixes the interrupt specifiers in the "interrupt-map" property of the PHB node, that were setting the second cell to 8 (confusion with IRQ_TYPE_LEVEL_LOW ?) instead of 1. VIO devices and RTAS event sources use the same format for interrupt specifiers: while here, we introduce a common helper to handle the encoding details. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> -- v3: - reference public LoPAPR instead of internal PAPR+ in changelog - change helper name to spapr_dt_xics_irq() v2: - drop the erroneous changes to the "interrupts" prop in PCI device nodes - introduce a common helper to encode interrupt specifiers Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: replace numa_get_node() with lookup in pc-dimm listIgor Mammedov2017-12-152-3/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPAPR is the last user of numa_get_node() and a bunch of supporting code to maintain numa_info[x].addr list. Get LMB node id from pc-dimm list, which allows to remove ~80LOC maintaining dynamic address range lookup list. It also removes pc-dimm dependency on numa_[un]set_mem_node_id() and makes pc-dimms a sole source of information about which node it belongs to and removes duplicate data from global numa_info. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: introduce a spapr_qirq() helperCédric Le Goater2017-12-154-20/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xics_get_qirq() is only used by the sPAPR machine. Let's move it there and change its name to reflect its scope. It will be useful for XIVE support which will use its own set of qirqs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: introduce a spapr_irq_set_lsi() helperCédric Le Goater2017-12-151-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | It will make synchronisation easier with the XIVE interrupt mode when available. The 'irq' parameter refers to the global IRQ number space. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: move the IRQ allocation routines under the machineCédric Le Goater2017-12-157-125/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also change the prototype to use a sPAPRMachineState and prefix them with spapr_irq_. It will let us synchronise the IRQ allocation with the XIVE interrupt mode when available. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | ppc/xics: assign of the CPU 'intc' pointer under the coreCédric Le Goater2017-12-153-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'intc' pointer of the CPU references the interrupt presenter in the XICS interrupt mode. When the XIVE interrupt mode is available and activated, the machine will need to reassign this pointer to reflect the change. Moving this assignment under the realize routine of the CPU will ease the process when the interrupt mode is toggled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | ppc/xics: introduce an icp_create() helperCédric Le Goater2017-12-153-20/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sPAPR and the PowerNV core objects create the interrupt presenter object of the CPUs in a very similar way. Let's provide a common routine in which we use the presenter 'type' as a child identifier. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr/rtas: do not reset the MSR in stop-self commandCédric Le Goater2017-12-151-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a CPU is stopped with the 'stop-self' RTAS call, its state 'halted' is switched to 1 and, in this case, the MSR is not taken into account anymore in the cpu_has_work() routine. Only the pending hardware interrupts are checked with their LPCR:PECE* enablement bit. The CPU is now also protected from the decrementer interrupt by the LPCR:PECE* bits which are disabled in the 'stop-self' RTAS call. Reseting the MSR is pointless. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr/rtas: fix reboot of a a SMP TCG guestCédric Le Goater2017-12-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like for hot unplug CPUs, when a guest is rebooted, the secondary CPUs can be awaken by the decrementer and start entering SLOF at the same time the boot CPU is. To be safe, let's disable on the secondaries all the exceptions which can cause an exit while the CPU is in power-saving mode. Based on previous work from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr/rtas: disable the decrementer interrupt when a CPU is unpluggedCédric Le Goater2017-12-151-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a CPU is stopped with the 'stop-self' RTAS call, its state 'halted' is switched to 1 and, in this case, the MSR is not taken into account anymore in the cpu_has_work() routine. Only the pending hardware interrupts are checked with their LPCR:PECE* enablement bit. If the DECR timer fires after 'stop-self' is called and before the CPU 'stop' state is reached, the nearly-dead CPU will have some work to do and the guest will crash. This case happens very frequently with the not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is occasionally fired but after 'stop' state, so no work is to be done and the guest survives. I suspect there is a race between the QEMU mainloop triggering the timers and the TCG CPU thread but I could not quite identify the root cause. To be safe, let's disable in the LPCR all the exceptions which can cause an exit while the CPU is in power-saving mode and reenable them when the CPU is started. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | e500: fix pci host bridge class/typeMichael Davidsaver2017-12-151-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct some confusion wrt. the PCI facing side of the PCI host bridge (not PCIe root complex). The ref. manual for the mpc8533 (as well as mpc8540 and mpc8540) give the class code as PCI_CLASS_PROCESSOR_POWERPC. While the PCI_HEADER_TYPE field is oddly omitted, the tables in the "PCI Configuration Header" section shows a type 0 layout using all 6 BAR registers (as 2x 32, and 2x 64 bit regions) So 997505065dc92e533debf5cb23012ba4e673d387 seems to be in error. Although there was perhaps some confusion as the mpc8533 has a separate PCIe root complex. With PCIe, a root complex has PCI_HEADER_TYPE=1. Neither the PCI host bridge, nor the PCIe root complex advertise class PCI_CLASS_BRIDGE_PCI. This was confusing Linux guests, which try to interpret the host bridge as a pci-pci bridge, but get confused and re-enumerate the bus when the primary/secondary/subordinate bus registers don't have valid values. Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | openpic: debug w/ info_report()Michael Davidsaver2017-12-151-51/+51
| | | | | | | | | | | | | | | | | | | | | | | | Replace *printf() with *_report(). Remove trailing new lines. Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | nvram: add AT24Cx i2c eepromMichael Davidsaver2017-12-152-0/+206
| | | | | | | | | | | | | | | Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | e500: name openpic and pci host bridgeMichael Davidsaver2017-12-151-0/+4
| | | | | | | | | | | | | | | Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr_cpu_core: instantiate CPUs separatelyGreg Kurz2017-12-152-20/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code assumes that only the CPU core object holds a reference on each individual CPU object, and happily frees their allocated memory when the core is unrealized. This is dangerous as some other code can legitimely keep a pointer to a CPU if it calls object_ref(), but it would end up with a dangling pointer. Let's allocate all CPUs with object_new() and let QOM free them when their reference count reaches zero. This greatly simplify the code as we don't have to fiddle with the instance size anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | spapr: Add pseries-2.12 machine typeDavid Gibson2017-12-151-3/+23
| | | | | | | | | | | | | | | | | | | | | While we're at it fix a couple of small errors in the 2.11 and 2.10 models (they didn't have any real effect, but don't quite match the template). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * | ppc/xics: remove useless if conditionCédric Le Goater2017-12-151-4/+2
| |/ | | | | | | | | | | | | | | | | The previous code section uses a 'first < 0' test and returns. Therefore, there is no need to test the 'first' variable against '>= 0' afterwards. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20171214-tag' ↵Peter Maydell2017-12-154-145/+210
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging Xen 2017/12/14 # gpg: Signature made Fri 15 Dec 2017 00:26:26 GMT # gpg: using RSA key 0x894F8F4870E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" # gpg: aka "Stefano Stabellini <sstabellini@kernel.org>" # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90 * remotes/sstabellini/tags/xen-20171214-tag: xen/pt: Set is_express to avoid out-of-bounds write xenfb: activate input handlers for raw pointer devices xenfb: Add [feature|request]-raw-pointer xenfb: Use Input Handlers directly ui: generate qcode to linux mappings xen-disk: use an IOThread per instance Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * xen/pt: Set is_express to avoid out-of-bounds writeSimon Gaiser2017-12-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The passed-through device might be an express device. In this case the old code allocated a too small emulated config space in pci_config_alloc() since pci_config_size() returned the size for a non-express device. This leads to an out-of-bound write in xen_pt_config_reg_init(), which sometimes results in crashes. So set is_express as already done for KVM in vfio-pci. Shortened ASan report: ==17512==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000041648 at pc 0x55e0fdac51ff bp 0x7ffe4af07410 sp 0x7ffe4af07408 WRITE of size 2 at 0x611000041648 thread T0 #0 0x55e0fdac51fe in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 #1 0x55e0fdac51fe in stw_he_p include/qemu/bswap.h:330 #2 0x55e0fdac51fe in stw_le_p include/qemu/bswap.h:379 #3 0x55e0fdac51fe in pci_set_word include/hw/pci/pci.h:490 #4 0x55e0fdac51fe in xen_pt_config_reg_init hw/xen/xen_pt_config_init.c:1991 #5 0x55e0fdac51fe in xen_pt_config_init hw/xen/xen_pt_config_init.c:2067 #6 0x55e0fdabcf4d in xen_pt_realize hw/xen/xen_pt.c:830 #7 0x55e0fdf59666 in pci_qdev_realize hw/pci/pci.c:2034 #8 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] 0x611000041648 is located 8 bytes to the right of 256-byte region [0x611000041540,0x611000041640) allocated by thread T0 here: #0 0x7ff596a94bb8 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd9bb8) #1 0x7ff57da66580 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x50580) #2 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] Signed-off-by: Simon Gaiser <hw42@ipsumj.de> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
| * xenfb: activate input handlers for raw pointer devicesOwen Smith2017-12-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the frontend requests raw pointers, the input handlers must be activated to have the input events delivered to the xenfb backend. Without activation, the input events are delivered to handlers registered earlier, which would be the emulated USB tablet or emulated PS/2 mouse. HVM xen_kbdfront can incorrectly scale absolute coordinates when the display resolution is not 800x600. Signed-off-by: Owen Smith <owen.smith@citrix.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>