summary refs log tree commit diff stats
path: root/hw/ppc/spapr_hcall.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* hw/ppc/spapr: Use tb_invalidate_phys_range in h_page_initRichard Henderson2025-09-241-2/+2
| | | | | | | | | We only need invalidate tbs from a single page, not flush all translations. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cpus: properly kick CPUs out of inner execution loopPaolo Bonzini2025-09-171-4/+3
| | | | | | | | Now that cpu_exit() actually kicks all accelerators, use it whenever the message to another thread is processed in qemu_wait_io_event(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* accel: use atomic accesses for exit_requestPaolo Bonzini2025-09-171-3/+3
| | | | | | | | | | | | | | | CPU threads write exit_request as a "note to self" that they need to go out to a slow path. This write happens out of the BQL and can be a data race with another threads' cpu_exit(); use atomic accesses consistently. While at it, change the source argument from int ("1") to bool ("true"). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge tag 'pull-misc-2025-04-24' of https://repo.or.cz/qemu/armbru into stagingStefan Hajnoczi2025-04-241-1/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Miscellaneous patches for 2025-04-24 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmgJ7dYSHGFybWJydUBy # ZWRoYXQuY29tAAoJEDhwtADrkYZTiZIP/1PFAg/s3SoiLQwH/ZrjyUkm1kiKnjOH # CC5Stw6I9tuYnDAhASAdSymofLv0NNydNe5ai6ZZAWRyRYjIcfNigKAGK4Di+Uhe # nYxT0Yk8hNGwMhl6NnBp4mmCUNCwcbjT9uXdiYQxFYO/qqYR1388xJjeN3c362l3 # AaLrE5bX5sqa6TAkTeRPjeIqxlyGT7jnCrN7I1hMhDvbc3ITF3AMfYFMjnmAQgr+ # mTWGS1QogqqkloODbR1DKD1CAWOlpK+0HibhNF+lz71P0HlwVvy+HPXso505Wf0B # dMwlSrZ1DnqNVF/y5IhMEMslahKajbjbFVhBjmrGl/8T821etCxxgB20c0vyFRy8 # qTyJGwBZaEo0VWr70unSmq45TRoeQvdHAw/e+GtilR0ci80q2ly4gbObnw7L8le+ # gqZo4IWmrwp2sbPepE57sYKQpEndwbRayf/kcFd0LPPpeINu9ZooXkYX0pOo6Cdg # vDKMaEB1/fmPhjSlknxkKN9LZdR+nDw8162S1CKsUdWanAOjmP8haN19aoHhIekZ # q+r2qUq/U827yNy9/qbInmsoFYDz9s6sAOE63jibd5rZZ9Anei6NOSgLzA4CqCR1 # +d0+TXp19gP9mLMFs7/ZclwkXCz47OQYhXYphjI3wM9x+xbdRcI4n+DOH5u5coKx # AsA6+2n0GF4Y # =GaoH # -----END PGP SIGNATURE----- # gpg: Signature made Thu 24 Apr 2025 03:52:54 EDT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-misc-2025-04-24' of https://repo.or.cz/qemu/armbru: cleanup: Drop pointless label at end of function cleanup: Drop pointless return at end of function cleanup: Re-run return_directly.cocci Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
| * cleanup: Drop pointless return at end of functionMarkus Armbruster2025-04-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | A few functions now end with a label. The next commit will clean them up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20250407082643.2310002-3-armbru@redhat.com> [Straightforward conflict with commit 988ad4ccebb6 (hw/loongarch/virt: Fix cpuslot::cpu set at last in virt_cpu_plug()) resolved]
* | exec/cpu-all: remove exec/target_page includePierrick Bouvier2025-04-231-0/+1
|/ | | | | | Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* ppc: spapr: Enable 2nd DAWR on Power10 pSeries machineShivaprasad G Bhat2025-03-111-9/+18
| | | | | | | | | | | | | | | As per the PAPR, bit 0 of byte 64 in pa-features property indicates availability of 2nd DAWR registers. i.e. If this bit is set, 2nd DAWR is present, otherwise not. Use KVM_CAP_PPC_DAWR1 capability to find whether kvm supports 2nd DAWR or not. If it's supported, allow user to set the pa-feature bit in guest DT using cap-dawr1 machine capability. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Message-ID: <173708681866.1678.11128625982438367069.stgit@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr: Restrict CONFER hypercall to TCGPhilippe Mathieu-Daudé2025-03-111-0/+2
| | | | | | | | | | KVM handles H_CONFER and does not pass it along to QEMU, so only vhyp (as used by TCG spapr) needs to handle it. [npiggin: Add changelog] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20250127102620.39159-2-philmd@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr: Restrict part of PAGE_INIT hypercall to TCGPhilippe Mathieu-Daudé2025-03-041-1/+3
| | | | | | | | Restrict the tb_flush() call to TCG. Assert we are using KVM or TCG. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-Id: <20250127102620.39159-3-philmd@linaro.org>
* include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-3/+3
| | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* spapr: nested: register nested-hv api hcalls only for cap-nested-hvHarsh Prateek Bora2024-03-131-2/+22
| | | | | | | | | | | | Since cap-nested-hv is an optional capability, it makes sense to register api specfic hcalls only when respective capability is enabled. This requires to introduce a new API to unregister hypercalls to maintain sanity across guest reboot since caps are re-applied across reboots and re-registeration of hypercalls would hit assert otherwise. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr_hcall: Rename {softmmu -> vhyp_mmu}_resize_hpt_prPhilippe Mathieu-Daudé2024-02-231-2/+2
| | | | | | | | | | | | | | | Since 'softmmu' is quite a loaded term in QEMU, rename the vhyp MMU facilities to use the vhyp_mmu_ prefix rather than softmmu_. vhyp_mmu_ is chosen because the code that manipulates the hash table via guest software hypercalls is QEMU's implementation of the PAPR hypervisor interface, called vhyp. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> [npiggin: Pick a different name, explain it in changelog.] Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr_hcall: Allow elision of softmmu_resize_hpt_prepPhilippe Mathieu-Daudé2024-02-231-4/+8
| | | | | | | | | | Check tcg_enabled() before calling softmmu_resize_hpt_prepare() and softmmu_resize_hpt_commit() to allow the compiler to elide their calls. The stubs are then unnecessary, remove them. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr_hcall: Remove unused 'exec/exec-all.h' included headerPhilippe Mathieu-Daudé2023-12-201-1/+0
| | | | | | | | Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20231212113640.30287-2-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
* ppc: spelling fixesMichael Tokarev2023-09-201-1/+1
| | | | | Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr: implement H_SET_MODE debug facilitiesNicholas Piggin2023-09-061-0/+57
| | | | | | | | Wire up the H_SET_MODE debug resources to the CIABR and DAWR0 debug facilities in TCG. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* hw/ppc/spapr: Use machine_memory_devices_init()David Hildenbrand2023-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | Let's use our new helper and stop always allocating ms->device_memory. There is no difference in common memory-device code anymore between ms->device_memory being NULL or the size being 0. So we only have to teach spapr code that ms->device_memory isn't always around. We can now modify two maxram_size checks to rely on ms->device_memory for detecting whether we have memory devices. Cc: Daniel Henrique Barboza <danielhb413@gmail.com> Cc: "Cédric Le Goater" <clg@kaod.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Greg Kurz <groug@kaod.org> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-5-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
* ppc/spapr: Move spapr nested HV to a new fileNicholas Piggin2023-06-251-414/+2
| | | | | | | | Create spapr_nested.c for most of the nested HV implementation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* ppc/spapr: load and store l2 state with helper functionsNicholas Piggin2023-06-251-79/+85
| | | | | | | | | | | Arguably this is just shuffling around register accesses, but one nice thing it does is allow the exit to save away the L2 state then switch the environment to the L1 before copying L2 data back to the L1, which logically flows more naturally and simplifies the error paths. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* ppc/spapr: Add a nested state structNicholas Piggin2023-06-251-38/+112
| | | | | | | | | | | | Rather than use a copy of CPUPPCState to store the host state while the environment has been switched to the L2, use a new struct for this purpose. Have helper functions to save and load this host state. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* ppc/spapr: H_ENTER_NESTED should restore host XER ca fieldNicholas Piggin2023-06-251-0/+1
| | | | | | | | | Fix missing env->ca restore when going from L2 back to the host. Fixes: 120f738a467 ("spapr: implement nested-hv capability for the virtual hypervisor") Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* spapr: Add SPAPR_CAP_AIL_MODE_3 for AIL mode 3 support for H_SET_MODE hcallNicholas Piggin2023-05-281-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The behaviour of the Address Translation Mode on Interrupt resource is not consistently supported by all CPU versions or all KVM versions: KVM HV does not support mode 2, and does not support mode 3 on POWER7 or early POWER9 processesors. KVM PR only supports mode 0. TCG supports all modes (0, 2, 3) on CPUs with support for the corresonding LPCR[AIL] mode. This leads to inconsistencies in guest behaviour and could cause problems migrating guests. This was not noticable for Linux guests for a long time because the kernel only uses modes 0 and 3, and it used to consider AIL-3 to be advisory in that it would always keep the AIL-0 vectors around, so it did not matter whether or not interrupts were delivered according to the AIL mode. Recent Linux guests depend on AIL mode 3 working as specified in order to support the SCV facility interrupt. If AIL-3 can not be provided, then H_SET_MODE must return an error to Linux so it can disable the SCV facility (failure to do so can lead to userspace being able to crash the guest kernel). Add the ail-mode-3 capability to specify that AIL-3 is supported. AIL-0 is implied as the baseline, and AIL-2 is no longer supported by spapr. AIL-2 is not known to be used by any software, but support in TCG could be restored with an ail-mode-2 capability quite easily if a regression is reported. Modify the H_SET_MODE Address Translation Mode on Interrupt resource handler to check capabilities and correctly return error if not supported. KVM has a cap to advertise support for AIL-3. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20230515160216.394612-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* ppc: spapr: cleanup cr get/set with helpers.Harsh Prateek Bora2023-05-051-16/+2
| | | | | | | | | | | | | | The bits in cr reg are grouped into eight 4-bit fields represented by env->crf[8] and the related calculations should be abstracted to keep the calling routines simpler to read. This is a step towards cleaning up the related/calling code for better readability. Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230503093619.2530487-2-harshpb@linux.ibm.com> [danielhb: add 'const' modifier to fix linux-user build] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* includes: move tb_flush into its own headerAlex Bennée2023-03-071-0/+1
| | | | | | | | | | | This aids subsystems (like gdbstub) that want to trigger a flush without pulling target specific headers. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230302190846.2593720-8-alex.bennee@linaro.org> Message-Id: <20230303025805.625589-8-richard.henderson@linaro.org>
* target/ppc: introduce ppc_maybe_interruptMatheus Ferst2022-10-281-0/+6
| | | | | | | | | | | | This new method will check if any pending interrupt was unmasked and then call cpu_interrupt/cpu_reset_interrupt accordingly. Code that raises/lowers or masks/unmasks interrupts should call this method to keep CPU_INTERRUPT_HARD coherent with env->pending_interrupts. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221021142156.4134411-2-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* hw/ppc: set machine->fdt in spapr machineDaniel Henrique Barboza2022-10-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The pSeries machine never bothered with the common machine->fdt attribute. We do all the FDT related work using spapr->fdt_blob. We're going to introduce a QMP/HMP command to dump the FDT, which will rely on setting machine->fdt properly to work across all machine archs/types. Let's set machine->fdt in two places where we manipulate the FDT: spapr_machine_reset() and CAS. There are other places where the FDT is manipulated in the pSeries machines, most notably the hotplug/unplug path. For now we'll acknowledge that we won't have the most accurate representation of the FDT, depending on the current machine state, when using this QMP/HMP fdt command. Making the internal FDT representation always match the actual FDT representation that the guest is using is a problem for another day. spapr->fdt_blob is left untouched for now. To replace it with machine->fdt, since we're migrating spapr->fdt_blob, we would need to migrate machine->fdt as well. This is something that we would like to to do keep our code simpler but it's also a work we'll leave for later. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220926173855.1159396-14-danielhb413@gmail.com>
* ppc: Check partition and process table alignmentLeandro Lupori2022-07-181-0/+9
| | | | | | | | | | | Check if partition and process tables are properly aligned, in their size, according to PowerISA 3.1B, Book III 6.7.6 programming note. Hardware and KVM also raise an exception in these cases. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220628133959.15131-2-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* spapr: Move nested KVM hypercalls under a TCG only config.Fabiano Rosas2022-04-201-6/+20
| | | | | | | | | | These are the spapr virtual hypervisor implementation of the nested KVM API. They only make sense when running with TCG. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220325221113.255834-3-farosas@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* spapr: Move hypercall_register_softmmuFabiano Rosas2022-04-201-25/+25
| | | | | | | | | | | | | I'm moving this because next patch will add more code under the ifdef and it will be cleaner if we keep them together. Also switch the ifdef branches to make it more convenient to add code under CONFIG_TCG in the next patch. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220325221113.255834-2-farosas@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* Use g_new() & friends where that makes obvious senseMarkus Armbruster2022-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
* spapr: implement nested-hv capability for the virtual hypervisorNicholas Piggin2022-02-181-0/+333
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the Nested KVM HV hcall API for spapr under TCG. The L2 is switched in when the H_ENTER_NESTED hcall is made, and the L1 is switched back in returned from the hcall when a HV exception is sent to the vhyp. Register state is copied in and out according to the nested KVM HV hcall API specification. The hdecr timer is started when the L2 is switched in, and it provides the HDEC / 0x980 return to L1. The MMU re-uses the bare metal radix 2-level page table walker by using the get_pate method to point the MMU to the nested partition table entry. MMU faults due to partition scope errors raise HV exceptions and accordingly are routed back to the L1. The MMU does not tag translations for the L1 (direct) vs L2 (nested) guests, so the TLB is flushed on any L1<->L2 transition (hcall entry and exit). Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-10-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* spapr: move FORM1 verifications to post CASDaniel Henrique Barboza2021-09-301-0/+6
| | | | | | | | | | | | | | | | | FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less) NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM device that has a different latency than the original NUMA node from the regular memory. FORM2 is also able to deal with asymmetric NUMA distances gracefully, something that our FORM1 implementation doesn't do. Move these FORM1 verifications to a new function and wait until after CAS, when we're sure that we're sticking with FORM1, to enforce them. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr_numa.c: rename numa_assoc_array to FORM1_assoc_arrayDaniel Henrique Barboza2021-09-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Introducing a new NUMA affinity, FORM2, requires a new mechanism to switch between affinity modes after CAS. Also, we want FORM2 data structures and functions to be completely separated from the existing FORM1 code, allowing us to avoid adding new code that inherits the existing complexity of FORM1. The idea of switching values used by the write_dt() functions in spapr_numa.c was already introduced in the previous patch, and the same approach will be used when dealing with the FORM1 and FORM2 arrays. We can accomplish that by that by renaming the existing numa_assoc_array to FORM1_assoc_array, which now is used exclusively to handle FORM1 affinity data. A new helper get_associativity() is then introduced to be used by the write_dt() functions to retrieve the current ibm,associativity array of a given node, after considering affinity selection that might have been done during CAS. All code that was using numa_assoc_array now needs to retrieve the array by calling this function. This will allow for an easier plug of FORM2 data later on. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-5-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Fix implementation of Open Firmware client interfaceAlexey Kardashevskiy2021-07-091-3/+2
| | | | | | | | | | | | | | | | | | | This addresses the comments from v22. The functional changes are (the VOF ones need retesting with Pegasos2): (VOF) setprop will start failing if the machine class callback did not handle it; (VOF) unit addresses are lowered in path_offset(); (SPAPR) /chosen/bootargs is initialized from kernel_cmdline if the client did not change it. Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface") Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210708065625.548396-1-aik@ozlabs.ru> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bitsNicholas Piggin2021-07-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | There are several new L1D cache flush bits added to the hcall which reflect hardware security features for speculative cache access issues. These behaviours are now being specified as negative in order to simplify patched kernel compatibility with older firmware (a new problem found in existing systems would automatically be vulnerable). [dwg: Technically this changes behaviour for existing machine types. After discussion with Nick, we've determined this is safe, because the worst that will happen if a guest gets the wrong information due to a migration is that it will perform some unnecessary workarounds, but will remain correct and secure (well, as secure as it was going to be anyway). In addition the change only affects cap-cfpc=safe which is not enabled by default, and in fact is not possible to set on any current hardware (though it's expected it will be possible on POWER10)] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210615044107.1481608-1-npiggin@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Implement Open Firmware client interfaceAlexey Kardashevskiy2021-07-091-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor. Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features. This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual Open Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree. The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob. This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and simple memory allocator - "claim" (an OF CI memory allocator) and updates "/memory@0/available" to report the client about available memory. This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending. In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work. When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. This adds basic instances support which are managed by a hash map ihandle -> [phandle]. Before the guest started, the used memory is: 0..e60 - the initial firmware 8000..10000 - stack 400000.. - kernel 3ea0000.. - initramdisk This OF CI does not implement "interpret". Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram. With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initradmdisk to boot from any possible source. Note this requires reasonably recent guest kernel with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 The immediate benefit is much faster booting time which especially crucial with fully emulated early CPU bring up environments. Also this may come handy when/if GRUB-in-the-userspace sees light of the day. This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries. This assumes potential support for booting from QEMU backends such as blockdev or netdev without devices/drivers used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> [dwg: Adjusted some includes which broke compile in some more obscure compilation setups] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/ppc: moved has_spr to cpu.hLucas Mateus Castro (alqotel)2021-05-191-9/+3
| | | | | | | | | | | Moved has_spr to cpu.h as ppc_has_spr and turned it into an inline function. Change spr verification in pnv.c and spapr.c to a version that can compile in a !TCG environment. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210507164146.67086-1-lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/ppc: moved hcalls that depend on softmmuLucas Mateus Castro (alqotel)2021-05-191-576/+32
| | | | | | | | | | | | | | The hypercalls h_enter, h_remove, h_bulk_remove, h_protect, and h_read, have been moved to spapr_softmmu.c with the functions they depend on. The functions is_ram_address and push_sregs_to_kvm_pr are not static anymore as functions on both spapr_hcall.c and spapr_softmmu.c depend on them. The hypercalls h_resize_hpt_prepare and h_resize_hpt_commit have been divided, the KVM part stayed in spapr_hcall.c while the softmmu part was moved to spapr_softmmu.c Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210506163941.106984-2-lucas.araujo@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* hw/ppc/spapr.c: Extract MMU mode error reporting into a functionFabiano Rosas2021-05-191-12/+2
| | | | | | | | A following patch will make use of it. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-2-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210504' ↵Peter Maydell2021-05-051-1/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups # gpg: Signature made Tue 04 May 2021 06:52:39 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dg-gitlab/tags/ppc-for-6.1-20210504: (46 commits) hw/ppc/pnv_psi: Use device_cold_reset() instead of device_legacy_reset() hw/ppc/spapr_vio: Reset TCE table object with device_cold_reset() hw/intc/spapr_xive: Use device_cold_reset() instead of device_legacy_reset() target/ppc: removed VSCR from SPR registration target/ppc: Reduce the size of ppc_spr_t target/ppc: Clean up _spr_register et al target/ppc: Add POWER10 exception model target/ppc: rework AIL logic in interrupt delivery target/ppc: move opcode table logic to translate.c target/ppc: code motion from translate_init.c.inc to gdbstub.c spapr_drc.c: handle hotunplug errors in drc_unisolate_logical() spapr.h: increase FDT_MAX_SIZE spapr.c: do not use MachineClass::max_cpus to limit CPUs ppc: Rename current DAWR macros and variables target/ppc: POWER10 supports scv target/ppc: Fix POWER9 radix guest HV interrupt AIL behaviour docs/system: ppc: Add documentation for ppce500 machine roms/u-boot: Bump ppce500 u-boot to v2021.04 to fix broken pci support roms/Makefile: Update ppce500 u-boot build directory name ppc/spapr: Add support for implement support for H_SCM_HEALTH ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * target/ppc: Add POWER10 exception modelNicholas Piggin2021-05-041-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | POWER10 adds a new bit that modifies interrupt behaviour, LPCR[HAIL], and it removes support for the LPCR[AIL]=0b10 mode. Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210501072436.145444-3-npiggin@gmail.com> [dwg: Corrected tab indenting] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * target/ppc: rework AIL logic in interrupt deliveryNicholas Piggin2021-05-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The AIL logic is becoming unmanageable spread all over powerpc_excp(), and it is slated to get even worse with POWER10 support. Move it all to a new helper function. Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210501072436.145444-2-npiggin@gmail.com> [dwg: Corrected tab indenting] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Do not include cpu.h if it's not really necessaryThomas Huth2021-05-021-1/+0
|/ | | | | | | | Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* spapr_hcall.c: make do_client_architecture_support staticDaniel Henrique Barboza2021-01-191-0/+1
| | | | | | | | The function is called only inside spapr_hcall.c. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210114180628.1675603-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Introduce spapr_drc_reset_all()Greg Kurz2021-01-061-34/+6
| | | | | | | | | | No need to expose the way DRCs are traversed outside of spapr_drc.c. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-4-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Fix reset of transient DR connectorsGreg Kurz2021-01-061-1/+7
| | | | | | | | | | | | | | | | | | | | | | Documentation of object_property_iter_init() clearly stipulates that "it is forbidden to modify the property list while iterating". But this is exactly what we do when resetting transient DR connectors during CAS. The call to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a PCI bridge, both of which will then in turn destroy their PCI DRCs. This could potentially invalidate the iterator. It is pure luck that this haven't caused any issues so far. Change spapr_drc_reset() to return true if it caused a device to be removed. Restart from scratch in this case. This can potentially increase the overall DRC reset time, especially with a high maxmem which generates a lot of LMB DRCs. But this kind of setup is rare, and so is the use case of rebooting a guest while doing hot-unplug. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-3-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Call spapr_drc_reset() for all DRCs at CASGreg Kurz2021-01-061-3/+4
| | | | | | | | | | | | | | Non-transient DRCs are either in the empty or the ready state, which means spapr_drc_reset() doesn't change their state. It is thus not needed to do any checking. Call spapr_drc_reset() unconditionally and squash spapr_drc_transient() into its only user, spapr_drc_needed(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-2-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Pass sPAPR machine state down to spapr_pci_switch_vga()Greg Kurz2020-12-141-3/+4
| | | | | | | | This allows to drop a user of qdev_get_machine(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201209170052.1431440-4-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Convert hpt_prepare_thread() to use qemu_try_memalign()Greg Kurz2020-11-051-1/+1
| | | | | | | | | | | | | | | HPT resizing is asynchronous: the guest first kicks off the creation of a new HPT, then it waits for that new HPT to be actually created and finally it asks the current HPT to be replaced by the new one. In the case of a userland allocated HPT, this currently relies on calling qemu_memalign() which aborts on OOM and never returns NULL. Since we seem to have path to report the failure to the guest with an H_NO_MEM return value, use qemu_try_memalign() instead of qemu_memalign(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160398563636.32380.1747166034877173994.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Simplify error handling in do_client_architecture_support()Greg Kurz2020-10-091-4/+3
| | | | | | | | | | | Use the return value of ppc_set_compat_all() to check failures, which is preferred over hijacking local_err. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-7-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>