summary refs log tree commit diff stats
path: root/hw/ppc/spapr_cpu_core.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* qom: Constify TypeInfo::class_dataPhilippe Mathieu-Daudé2025-04-251-1/+1
| | | | | | | | All callers now correctly expect a const class data. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-5-philmd@linaro.org>
* qom: Have class_init() take a const data argumentPhilippe Mathieu-Daudé2025-04-251-1/+1
| | | | | | | | | | Mechanical change using gsed, then style manually adapted to pass checkpatch.pl script. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-4-philmd@linaro.org>
* ppc/spapr: Fix RTAS stopped stateNicholas Piggin2025-03-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change takes the CPUPPCState 'quiesced' field added for powernv hardware CPU core controls (used to stop and start cores), and extends it to spapr to model the "RTAS stopped" state. This prevents the schedulers attempting to run stopped CPUs unexpectedly, which can cause hangs and possibly other unexpected behaviour. The detail of the problematic situation is this: A KVM spapr guest boots with all secondary CPUs defined to be in the "RTAS stopped" state. In this state, the CPU is only responsive to the start-cpu RTAS call. This behaviour is modeled in QEMU with the start_powered_off feature, which sets ->halted on secondary CPUs at boot. ->halted=true looks like an idle / sleep / power-save state which typically is responsive to asynchronous interrupts, but spapr clears wake-on-interrupt bits in the LPCR SPR. This more-or-less works. Commit e8291ec16da8 ("target/ppc: fix timebase register reset state") recently caused the decrementer to expire sooner at boot, causing a decrementer exception on secondary CPUs in RTAS stopped state. This was not a problem on TCG, but KVM limits how a guest can modify LPCR, in particular it prevents the clearing of wake-on-interrupt bits, and so in the course of CPU register synchronisation, the LPCR as set by spapr to model the RTAS stopped state is overwritten with KVM's LPCR value, and that then causes QEMU's interrupt code to notice the expired decrementer exception, turn that into an interrupt, and set CPU_INTERRUPT_HARD. That causes the CPU to be kicked, and the KVM vCPU thread to loop calling kvm_cpu_exec(). kvm_cpu_exec() calls kvm_arch_process_async_events(), which on ppc just returns ->halted. This is still true, so it returns immediately with EXCP_HLT, and the vCPU never goes to sleep because qemu_wait_io_event() sees CPU_INTERRUPT_HARD is set. All this while the vCPU holds the bql. This causes the boot CPU to eventually lock up when it needs the bql. So make 'quiesced' represent the "RTAS stopped" state, and have it explicitly not respond to exceptions (interrupt conditions) rather than rely on machine register state to model that state. This matches the powernv quiesced state very well because it essentially turns off the CPU core via a side-band control unit. There are still issues with QEMU and KVM idea of LPCR diverging and that is quite ugly and fragile that should be fixed. spapr should synchronize its LPCR properly with KVM, and not try to use values that KVM does not support. Reported-by: Misbah Anjum N <misanjum@linux.ibm.com> Tested-by: Misbah Anjum N <misanjum@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* spapr: Generate random HASHPKEYR for spapr machinesNicholas Piggin2025-03-111-0/+2
| | | | | | | | | | | | | The hypervisor is expected to create a value for the HASHPKEY SPR for each partition. Currently it uses zero for all partitions, use a random number instead, which in theory might make kernel ROP protection more secure. Signed-of-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20241219034035.1826173-4-npiggin@gmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* Merge tag 'exec-20241220' of https://github.com/philmd/qemu into stagingStefan Hajnoczi2024-12-211-5/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Accel & Exec patch queue - Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander) - Add '-d invalid_mem' logging option (Zoltan) - Create QOM containers explicitly (Peter) - Rename sysemu/ -> system/ (Philippe) - Re-orderning of include/exec/ headers (Philippe) Move a lot of declarations from these legacy mixed bag headers: . "exec/cpu-all.h" . "exec/cpu-common.h" . "exec/cpu-defs.h" . "exec/exec-all.h" . "exec/translate-all" to these more specific ones: . "exec/page-protection.h" . "exec/translation-block.h" . "user/cpu_loop.h" . "user/guest-host.h" . "user/page-protection.h" # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe/// # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo= # =cjz8 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits) util/qemu-timer: fix indentation meson: Do not define CONFIG_DEVICES on user emulation system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header system/numa: Remove unnecessary 'exec/cpu-common.h' header hw/xen: Remove unnecessary 'exec/cpu-common.h' header target/mips: Drop left-over comment about Jazz machine target/mips: Remove tswap() calls in semihosting uhi_fstat_cb() target/xtensa: Remove tswap() calls in semihosting simcall() helper accel/tcg: Un-inline translator_is_same_page() accel/tcg: Include missing 'exec/translation-block.h' header accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h' accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h' qemu/coroutine: Include missing 'qemu/atomic.h' header exec/translation-block: Include missing 'qemu/atomic.h' header accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h' exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined target/sparc: Move sparc_restore_state_to_opc() to cpu.c target/sparc: Uninline cpu_get_tb_cpu_state() target/loongarch: Declare loongarch_cpu_dump_state() locally user: Move various declarations out of 'exec/exec-all.h' ... Conflicts: hw/char/riscv_htif.c hw/intc/riscv_aplic.c target/s390x/cpu.c Apply sysemu header path changes to not in the pull request. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
| * include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* | include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LISTRichard Henderson2024-12-191-1/+0
|/ | | | | | | | | | | | | | Now that all of the Property arrays are counted, we can remove the terminator object from each array. Update the assertions in device_class_set_props to match. With struct Property being 88 bytes, this was a rather large form of terminator. Saves 30k from qemu-system-aarch64. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* hw/ppc: Constify all PropertyRichard Henderson2024-12-151-1/+1
| | | | | Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/ppc: Fix THREAD_SIBLING_FOREACH for multi-socketGlenn Miles2024-11-271-0/+1
| | | | | | | | | | | | The THREAD_SIBLING_FOREACH macro wasn't excluding threads from other chips. Add chip_index field to the thread state and add a check for the new field in the macro. Fixes: b769d4c8f4c6 ("target/ppc: Add initial flags and helpers for SMT support") Signed-off-by: Glenn Miles <milesg@linux.ibm.com> [npiggin: set chip_index for spapr too] Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* ppc/pseries: Add Power11 cpu typeAditya Gupta2024-11-041-0/+1
| | | | | | | | | | Add sPAPR CPU Core definition for Power11 Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com> Tested-by: Amit Machhiwal <amachhiw@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* ppc/spapr: remove deprecated machine pseries-2.12Harsh Prateek Bora2024-11-041-9/+3
| | | | | | | | | | | | | | Commit 0cac0f1b964 marked pseries-2.12 machines as deprecated with reasons mentioned in its commit log. Removing pseries-2.12 specific code with this patch. While at it, also remove pre-3.0-migration hacks introduced for backward compatibility which are now turned useless. Suggested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw: Use device_class_set_legacy_reset() instead of opencodingPeter Maydell2024-09-131-1/+1
| | | | | | | | | | | | | Use device_class_set_legacy_reset() instead of opencoding an assignment to DeviceClass::reset. This change was produced with: spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/device-reset.cocci \ --keep-comments --smpl-spacing --in-place --dir hw Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org
* ppc: Add has_smt_siblings property to CPUPPCStateNicholas Piggin2024-07-261-3/+9
| | | | | | | | | | | The decision to branch out to a slower SMT path in instruction emulation will become a bit more complicated with the way that "big-core" topology that will be implemented in subsequent changes. Hide these details from the wider CPU emulation code with a bool has_smt_siblings flag that can be set by machine initialisation. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* ppc: Add a core_index to CPUPPCState for SMT vCPUsNicholas Piggin2024-07-261-0/+4
| | | | | | | | | | | | The way SMT thread siblings are matched is clunky, using hard-coded logic that checks the PIR SPR. Change that to use a new core_index variable in the CPUPPCState, where all siblings have the same core_index. CPU realize routines have flexibility in setting core/sibling topology. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* ppc: Drop support for POWER9 and POWER10 DD1 chipsNicholas Piggin2024-03-131-2/+0
| | | | | | | | The POWER9 DD1 and POWER10 DD1 chips are not public and are no longer of any use in QEMU. Remove them. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* spapr: set MSR[ME] and MSR[FP] on client entryNicholas Piggin2024-03-131-1/+5
| | | | | | | | The initial MSR state for the OpenFirmware binding specifies MSR[ME] and MSR[FP] are set. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
* hw/ppc/spapr_cpu: Use qdev_is_realized() instead of QOM APIPhilippe Mathieu-Daudé2024-02-221-2/+1
| | | | | | | | Prefer QDev API for QDev objects, avoid the underlying QOM layer. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20240216110313.17039-4-philmd@linaro.org>
* target/ppc/cpu-models: Rename power5+ and power7+ for new QOM naming rulesThomas Huth2024-02-051-2/+2
| | | | | | | | | | | | | | | The character "+" is now forbidden in QOM device names (see commit b447378e1217 - "Limit type names to alphanumerical and some few special characters"). For the "power5+" and "power7+" CPU names, there is currently a hack in type_name_is_valid() to still allow them for compatibility reasons. However, there is a much nicer solution for this: Simply use aliases! This way we can still support the old names without the need for the ugly hack in type_name_is_valid(). Message-ID: <20240117141054.73841-2-thuth@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* hw/ppc/spapr_cpu_core: Access QDev properties with proper APIPhilippe Mathieu-Daudé2024-01-051-1/+1
| | | | | | | | | | | CPUState::start_powered_off field is part of the internal implementation of a QDev CPU. It is exposed as the QDev "start-powered-off" property. External components should use the qdev properties API to access it. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-Id: <20231123143813.42632-2-philmd@linaro.org>
* hw/ppc: Constify VMStateRichard Henderson2023-12-301-6/+6
| | | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231221031652.119827-48-richard.henderson@linaro.org>
* hw/ppc: Reset timebase facilities on machine resetNicholas Piggin2023-09-061-0/+2
| | | | | | | | | | | | | | Lower interrupts, delete timers, and set time facility registers back to initial state on machine reset. This is not so important for record-replay since timebase and decrementer are migrated, but it gives a cleaner reset state. Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: checkpatch.pl fixes ] Signed-off-by: Cédric Le Goater <clg@kaod.org>
* target/ppc: Add LPAR-per-core vs per-thread mode flagNicholas Piggin2023-07-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Power ISA has the concept of sub-processors: Hardware is allowed to sub-divide a multi-threaded processor into "sub-processors" that appear to privileged programs as multi-threaded processors with fewer threads. POWER9 and POWER10 have two modes, either every thread is a sub-processor or all threads appear as one multi-threaded processor. In the user manuals these are known as "LPAR per thread" / "Thread LPAR", and "LPAR per core" / "1 LPAR", respectively. The practical difference is: in thread LPAR mode, non-hypervisor SPRs are not shared between threads and msgsndp can not be used to message siblings. In 1 LPAR mode, some SPRs are shared and msgsndp is usable. Thrad LPAR allows multiple partitions to run concurrently on the same core, and is a requirement for KVM to run on POWER9/10 (which does not gang-schedule an LPAR on all threads of a core like POWER8 KVM). Traditionally, SMT in PAPR environments including PowerVM and the pseries QEMU machine with KVM acceleration behaves as in 1 LPAR mode. In OPAL systems, Thread LPAR is used. When adding SMT to the powernv machine, it is therefore preferable to emulate Thread LPAR. To account for this difference between pseries and powernv, an LPAR mode flag is added such that SPRs can be implemented as per-LPAR shared, and that becomes either per-thread or per-core depending on the flag. Reviewed-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-ID: <20230705120631.27670-2-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* spapr: TCG allow up to 8-thread SMT on POWER8 and newer CPUsNicholas Piggin2023-06-251-2/+5
| | | | | | | | | | | | | | | | | PPC TCG supports SMT CPU configurations for non-hypervisor state, so permit POWER8-10 pseries machines to enable SMT. This requires PIR and TIR be set, because that's how sibling thread matching is done by TCG. spapr's nested-HV capability does not currently coexist with SMT, so that combination is prohibited (interestingly somewhat analogous to LPAR-per-core mode on real hardware which also does not support KVM). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: Also test smp_threads when checking for POWER8 CPU and above ] Signed-off-by: Cédric Le Goater <clg@kaod.org>
* target/ppc: Add POWER9 DD2.2 modelNicholas Piggin2023-05-281-0/+1
| | | | | | | | | | | | | | | | | POWER9 DD2.1 and earlier had significant limitations when running KVM, including lack of "mixed mode" MMU support (ability to run HPT and RPT mode on threads of the same core), and a translation prefetch issue which is worked around by disabling "AIL" mode for the guest. These processors are not widely available, and it's difficult to deal with all these quirks in qemu +/- KVM, so create a POWER9 DD2.2 CPU and make it the default POWER9 CPU. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-Id: <20230515160201.394587-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* hw/ppc: free env->tb_env in spapr_unrealize_vcpu()Daniel Henrique Barboza2022-04-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timebase is allocated during spapr_realize_vcpu() and it's not freed. This results in memory leaks when doing vcpu unplugs: ==636935== ==636935== 144 (96 direct, 48 indirect) bytes in 1 blocks are definitely lost in loss record 6 ,461 of 8,135 ==636935== at 0x4897468: calloc (vg_replace_malloc.c:760) ==636935== by 0x5077213: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6400.4) ==636935== by 0x507757F: g_malloc0_n (in /usr/lib64/libglib-2.0.so.0.6400.4) ==636935== by 0x93C3FB: cpu_ppc_tb_init (ppc.c:1066) ==636935== by 0x97BC2B: spapr_realize_vcpu (spapr_cpu_core.c:268) ==636935== by 0x97C01F: spapr_cpu_core_realize (spapr_cpu_core.c:337) ==636935== by 0xD4626F: device_set_realized (qdev.c:531) ==636935== by 0xD55273: property_set_bool (object.c:2273) ==636935== by 0xD523DF: object_property_set (object.c:1408) ==636935== by 0xD588B7: object_property_set_qobject (qom-qobject.c:28) ==636935== by 0xD52897: object_property_set_bool (object.c:1477) ==636935== by 0xD4579B: qdev_realize (qdev.c:333) ==636935== This patch adds a cpu_ppc_tb_free() helper in hw/ppc/ppc.c to allow us to free the timebase. This leak is then solved by calling cpu_ppc_tb_free() in spapr_unrealize_vcpu(). Fixes: 6f4b5c3ec590 ("spapr: CPU hot unplug support") Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20220329124545.529145-2-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* spapr: prevent hdec timer being set up under virtual hypervisorNicholas Piggin2022-02-181-3/+3
| | | | | | | | | | | The spapr virtual hypervisor does not require the hdecr timer. Remove it. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220216102545.1808018-3-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* spapr: Force 32bit when resetting a coreAlexey Kardashevskiy2022-01-281-0/+5
| | | | | | | | | | | | | | | | | "PowerPC Processor binding to IEEE 1275" says in "8.2.1. Initial Register Values" that the initial state is defined as 32bit so do it for both SLOF and VOF. This should not cause behavioral change as SLOF switches to 64bit very early anyway. As nothing enforces LE anywhere, this drops it for VOF. The goal is to make VOF work with TCG as otherwise it barfs with qemu: fatal: TCG hflags mismatch (current:0x6c000004 rebuilt:0x6c000000) Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220107072423.2278113-1-aik@ozlabs.ru> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* target/ppc: introduce PMUEventType and PMU overflow timersDaniel Henrique Barboza2021-12-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch starts an IBM Power8+ compatible PMU implementation by adding the representation of PMU events that we are going to sample, PMUEventType. This enum represents a Perf event that is being sampled by a specific counter 'sprn'. Events that aren't available (i.e. no event was set in MMCR1) will be of type 'PMU_EVENT_INVALID'. Events that are inactive due to frozen counter bits state are of type 'PMU_EVENT_INACTIVE'. Other types added in this patch are PMU_EVENT_CYCLES and PMU_EVENT_INSTRUCTIONS. More types will be added later on. Let's also add the required PMU cycle overflow timers. They will be used to trigger cycle overflows when cycle events are being sampled. This timer will call cpu_ppc_pmu_timer_cb(), which in turn calls fire_PMC_interrupt(). Both functions are stubs that will be implemented later on when EBB support is added. Two new helper files are created to host this new logic. cpu_ppc_pmu_init() will init all overflow timers during CPU init time. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20211201151734.654994-2-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
* ppc/spapr: Add a POWER10 DD2 CPUCédric Le Goater2021-09-291-0/+1
| | | | | | | Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210901094153.227671-3-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr_cpu_core.c: use g_auto* in spapr_create_vcpu()Daniel Henrique Barboza2021-01-191-9/+3
| | | | | | | | | Use g_autoptr() with Object and g_autofree with the string to avoid the need of a cleanup path. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210114180628.1675603-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Simplify spapr_cpu_core_realize() and spapr_cpu_core_unrealize()Greg Kurz2020-10-281-13/+3
| | | | | | | | | | | | Now that the error path of spapr_cpu_core_realize() is just to call idempotent spapr_cpu_core_unrealize() for rollback, no need to create and realize the vCPUs in two separate loops. Merge them and do them same in spapr_cpu_core_unrealize() for symmetry. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160279673321.1808373.2248221100790367912.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Make spapr_cpu_core_unrealize() idempotentGreg Kurz2020-10-281-20/+21
| | | | | | | | | | | | | | | | | spapr_cpu_core_realize() has a rollback path which partially duplicates the code of spapr_cpu_core_unrealize(). Let's make spapr_cpu_core_unrealize() idempotent and call it instead. This requires to: - move the registration and unregistration of the reset handler around but it is harmless, - allocate the array of vCPUs with g_new0() to be able to filter out unused slots, - make sure to only unrealize vCPUs that have been already realized. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160279672626.1808373.14142129300586424514.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Drop spapr_delete_vcpu() unused argumentGreg Kurz2020-10-281-3/+3
| | | | | | | | The 'sc' argument is unused. Drop it. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160279671929.1808373.10333672533575251075.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Unrealize vCPUs with qdev_unrealize()Greg Kurz2020-10-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we introduced CPU hot-unplug in sPAPR, we don't unrealize the vCPU objects explicitly. Instead, we let QOM handle that for us under object_property_del_all() when the CPU core object is finalized. The only thing we do is calling cpu_remove_sync() to tear the vCPU thread down. This happens to work but it is ugly because: - we call qdev_realize() but the corresponding qdev_unrealize() is buried deep in the QOM code - we call cpu_remove_sync() to undo qemu_init_vcpu() called by ppc_cpu_realize() in target/ppc/translate_init.c.inc - the CPU init and teardown paths aren't really symmetrical The latter didn't bite us so far but a future patch that greatly simplifies the CPU core realize path needs it to avoid a crash in QOM. For all these reasons, have ppc_cpu_unrealize() to undo the changes of ppc_cpu_realize() by calling cpu_remove_sync() at the right place, and have the sPAPR CPU core code to call qdev_unrealize(). This requires to add a missing stub because translate_init.c.inc is also compiled for user mode. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160279671236.1808373.14732005038172874990.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Fix leak of CPU machine specific dataGreg Kurz2020-10-281-10/+12
| | | | | | | | | | | | | When a CPU core is being removed, the machine specific data of each CPU thread object is leaked. Fix this by calling the dedicated helper we have for that instead of simply unparenting the CPU object. Call it from a separate loop in spapr_cpu_core_unrealize() for symmetry with spapr_cpu_core_realize(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160279670540.1808373.17319746576919615623.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Simplify error handling in spapr_cpu_core_realize()Greg Kurz2020-10-091-9/+7
| | | | | | | | | | | | As recommended in "qapi/error.h", add a bool return value to spapr_realize_vcpu() and use it in spapr_cpu_core_realize() in order to get rid of the error propagation overhead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-12-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Add a return value to spapr_set_vcpu_id()Greg Kurz2020-10-091-4/+1
| | | | | | | | | | | As recommended in "qapi/error.h", return true on success and false on failure. This allows to reduce error propagation overhead in the callers. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-11-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Fix error leak in spapr_realize_vcpu()Greg Kurz2020-10-091-2/+1
| | | | | | | | | | | | | If spapr_irq_cpu_intc_create() fails, local_err isn't propagated and thus leaked. Fixes: 992861fb1e4c ("error: Eliminate error_propagate() manually") Cc: armbru@redhat.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-2-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/spapr: Use start-powered-off CPUState propertyThiago Jung Bauermann2020-09-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PowerPC sPAPR CPUs start in the halted state, and spapr_reset_vcpu() attempts to implement this by setting CPUState::halted to 1. But that's too late for the case of hotplugged CPUs in a machine configure with 2 or more threads per core. By then, other parts of QEMU have already caused the vCPU to run in an unitialized state a couple of times. For example, ppc_cpu_reset() calls ppc_tlb_invalidate_all(), which ends up calling async_run_on_cpu(). This kicks the new vCPU while it has CPUState::halted = 0, causing QEMU to issue a KVM_RUN ioctl on the new vCPU before the guest is able to make the start-cpu RTAS call to initialize its register state. This problem doesn't seem to cause visible issues for regular guests, but on a secure guest running under the Ultravisor it does. The Ultravisor relies on being able to snoop on the start-cpu RTAS call to map vCPUs to guests, and this issue causes it to see a stray vCPU that doesn't belong to any guest. Fix by setting the start-powered-off CPUState property in spapr_create_vcpu(), which makes cpu_common_reset() initialize CPUState::halted to 1 at an earlier moment. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Message-Id: <20200826055535.951207-4-bauerman@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* error: Eliminate error_propagate() manuallyMarkus Armbruster2020-07-101-10/+4
| | | | | | | | | | | When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous two commits did that for sufficiently simple cases with Coccinelle. Do it for several more manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-37-armbru@redhat.com>
* qdev: Use returned bool to check for qdev_realize() etc. failureMarkus Armbruster2020-07-101-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their wrappers isa_realize_and_unref(), pci_realize_and_unref(), sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref(). Coccinelle script: @@ identifier fun = { isa_realize_and_unref, pci_realize_and_unref, qbus_realize, qdev_realize, qdev_realize_and_unref, sysbus_realize, sysbus_realize_and_unref, usb_realize_and_unref }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error message "no position information". Nothing to convert there; skipped. Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200707160613.848843-5-armbru@redhat.com>
* qdev: Convert bus-less devices to qdev_realize() with CoccinelleMarkus Armbruster2020-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All remaining conversions to qdev_realize() are for bus-less devices. Coccinelle script: // only correct for bus-less @dev! @@ expression errp; expression dev; @@ - qdev_init_nofail(dev); + qdev_realize(dev, NULL, &error_fatal); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(OBJECT(dev), true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(dev, true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); Note that Coccinelle chokes on ARMSSE typedef vs. macro in hw/arm/armsse.c. Worked around by temporarily renaming the macro for the spatch run. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
* ppc/spapr: add a POWER10 CPU modelCédric Le Goater2020-05-271-0/+1
| | | | | | Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200507073855.2485680-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* qdev: Unrealize must not failMarkus Armbruster2020-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
* qom: Drop parameter @errp of object_property_add() & friendsMarkus Armbruster2020-05-151-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]
* ppc/spapr: Move GPRs setup to one placeAlexey Kardashevskiy2020-03-171-1/+5
| | | | | | | | | | | | | | | | | At the moment "pseries" starts in SLOF which only expects the FDT blob pointer in r3. As we are going to introduce a OpenFirmware support in QEMU, we will be booting OF clients directly and these expect a stack pointer in r1, Linux looks at r3/r4 for the initramdisk location (although vmlinux can find this from the device tree but zImage from distro kernels cannot). This extends spapr_cpu_set_entry_state() to take more registers. This should cause no behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20200310050733.29805-2-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr, ppc: Remove VPM0/RMLS hacks for POWER9David Gibson2020-03-171-9/+1
| | | | | | | | | | | | | | | | | | | | | | For the "pseries" machine, we use "virtual hypervisor" mode where we only model the CPU in non-hypervisor privileged mode. This means that we need guest physical addresses within the modelled cpu to be treated as absolute physical addresses. We used to do that by clearing LPCR[VPM0] and setting LPCR[RMLS] to a high limit so that the old offset based translation for guest mode applied, which does what we need. However, POWER9 has removed support for that translation mode, which meant we had some ugly hacks to keep it working. We now explicitly handle this sort of translation for virtual hypervisor mode, so the hacks aren't necessary. We don't need to set VPM0 and RMLS from the machine type code - they're now ignored in vhyp mode. On the cpu side we don't need to allow LPCR[RMLS] to be set on POWER9 in vhyp mode - that was only there to allow the hack on the machine side. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
* qdev: set properties with device_class_set_props()Marc-André Lureau2020-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* ppc: Add intc_destroy() handlers to SpaprInterruptController/PnvChipGreg Kurz2019-11-181-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | SpaprInterruptControllerClass and PnvChipClass have an intc_create() method that calls the appropriate routine, ie. icp_create() or xive_tctx_create(), to establish the link between the VCPU and the presenter component of the interrupt controller during realize. There aren't any symmetrical call to be called when the VCPU gets unrealized though. It is assumed that object_unparent() is the only thing to do. This is questionable because the parenting logic around the CPU and presenter objects is really an implementation detail of the interrupt controller. It shouldn't be open-coded in the machine code. Fix this by adding an intc_destroy() method that undoes what was done in intc_create(). Also NULLify the presenter pointers to avoid having stale pointers around. This will allow to reliably check if a vCPU has a valid presenter. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
* ppc: Reset the interrupt presenter from the CPU reset handlerCédric Le Goater2019-10-241-1/+4
| | | | | | | | | | | | | | | | | | | | | On the sPAPR machine and PowerNV machine, the interrupt presenters are created by a machine handler at the core level and are reset independently. This is not consistent and it raises issues when it comes to handle hot-plugged CPUs. In that case, the presenters are not reset. This is less of an issue in XICS, although a zero MFFR could be a concern, but in XIVE, the OS CAM line is not set and this breaks the presenting algorithm. The current code has workarounds which need a global cleanup. Extend the sPAPR IRQ backend and the PowerNV Chip class with a new cpu_intc_reset() handler called by the CPU reset handler and remove the XiveTCTX reset handler which is now redundant. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191022163812.330-6-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>