summary refs log tree commit diff stats
path: root/accel/tcg/cpu-exec.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* | accel/tcg: Use cpu_loop_exit_requested() in cpu_loop_exec_tb()Philippe Mathieu-Daudé2024-05-061-5/+2
|/ | | | | | | | Do not open-code cpu_loop_exit_requested(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240428214915.10339-9-philmd@linaro.org>
* accel/tcg: Un-inline retaddr helpers to 'user-retaddr.h'Philippe Mathieu-Daudé2024-04-261-0/+3
| | | | | | | | | | | | | | | set_helper_retaddr() is only used in accel/tcg/user-exec.c. clear_helper_retaddr() is only used in accel/tcg/cpu-exec.c and accel/tcg/user-exec.c. No need to expose their definitions to all user-emulation files including "exec/cpu_ldst.h", move them to a new "user-retaddr.h" header (restricted to accel/tcg/). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231211212003.21686-19-philmd@linaro.org>
* bulk: Call in place single use cpu_env()Philippe Mathieu-Daudé2024-03-121-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | Avoid CPUArchState local variable when cpu_env() is used once. Mechanical patch using the following Coccinelle spatch script: @@ type CPUArchState; identifier env; expression cs; @@ { - CPUArchState *env = cpu_env(cs); ... when != env - env + cpu_env(cs) ... when != env } Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240129164514.73104-5-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
* accel/tcg: Set can_do_io at at start of lookup_tb_ptr helperPeter Maydell2024-02-291-0/+8
| | | | | | | | | | | | | | | | | If a page table is in IO memory and lookup_tb_ptr probes the TLB it can result in a page table walk for the instruction fetch. If this hits IO memory and io_prepare falsely assumes it needs to do a TLB recompile. Avoid that by setting can_do_io at the start of lookup_tb_ptr. Link: https://lore.kernel.org/qemu-devel/CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com/#t Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240219173153.12114-2-Jonathan.Cameron@huawei.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/i386: Extract x86_cpu_exec_halt() from accel/tcg/Philippe Mathieu-Daudé2024-01-291-12/+0
| | | | | | | | | | | Move this x86-specific code out of the generic accel/tcg/. Reported-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124101639.30056-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Introduce TCGCPUOps::cpu_exec_halt() handlerPhilippe Mathieu-Daudé2024-01-291-0/+5
| | | | | | | | | | | In order to make accel/tcg/ target agnostic, introduce the cpu_exec_halt() handler. Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124101639.30056-9-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Inline need_replay_interruptRichard Henderson2024-01-291-15/+2
| | | | | | | The function is now trivial, and with inlining we can re-use the calling function's tcg_ops variable. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target/i386: Extract x86_need_replay_interrupt() from accel/tcg/Philippe Mathieu-Daudé2024-01-291-4/+0
| | | | | | | | | | | Move this x86-specific code out of the generic accel/tcg/. Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124101639.30056-8-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Introduce TCGCPUOps::need_replay_interrupt() handlerPhilippe Mathieu-Daudé2024-01-291-3/+5
| | | | | | | | | | | | In order to make accel/tcg/ target agnostic, introduce the need_replay_interrupt() handler. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <20240124101639.30056-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Use CPUState.cc instead of CPU_GET_CLASS in cpu-exec.cRichard Henderson2024-01-291-49/+52
| | | | | | | CPU_GET_CLASS does runtime type checking; use the cached copy of the class instead. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Un-inline icount_exit_request() for clarityPhilippe Mathieu-Daudé2024-01-291-4/+12
| | | | | | | | | | | Convert packed logic to dumb icount_exit_request() helper. No functional change intended. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20240124101639.30056-5-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg/cpu-exec: Use RCU_READ_LOCK_GUARDPhilippe Mathieu-Daudé2024-01-291-3/+1
| | | | | | | | | | Replace the manual rcu_read_(un)lock calls in cpu_exec(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124074201.8239-2-philmd@linaro.org> [rth: Use RCU_READ_LOCK_GUARD not WITH_RCU_READ_LOCK_GUARD] Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cpu-exec: simplify jump cache managementPaolo Bonzini2024-01-291-43/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unless I'm missing something egregious, the jmp cache is only every populated with a valid entry by the same thread that reads the cache. Therefore, the contents of any valid entry are always consistent and there is no need for any acquire/release magic. Indeed ->tb has to be accessed with atomics, because concurrent invalidations would otherwise cause data races. But ->pc is only ever accessed by one thread, and accesses to ->tb and ->pc within tb_lookup can never race with another tb_lookup. While the TranslationBlock (especially the flags) could be modified by a concurrent invalidation, store-release and load-acquire operations on the cache entry would not add any additional ordering beyond what you get from performing the accesses within a single thread. Because of this, there is really nothing to win in splitting the CF_PCREL and !CF_PCREL paths. It is easier to just always use the ->pc field in the jump cache. I noticed this while working on splitting commit 8ed558ec0cb ("accel/tcg: Introduce TARGET_TB_PCREL", 2022-10-04) into multiple pieces, for the sake of finding a more fine-grained bisection result for https://gitlab.com/qemu-project/qemu/-/issues/2092. It does not (and does not intend to) fix that issue; therefore it may make sense to not commit it until the root cause of issue #2092 is found. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240122153409.351959-1-pbonzini@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()Stefan Hajnoczi2024-01-081-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void bql_lock(void) - void bql_unlock(void) - bool bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Fabiano Rosas <farosas@suse.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20240102153529.486531-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* accel/tcg: Remove CF_LAST_IORichard Henderson2023-11-141-1/+1
| | | | | | | | | | | | | | | In cpu_exec_step_atomic, we did not set CF_LAST_IO, which lead to a loop with cpu_io_recompile. But since 18a536f1f8 ("Always require can_do_io") we no longer need a flag to indicate when the last insn should have can_do_io set, so remove the flag entirely. Reported-by: Clément Chigot <chigot@adacore.com> Tested-by: Clément Chigot <chigot@adacore.com> Reviewed-by: Claudio Fontana <cfontana@suse.de> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1961 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Make monitor.c a target-agnostic unitPhilippe Mathieu-Daudé2023-10-041-0/+1
| | | | | | | | | | | | Move target-agnostic declarations from "internal-target.h" to a new "internal-common.h" header. monitor.c now don't include target specific headers and can be compiled once in system_ss[]. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20230914185718.76241-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Rename target-specific 'internal.h' -> 'internal-target.h'Philippe Mathieu-Daudé2023-10-041-1/+1
| | | | | | | | | | | | accel/tcg/internal.h contains target specific declarations. Unit files including it become "target tainted": they can not be compiled as target agnostic. Rename using the '-target' suffix to make this explicit. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20230914185718.76241-9-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Replace CPUState.env_ptr with cpu_env()Richard Henderson2023-10-041-4/+4
| | | | | Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Remove cpu_neg()Richard Henderson2023-10-031-7/+7
| | | | | | | | Now that CPUNegativeOffsetState is part of CPUState, we can reference it directly. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Move can_do_io to CPUNegativeOffsetStateRichard Henderson2023-10-031-1/+1
| | | | | | | | | Minimize the displacement to can_do_io, since it may be touched at the start of each TranslationBlock. It fits into other padding within the substructure. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Have tcg_exec_realizefn() return a booleanPhilippe Mathieu-Daudé2023-10-031-1/+3
| | | | | | | | | | | Following the example documented since commit e3fe3988d7 ("error: Document Error API usage rules"), have tcg_exec_realizefn() return a boolean indicating whether an error is set or not. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20231003123026.99229-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Always set CF_LAST_IO with CF_NOIRQRichard Henderson2023-09-281-1/+1
| | | | | | | | Without this we can get see loops through cpu_io_recompile, in which the cpu makes no progress. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Zero-pad PC in TCG CPU exec trace linesPeter Maydell2023-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit f0a08b0913befbd we changed the type of the PC from target_ulong to vaddr. In doing so we inadvertently dropped the zero-padding on the PC in trace lines (the second item inside the [] in these lines). They used to look like this on AArch64, for instance: Trace 0: 0x7f2260000100 [00000000/0000000040000000/00000061/ff200000] and now they look like this: Trace 0: 0x7f4f50000100 [00000000/40000000/00000061/ff200000] and if the PC happens to be somewhere low like 0x5000 then the field is shown as /5000/. This is because TARGET_FMT_lx is a "%08x" or "%016x" specifier, depending on TARGET_LONG_SIZE, whereas VADDR_PRIx is just PRIx64 with no width specifier. Restore the zero-padding by adding an 016 width specifier to this tracing and a couple of others that were similarly recently changed to use VADDR_PRIx without a width specifier. We can't unfortunately restore the "32-bit guests are padded to 8 hex digits and 64-bit guests to 16 hex digits" behaviour so easily. Fixes: f0a08b0913befbd ("accel/tcg/cpu-exec.c: Widen pc to vaddr") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-id: 20230711165434.4123674-1-peter.maydell@linaro.org
* accel/tcg: Always lock pages before translationRichard Henderson2023-07-151-0/+20
| | | | | | | | | | | We had done this for user-mode by invoking page_protect within the translator loop. Extend this to handle system mode as well. Move page locking out of tb_link_page. Reported-by: Liren Wei <lrwei@bupt.edu.cn> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Richard W.M. Jones <rjones@redhat.com>
* accel/tcg: Split out cpu_exec_longjmp_cleanupRichard Henderson2023-07-151-24/+19
| | | | | | | | | | Share the setjmp cleanup between cpu_exec_step_atomic and cpu_exec_setjmp. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg/cpu-exec.c: Widen pc to vaddrAnton Johansson2023-06-261-17/+17
| | | | | | | Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-7-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* target: Widen pc/cs_base in cpu_get_tb_cpu_stateAnton Johansson2023-06-261-3/+6
| | | | | | | Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-4-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg/cpu-exec: Use generic 'helper-proto-common.h' headerPhilippe Mathieu-Daudé2023-06-201-1/+1
| | | | | | | | | We only need lookup_tb_ptr() prototype. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230611085846.21415-3-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Check for USER_ONLY definition instead of SOFTMMU onePhilippe Mathieu-Daudé2023-06-201-2/+2
| | | | | | | | | | | | Since we *might* have user emulation with softmmu, replace the system emulation check by !user emulation one. Invert some if() ladders for clarity. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* util/log: Add vector registers to logIvan Klokov2023-06-131-0/+3
| | | | | | | | | Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu. Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
* atomics: eliminate mb_read/mb_setPaolo Bonzini2023-06-061-1/+1
| | | | | | | | | | | | | | | qatomic_mb_read and qatomic_mb_set were the very first atomic primitives introduced for QEMU; their semantics are unclear and they provide a false sense of safety. The last use of qatomic_mb_read() has been removed, so delete it. qatomic_mb_set() instead can survive as an optimized qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but rename it to qatomic_set_mb() to match the order of the two operations. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* exec-all: Widen TranslationBlock pc and cs_base to 64-bitsRichard Henderson2023-06-051-1/+1
| | | | | | | | | This makes TranslationBlock agnostic to the address size of the guest. Use vaddr for pc, since that's always a virtual address. Use uint64_t for cs_base, since usage varies between guests. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: include cs_base in our hash calculationsAlex Bennée2023-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tcg: remove the final vestiges of dstateAlex Bennée2023-06-011-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we no longer have dynamic state affecting things we can remove the additional fields in cpu.h and simplify the TB hash calculation. For the benchmark: hyperfine -w 2 -m 20 \ "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \ -machine type=virt,highmem=off \ -display none -m 2048 \ -serial mon:stdio \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=unet \ -device virtio-scsi-pci \ -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \ -device scsi-hd,drive=hd -smp 4 \ -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \ -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \ -snapshot" It has a marginal effect on runtime, before: Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s] Range (min … max): 24.420 s … 32.565 s 20 runs after: Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s] Range (min … max): 21.663 s … 29.937 s 20 runs Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358 Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230526165401.574474-10-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tcg: Remove DEBUG_DISASRichard Henderson2023-05-231-2/+0
| | | | | | | | This had been set since the beginning, is never undefined, and it would seem to be harmful to debugging to do so. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Use one_insn_per_tb global instead of old singlestep globalPeter Maydell2023-05-021-1/+1
| | | | | | | | | | | | | | | | | | | The only place left that looks at the old 'singlestep' global variable is the TCG curr_cflags() function. Replace the old global with a new 'one_insn_per_tb' which is defined in tcg-all.c and declared in accel/tcg/internal.h. This keeps it restricted to the TCG code, unlike 'singlestep' which was available to every file in the system and defined in multiple different places for softmmu vs linux-user vs bsd-user. While we're making this change, use qatomic_read() and qatomic_set() on the accesses to the new global, because TCG will read it without holding a lock. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230417164041.684562-4-peter.maydell@linaro.org
* accel/tcg: Fix jump cache set in cpu_exec_loopRichard Henderson2023-04-041-4/+13
| | | | | | | | | Assign pc and use store_release to assign tb. Fixes: 2dd5b7a1b91 ("accel/tcg: Move jmp-cache `CF_PCREL` checks to caller") Reported-by: Weiwei Li <liweiwei@iscas.ac.cn> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Clear plugin_mem_cbs on TB exitRichard Henderson2023-03-221-4/+1
| | | | | | | | | | | | Do this in cpu_tb_exec (normal exit) and cpu_loop_exit (exception), adjacent to where we reset can_do_io. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1381 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230310195252.210956-2-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230315174331.2959-12-alex.bennee@linaro.org> Reviewed-by: Emilio Cota <cota@braap.org>
* accel/tcg: Replace `tb_pc()` with `tb->pc`Anton Johansson2023-03-011-3/+3
| | | | | | | Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-13-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Move jmp-cache `CF_PCREL` checks to callerAnton Johansson2023-03-011-15/+41
| | | | | | | | | | | | | tb-jmp-cache.h contains a few small functions that only exist to hide a CF_PCREL check, however the caller often already performs such a check. This patch moves CF_PCREL checks from the callee to the caller, and also removes these functions which now only hide an access of the jmp-cache. Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-12-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* accel/tcg: Replace `TARGET_TB_PCREL` with `CF_PCREL`Anton Johansson2023-03-011-4/+4
| | | | | | | Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-5-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* replay: Extract core API to 'exec/replay-core.h'Philippe Mathieu-Daudé2023-02-271-1/+1
| | | | | | | | | | | | | | | | | | | replay API is used deeply within TCG common code (common to user and system emulation). Unfortunately "sysemu/replay.h" requires some QAPI headers for few system-specific declarations, example: void replay_input_event(QemuConsole *src, InputEvent *evt); Since commit c2651c0eaa ("qapi/meson: Restrict UI module to system emulation and tools") the QAPI header defining the InputEvent is not generated anymore. To keep it simple, extract the 'core' replay prototypes to a new "exec/replay-core.h" header which we include in the TCG code that doesn't need the rest of the replay API. Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <20221219170806.60580-5-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* accel/tcg: Restrict 'qapi-commands-machine.h' to system emulationPhilippe Mathieu-Daudé2023-02-271-86/+2
| | | | | | | | | | | | | | Since commit a0e61807a3 ("qapi: Remove QMP events and commands from user-mode builds") we don't generate the "qapi-commands-machine.h" header in a user-emulation-only build. Rename 'hmp.c' as 'monitor.c' and move the QMP functions from cpu-exec.c (which is always compiled) to monitor.c (which is only compiled when system-emulation is selected). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-4-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* exec: Remove unused 'qemu/timer.h' timerPhilippe Mathieu-Daudé2023-02-271-1/+0
| | | | | | Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-2-philmd@linaro.org>
* Don't include headers already included by qemu/osdep.hMarkus Armbruster2023-02-081-1/+0
| | | | | | | | | This commit was created with scripts/clean-includes. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
* cpu-exec: assert that plugin_mem_cbs is NULL after executionEmilio Cota2023-02-021-0/+2
| | | | | | | | | | | | Fixes: #1381 Signed-off-by: Emilio Cota <cota@braap.org> Message-Id: <20230108165107.62488-1-cota@braap.org> [AJB: manually applied follow-up fix] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230124180127.1881110-35-alex.bennee@linaro.org>
* cpu: free cpu->tb_jmp_cache with RCUEmilio Cota2023-02-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the appended use-after-free. The root cause is that during tb invalidation we use CPU_FOREACH, and therefore to safely free a vCPU we must wait for an RCU grace period to elapse. $ x86_64-linux-user/qemu-x86_64 tests/tcg/x86_64-linux-user/munmap-pthread ================================================================= ==1800604==ERROR: AddressSanitizer: heap-use-after-free on address 0x62d0005f7418 at pc 0x5593da6704eb bp 0x7f4961a7ac70 sp 0x7f4961a7ac60 READ of size 8 at 0x62d0005f7418 thread T2 #0 0x5593da6704ea in tb_jmp_cache_inval_tb ../accel/tcg/tb-maint.c:244 #1 0x5593da6704ea in do_tb_phys_invalidate ../accel/tcg/tb-maint.c:290 #2 0x5593da670631 in tb_phys_invalidate__locked ../accel/tcg/tb-maint.c:306 #3 0x5593da670631 in tb_invalidate_phys_page_range__locked ../accel/tcg/tb-maint.c:542 #4 0x5593da67106d in tb_invalidate_phys_range ../accel/tcg/tb-maint.c:614 #5 0x5593da6a64d4 in target_munmap ../linux-user/mmap.c:766 #6 0x5593da6dba05 in do_syscall1 ../linux-user/syscall.c:10105 #7 0x5593da6f564c in do_syscall ../linux-user/syscall.c:13329 #8 0x5593da49e80c in cpu_loop ../linux-user/x86_64/../i386/cpu_loop.c:233 #9 0x5593da6be28c in clone_func ../linux-user/syscall.c:6633 #10 0x7f496231cb42 in start_thread nptl/pthread_create.c:442 #11 0x7f49623ae9ff (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff) 0x62d0005f7418 is located 28696 bytes inside of 32768-byte region [0x62d0005f0400,0x62d0005f8400) freed by thread T148 here: #0 0x7f49627b6460 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52 #1 0x5593da5ac057 in cpu_exec_unrealizefn ../cpu.c:180 #2 0x5593da81f851 (/home/cota/src/qemu/build/qemu-x86_64+0x484851) Signed-off-by: Emilio Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230111151628.320011-2-cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230124180127.1881110-27-alex.bennee@linaro.org>
* tcg: Remove TCG_TARGET_HAS_direct_jumpRichard Henderson2023-01-171-12/+11
| | | | | | | | | We now have the option to generate direct or indirect goto_tb depending on the dynamic displacement, thus the define is no longer necessary or completely accurate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Change tb_target_set_jmp_target argumentsRichard Henderson2023-01-171-3/+8
| | | | | | | Replace 'tc_ptr' and 'addr' with 'tb' and 'n'. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add TranslationBlock.jmp_insn_offsetRichard Henderson2023-01-171-3/+2
| | | | | | | | | | | Stop overloading jmp_target_arg for both offset and address, depending on TCG_TARGET_HAS_direct_jump. Instead, add a new field to hold the jump insn offset and always set the target address in jmp_target_addr[]. This will allow a tcg backend to use either direct or indirect depending on displacement. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>