graphic: 0.835 performance: 0.790 architecture: 0.686 device: 0.621 mistranslation: 0.440 semantic: 0.435 network: 0.407 ppc: 0.388 vnc: 0.371 files: 0.370 socket: 0.322 user-level: 0.315 VMM: 0.292 i386: 0.284 permissions: 0.261 TCG: 0.260 x86: 0.256 kernel: 0.251 risc-v: 0.221 arm: 0.215 virtual: 0.214 register: 0.212 PID: 0.199 boot: 0.197 peripherals: 0.177 hypervisor: 0.170 debug: 0.132 KVM: 0.122 assembly: 0.066

Performance Regression in QEMU (amd64 Emulating LoongArch64) from 8.0.4 to 9.0.2

Description of problem:

Previous Performance: In May 2023, we were using QEMU 8.0.4 for qemu-user emulation, and the performance was satisfactory. Neither the system nor QEMU included LSX (Loongson SIMD Extensions) support. Performance results are shown in Figure A.

Current Performance: Recently, we upgraded to QEMU 9.0.2. Both the system and QEMU now support vector instructions. However, performance has dropped to less than 60% of the previous benchmark results (Figure B). We found that the slowdown is not caused by LSX: disabling LSX compilation in the new version results in even worse performance. That said, the two setups do differ significantly in their LSX support.

Additional information:

We use systemd-nspawn and qemu-binfmt for containerized operations. A clean chroot is available from the lcpu release [here](https://github.com/lcpu-club/loongarchlinux-dockerfile/releases/download/20240806/base-devel-loong64-20240806.tar.zst).

Figure A, performance in May 2023:

![Figure A](/uploads/037647ca85b0fe4fd9d77a3c91b29e7d/微信图片_20240808124226.jpg)

Figure B, current performance:

![Figure B](/uploads/d5837ea77ca5a998fa1ca45070862331/微信图片_20240808124233.jpg)
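
To put a number on the slowdown outside of screenshots, a small timing harness along the following lines might help. This is only a sketch under assumptions: the install paths, the sysroot directory, and the `./bench` workload named in the comments are hypothetical placeholders and are not taken from this report. It runs the same LoongArch64 binary under two `qemu-loongarch64` builds and reports the new build's throughput as a fraction of the old one.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Return the median wall-clock time (seconds) of running cmd `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def relative_throughput(old_seconds, new_seconds):
    """Throughput of the new build as a fraction of the old one (1.0 = unchanged)."""
    return old_seconds / new_seconds


def compare(old_qemu, new_qemu, sysroot, workload, repeats=5):
    """Time the same guest binary under two qemu-user builds and return the ratio."""
    # -L points qemu-user at the guest sysroot so the ELF interpreter and
    # shared libraries are resolved there.
    old = time_command([old_qemu, "-L", sysroot, workload], repeats)
    new = time_command([new_qemu, "-L", sysroot, workload], repeats)
    return relative_throughput(old, new)


# Hypothetical invocation (all four paths are placeholders):
#   python compare_qemu.py /opt/qemu-8.0.4/bin/qemu-loongarch64 \
#       /opt/qemu-9.0.2/bin/qemu-loongarch64 ./loong64-root ./bench
if __name__ == "__main__" and len(sys.argv) == 5:
    ratio = compare(*sys.argv[1:5])
    print(f"new build runs at {ratio:.0%} of the old build's speed")
```

A result below 60% would match the regression described here. Using the median of several runs is a deliberate choice to dampen noise from translation-cache warm-up and host scheduling. The same harness could also drive a `git bisect run` between the two releases, with the caveat that a suitable pass/fail threshold would have to be chosen by hand.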