graphic: 0.835
performance: 0.790
architecture: 0.686
device: 0.621
mistranslation: 0.440
semantic: 0.435
network: 0.407
ppc: 0.388
vnc: 0.371
files: 0.370
socket: 0.322
user-level: 0.315
VMM: 0.292
i386: 0.284
permissions: 0.261
TCG: 0.260
x86: 0.256
kernel: 0.251
risc-v: 0.221
arm: 0.215
virtual: 0.214
register: 0.212
PID: 0.199
boot: 0.197
peripherals: 0.177
hypervisor: 0.170
debug: 0.132
KVM: 0.122
assembly: 0.066

Performance Regression in QEMU (amd64 Emulating LoongArch64) from 8.0.4 to 9.0.2
Description of problem:
Previous Performance: In May 2023, we were using QEMU 8.0.4 for qemu-user emulation, and the performance was satisfactory. The setup did not include LSX (Loongson SIMD Extensions) support in either the system or QEMU. Performance results are shown in Figure A.

Current Performance: We recently upgraded to QEMU 9.0.2. Both the system and QEMU now support vectorized instruction sets. However, performance has dropped to less than 60% of the previous benchmark results; see Figure B.

We found that the slowdown is not caused by LSX: disabling LSX compilation in the new version makes performance even worse. That said, the two setups do differ significantly in LSX support.
Additional information:
We use systemd-nspawn and qemu-binfmt for containerized operations. You can get a clean chroot from the lcpu release [here](https://github.com/lcpu-club/loongarchlinux-dockerfile/releases/download/20240806/base-devel-loong64-20240806.tar.zst).
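
To help compare the two versions outside the container, here is a minimal timing sketch. It is not the benchmark behind the figures: it assumes wall-clock time is an acceptable proxy for the benchmark score, and the QEMU binary paths and guest program passed on the command line are placeholders you supply. The function names (`time_command`, `relative_speed`, `compare`) are made up for this illustration.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Return the median wall-clock seconds over `runs` executions of argv."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the guest's output so terminal I/O does not skew the timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def relative_speed(old_seconds, new_seconds):
    """Speed of the new build as a fraction of the old one (1.0 = equal)."""
    return old_seconds / new_seconds


def compare(old_qemu, new_qemu, guest_argv, runs=3):
    """Time the same guest command under two qemu-user binaries."""
    old_t = time_command([old_qemu, *guest_argv], runs)
    new_t = time_command([new_qemu, *guest_argv], runs)
    return {"old_s": old_t, "new_s": new_t,
            "relative_speed": relative_speed(old_t, new_t)}


def main(argv=None):
    # Assumed invocation (all three arguments are placeholders):
    #   compare.py <old qemu-user binary> <new qemu-user binary> <guest program> [args...]
    args = sys.argv[1:] if argv is None else argv
    old_qemu, new_qemu, *guest_argv = args
    print(compare(old_qemu, new_qemu, guest_argv))
```

Calling `main()` with the 8.0.4 and 9.0.2 qemu-user binaries and the same LoongArch64 guest program prints both median times and their ratio; a `relative_speed` below 0.6 would correspond to the regression described above.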

Figure A, performance in May 2023
![Figure A](/uploads/037647ca85b0fe4fd9d77a3c91b29e7d/微信图片_20240808124226.jpg)

Figure B, current performance
![Figure B](/uploads/d5837ea77ca5a998fa1ca45070862331/微信图片_20240808124233.jpg)