author Christian Krinitsin <mail@krinitsin.com> 2025-07-17 09:10:43 +0200
committer Christian Krinitsin <mail@krinitsin.com> 2025-07-17 09:10:43 +0200
commit f2ec263023649e596c5076df32c2d328bc9393d2
tree 5dd86caab46e552bd2e62bf9c4fb1a7504a44db4 /results/scraper/fex/4535
parent 63d2e9d409831aa8582787234cae4741847504b7
add downloaded fex bug-reports
Diffstat (limited to 'results/scraper/fex/4535')
-rw-r--r-- results/scraper/fex/4535 | 9
1 files changed, 9 insertions, 0 deletions
diff --git a/results/scraper/fex/4535 b/results/scraper/fex/4535
new file mode 100644
index 000000000..916cee444
--- /dev/null
+++ b/results/scraper/fex/4535
@@ -0,0 +1,9 @@
+Work towards reducing codegen for Interpreter fallbacks
+Remove the ABI spilling and related handling from the JIT blocks themselves and move it to the Dispatcher. This will alleviate some of the problems that #4486 is hitting, but it is not expected to remove them.
+
+A bit counter-intuitively, generating a bunch of ABI handling code inline in the JIT is /worse/ for performance than one might expect. It's better to generate all this code in the dispatcher and then eat an additional branch to jump to it. This is something that I found out years ago in the Dolphin JIT.
+In a multithreaded environment (without code sharing) it also means that we only have this ABI handling consuming cachelines from one memory region, making it more likely to already be in L2 for cores.
+
+I already started working towards this a couple months ago when I was keeping FPRs in vector registers for ABI calls, which reduces the burden of putting things in GPRs (especially because of win32 ABI).
+
+The basic gist is to move all the ABI handling for the 16 different fallback ABIs to the dispatcher, so that the JIT only needs to do some /very/ minor work instead. This should cut the number of instructions that the JIT needs to emit from 34-60ish to something like less than a dozen. I haven't implemented it yet, so I don't have hard numbers.
\ No newline at end of file
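
The trade-off described in the report (inline ABI handling at every fallback call site versus one shared handler per fallback ABI in the dispatcher) can be illustrated with a rough instruction-count model. This is only a sketch and not FEX code: the struct, the function names, and the assumed size of a shared handler are all hypothetical. The per-call-site figures come from the report's own estimates (34-60ish inline, under a dozen after the change, 16 fallback ABIs). It counts emitted instructions only and does not model the cache-line effects the report mentions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cost model for illustration only; none of these names exist in FEX.
struct FallbackCost {
  std::size_t inline_insts_per_call; // ABI handling emitted at each call site (report estimate: 34-60ish)
  std::size_t stub_insts_per_call;   // call-site work left in the JIT after the move (report estimate: under a dozen)
  std::size_t handler_count;         // number of fallback ABIs (report: 16)
  std::size_t handler_insts;         // ASSUMED size of one shared dispatcher handler
};

// Total instructions emitted when every call site carries its own ABI handling.
constexpr std::size_t InlineTotal(const FallbackCost& c, std::size_t call_sites) {
  return call_sites * c.inline_insts_per_call;
}

// Total instructions when handlers are emitted once in the dispatcher and
// each call site only emits a small stub that branches to a handler.
constexpr std::size_t SharedTotal(const FallbackCost& c, std::size_t call_sites) {
  return c.handler_count * c.handler_insts + call_sites * c.stub_insts_per_call;
}

// Smallest number of call sites at which the shared layout emits strictly
// fewer instructions than the inline layout.
inline std::size_t BreakEvenCallSites(const FallbackCost& c) {
  assert(c.inline_insts_per_call > c.stub_insts_per_call);
  const std::size_t fixed = c.handler_count * c.handler_insts;
  const std::size_t saved = c.inline_insts_per_call - c.stub_insts_per_call;
  return fixed / saved + 1;
}
```

With the low end of the report's estimates and an assumed 34-instruction handler, `FallbackCost{34, 12, 16, 34}`, the one-time cost of the 16 shared handlers is paid back once a thread has emitted 25 fallback call sites. Beyond that point each further call site saves the difference between the inline and stub sizes.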