add downloaded fex bug-reports

author: Christian Krinitsin <mail@krinitsin.com> 2025-07-17 09:10:43 +0200
committer: Christian Krinitsin <mail@krinitsin.com> 2025-07-17 09:10:43 +0200
commit: f2ec263023649e596c5076df32c2d328bc9393d2 (patch)
tree: 5dd86caab46e552bd2e62bf9c4fb1a7504a44db4 /results/scraper/fex/2464
parent: 63d2e9d409831aa8582787234cae4741847504b7 (diff)
1 files changed, 11 insertions, 0 deletions
diff --git a/results/scraper/fex/2464 b/results/scraper/fex/2464
new file mode 100644
index 000000000..8334aaef0
--- /dev/null
+++ b/results/scraper/fex/2464
@@ -0,0 +1,11 @@
+ARM64JIT: Optimize Memset operation.
+When switching over to the new IR operation, the primary concern was about changing IR semantics rather than optimizing.
+
+With an inline constant on the IR operation, we can detect a zero being stored and optimize to `DC ZVA`. We can also apply the same optimization that compilers do and unroll the loop into 128-bit stores on the non-TSO variant. Getting closer to native memset performance is the goal.
+
+An additional step in the future will be a further optimization that uses the new MOPS instructions that ARM provides.
+![cortex-x1c](https://user-images.githubusercontent.com/1018829/222933188-00817ec0-9b10-4029-a7d8-f11da96ac853.png)
+
+[ ] - Loop unroll (only a tight 128-bit loop is needed; that is enough to saturate the store pipelines)
+[ ] - DC ZVA optimization
+[ ] - ARM MOPS implementation
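
The two non-MOPS items in the report can be illustrated with a small portable model. This is only a sketch under stated assumptions, not FEX code: the names `sketch_memset`, `zero_block`, `ZVA_BLOCK` and `VEC_BYTES` are hypothetical, the 64-byte block size is assumed (real hardware reports it in `DCZID_EL0`), and the actual `DC ZVA` instruction is stood in for by a plain 64-byte zeroing so the logic runs on any host. TSO handling is not modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants: assumed DC ZVA block size and 128-bit store width. */
enum { ZVA_BLOCK = 64, VEC_BYTES = 16 };

/* Stand-in for `DC ZVA`: zeroes one block that the caller has aligned. */
static void zero_block(uint8_t *p) {
    memset(p, 0, ZVA_BLOCK);
}

/* Model of the dispatch: zero fast path first, then a 16-byte store loop. */
static void sketch_memset(uint8_t *dst, uint8_t value, size_t len) {
    uint8_t *end = dst + len;

    /* Zero path: only worthwhile when at least one full aligned block fits. */
    if (value == 0 && len >= 2 * ZVA_BLOCK) {
        /* Byte stores until dst reaches a block boundary (at most 63). */
        while (((uintptr_t)dst % ZVA_BLOCK) != 0) {
            *dst++ = 0;
        }
        /* One block-zeroing operation per full aligned block. */
        while ((size_t)(end - dst) >= ZVA_BLOCK) {
            zero_block(dst);
            dst += ZVA_BLOCK;
        }
    }

    /* Tight loop of 16-byte stores; a compiler can lower each copy
       to a single 128-bit vector store. */
    uint8_t pattern[VEC_BYTES];
    memset(pattern, value, sizeof pattern);
    while ((size_t)(end - dst) >= VEC_BYTES) {
        memcpy(dst, pattern, VEC_BYTES);
        dst += VEC_BYTES;
    }

    /* Byte tail for whatever is left. */
    while (dst < end) {
        *dst++ = value;
    }
}
```

The zero path is taken only when the value is known to be zero, which is what the inline constant on the IR operation makes detectable at JIT time; every other case falls through to the 128-bit loop.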