Diffstat (limited to 'results/scraper/fex/2464')
| -rw-r--r-- | results/scraper/fex/2464 | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/results/scraper/fex/2464 b/results/scraper/fex/2464
new file mode 100644
index 000000000..8334aaef0
--- /dev/null
+++ b/results/scraper/fex/2464
@@ -0,0 +1,11 @@
+ARM64JIT: Optimize Memset operation.
+When switching over to the new IR operation, the primary concern was about changing IR semantics rather than optimizing.
+
+With inline constant on the IR operation we can detect zero being stored and optimize to `DC ZVA`, but also we can do the same optimization that compilers do and unwind the loop to 128-bit stores on the non-TSO variant. Getting closer to native memset perf is ideal.
+
+An additional step in the future will be to have an additional optimization in order to use the new MOPS instructions that ARM provides.
+
+
+[ ] - Loop unwind (Only need to have a tight 128-bit loop. Enough to saturate the store pipelines)
+[ ] - DC ZVA optimization
+[ ] - ARM MOPS implementation
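The "tight 128-bit loop" item in the issue text can be modeled in portable C++ as a rough sketch. This is not FEX's JIT code: `Memset128Sketch` is a hypothetical name, and the 16-byte `memcpy` only stands in for a single 128-bit vector store. The `DC ZVA` and MOPS paths need AArch64 instructions and are not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, for illustration only.
// Fills `size` bytes at `dst` with `value` in 16-byte (128-bit) chunks,
// then finishes the remainder one byte at a time.
void Memset128Sketch(void* dst, std::uint8_t value, std::size_t size) {
  assert(dst != nullptr || size == 0);
  auto* p = static_cast<unsigned char*>(dst);

  // Splat the byte across 16 bytes, the analogue of filling a vector register.
  unsigned char pattern[16];
  for (unsigned char& b : pattern) {
    b = value;
  }

  // Tight loop: one 16-byte store per iteration.
  while (size >= sizeof(pattern)) {
    std::memcpy(p, pattern, sizeof(pattern));
    p += sizeof(pattern);
    size -= sizeof(pattern);
  }

  // Tail: fewer than 16 bytes remain.
  while (size > 0) {
    *p++ = value;
    --size;
  }
}
```

A real implementation would emit the vector stores directly from the JIT. When the inline constant is zero, it could swap the main loop for a block-zeroing instruction once the destination is suitably aligned.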