summary refs log tree commit diff stats
path: root/results/scraper/fex/3788
diff options
context:
space:
mode:
Diffstat (limited to 'results/scraper/fex/3788')
-rw-r--r--results/scraper/fex/378821
1 files changed, 21 insertions, 0 deletions
diff --git a/results/scraper/fex/3788 b/results/scraper/fex/3788
new file mode 100644
index 000000000..8f03370fd
--- /dev/null
+++ b/results/scraper/fex/3788
@@ -0,0 +1,21 @@
+AVX128: 256-bit stores to memory can be converted to store pairs
+Example:

+```json

+    "vmovdqa [rax], ymm0": {

+      "ExpectedInstructionCount": 3,

+      "Comment": [

+        "Map 1 0b01 0x7f 128-bit"

+      ],

+      "ExpectedArm64ASM": [

+        "ldr q2, [x28, #16]",

+        "str q16, [x4]",

+        "str q2, [x4, #16]"

+      ]

+    },

+ ```

+ 

+A little quirky since some care should be taken for the address calculation overhead, but today we currently implement these stores as two 128-bit stores. When vector TSO emulation is enabled, this turns in to dmb+str+dmb+str, if we were using pair stores this would turn in to a single dmb+stp pairing, eating a little bit of ALU work if we can't synthesize the SIB addressing directly.

+

+Additionally FEAT_LRCPC3 adds the STLUR and LDAPUR x86-TSO instructions, which we would probably want to choose to use in the backend instead when Vector TSO is enabled. For example the `dmb+stp` instruction pairing would turn in to `stlur+stlur` since that would be more efficient in that case. Similar story for the load side, but we don't have a good solution for returning a pair of registers atm. 

+

+No hardware supports FEAT_LRCPC3 today so that particular edge case isn't very interesting.
\ No newline at end of file