summary refs log tree commit diff stats
path: root/results/scraper/fex/3657
blob: be52587d3d3585bc6f654855a3e23f2e01cdbb67 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Redundant context loads when distinct pshufd's in a block
bytemark has a sequence like:

```
        "pshufd xmm2,xmm3,0xf5",
        "addpd  xmm1,xmm12",
        "pmuludq xmm3,xmm9",
        "pshufd xmm3,xmm3,0xe8",
```

because the pshufd indices differ, we can't cache the constant. and we end up generating code

```
        "ldr x0, [x28, #1760]",
        "ldr q3, [x0, #3920]",
        "tbl v18.16b, {v19.16b}, v3.16b",
        "fadd v17.2d, v17.2d, v28.2d",
        "uzp1 v4.4s, v19.4s, v19.4s",
        "uzp1 v5.4s, v25.4s, v25.4s",
        "umull v19.2d, v4.2s, v5.2s",
        "ldr x0, [x28, #1760]",
        "ldr q4, [x0, #3712]",
        "tbl v19.16b, {v19.16b}, v4.16b",
```

Notice that we reload the base vector addresses in the same block, instead of caching it. At minimum we should probably fix that. But I'm also questioning if there's maybe a better way to do pshufd? Not sure