AVX128: Prescale gathers with 64-bit index and scale factor of 2 or 4
With 64-bit indexes, overflow in the scaled address calculation simply throws away the top bits. So instead of falling back to the ASIMD implementation when the scale factor is 2 or 4, prescale the index values in the vector registers and then pass a scale of 1 to the IR operation. This lets the gather keep using the faster SVE codepath while only eating two more instructions compared to a scale of 1 or 8.
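
As a rough illustration of why the prescale is safe, here is a minimal scalar sketch, not the actual FEX code. The helper names `GatherAddress`, `Prescale`, and `PrescaleMatches` are hypothetical, and the per-lane vector shift is modelled as a plain 64-bit shift. Unsigned arithmetic is used so the wraparound is well defined in C++; the bit pattern is the same as for a signed 64-bit index.

```cpp
#include <cstdint>

// Hypothetical reference model: address of one gather lane,
// Base + Index * Scale, wrapping modulo 2^64.
constexpr uint64_t GatherAddress(uint64_t Base, uint64_t Index, uint64_t Scale) {
  return Base + Index * Scale;
}

// Hypothetical model of the prescale step: a left shift of the index lane.
// Only scale factors 2 and 4 are handled here (shift by 1 or 2).
constexpr uint64_t Prescale(uint64_t Index, unsigned Scale) {
  const unsigned Shift = (Scale == 2) ? 1 : 2;
  return Index << Shift; // bits shifted out the top are discarded
}

// True when prescaling and then using scale 1 yields the same address
// as applying the original scale directly.
constexpr bool PrescaleMatches(uint64_t Base, uint64_t Index, unsigned Scale) {
  return GatherAddress(Base, Index, Scale) ==
         GatherAddress(Base, Prescale(Index, Scale), 1);
}

// Compile-time spot checks, including an index of -1 that wraps.
static_assert(PrescaleMatches(0x1000, 3, 2));
static_assert(PrescaleMatches(0x1000, 0xFFFF'FFFF'FFFF'FFFFULL, 4));
```

The same equivalence would not hold for 32-bit indexes, where the index is widened to 64 bits before scaling, so a shift done in a 32-bit lane would lose bits that the real address calculation keeps.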