AVX128: Extend 32-bit indices in the case of `VPGATHERDQ ymm1, vm32x, ymm2` For the particular code path that `VPGATHERDQ ymm1, vm32x, ymm2` hits, we can sign-extend the indices provided in vm32x to be 64-bit indices. This only really works because we split the IR operation in to two halves anyway, so 4 32-bit indices in vm32x loading 4 64-bit indices in ymm1 works out to two SVE gather loads using 64-bit indices quite well. We can also then use the scaling optimization that #3805 proposes since the 32-bit signed indices are guaranteed to overflow correctly in this case.