Improve performance of the register allocator The register allocator becomes a fairly large time sink in large functions. This needs to be improved quite substantially. Additionally the cmpxchg instruction is going to add a paired register class which will add register class interference testing, which could drive up CPU usage more.