diff options
Diffstat (limited to 'results/classifier/deepseek-2-tmp/output/performance/1740219')
| -rw-r--r-- | results/classifier/deepseek-2-tmp/output/performance/1740219 | 60 |
1 files changed, 0 insertions, 60 deletions
diff --git a/results/classifier/deepseek-2-tmp/output/performance/1740219 b/results/classifier/deepseek-2-tmp/output/performance/1740219 deleted file mode 100644 index 84c708fe4..000000000 --- a/results/classifier/deepseek-2-tmp/output/performance/1740219 +++ /dev/null @@ -1,60 +0,0 @@ - -static linux-user ARM emulation has several-second startup time - -static linux-user emulation has several-second startup time - -My problem: I'm a Parabola packager, and I'm updating our -qemu-user-static package from 2.8 to 2.11. With my new -statically-linked 2.11, running `qemu-arm /my/arm-chroot/bin/true` -went from taking 0.006s to 3s! This does not happen with the normal -dynamically linked 2.11, or the old static 2.8. - -What happens is it gets stuck in -`linux-user/elfload.c:init_guest_space()`. What `init_guest_space` -does is map 2 parts of the address space: `[base, base+guest_size]` -and `[base+0xffff0000, base+0xffff0000+page_size]`; where it must find -an acceptable `base`. Its strategy is to `mmap(NULL, guest_size, -...)` decide where the first range is, and then check if that -+0xffff0000 is also available. If it isn't, then it starts trying -`mmap(base, ...)` for the entire address space from low-address to -high-address. - -"Normally," it finds an accaptable `base` within the first 2 tries. -With a static 2.11, it's taking thousands of tries. - ----- - -Now, from my understanding, there are 2 factors working together to -cause that in static 2.11 but not the other builds: - - - 2.11 increased the default `guest_size` from 0xf7000000 to 0xffff0000 - - PIE (and thus ASLR) is disabled for static builds - -For some reason that I don't understand, with the smaller -`guest_size` the initial `mmap(NULL, guest_size, ...)` usually -returns an acceptable address range; but larger `guest_size` makes it -consistently return a block of memory that butts right up against -another already mapped chunk of memory. This isn't just true on the -older builds, it's true with the 2.11 builds if I use the `-R` flag to -shrink the `guest_size` back down to 0xf7000000. That is with -linux-hardened 4.13.13 on x86-64. - -So then, it it falls back to crawling the entire address space; so it -tries base=0x00001000. With ASLR, that probably succeeds. But with -ASLR being disabled on static builds, the text segment is at -0x60000000; which is does not leave room for the needed -0xffff1000-size block before it. So then it tries base=0x00002000. -And so on, more than 6000 times until it finally gets to and passes -the text segment; calling mmap more than 12000 times. - ----- - -I'm not sure what the fix is. Perhaps try to mmap a continuous chunk -of size 0xffff1000, then munmap it and then mmap the 2 chunks that we -actually need. The disadvantage to that is that it does not support -the sparse address space that the current algorithm supports for -`guest_size < 0xffff0000`. If `guest_size < 0xffff0000` *and* the big -mmap fails, then it could fall back to a sparse search; though I'm not -sure the current algorithm is a good choice for it, as we see in this -bug. Perhaps it should inspect /proc/self/maps to try to find a -suitable range before ever calling mmap? \ No newline at end of file |