Error in user-mode calculation of ELF aux vector's AT_PHDR


I have an (admittedly strange) statically-linked ELF binary for Linux that runs just fine on top of the Linux kernel in QEMU full-system emulation, but crashes before main in user-mode emulation. Specifically, it crashes when initializing thread-local storage in glibc's _dl_aux_init, because it reads out a strange value from the AT_PHDR entry of the ELF aux vector.

The binary has these program headers:

  Program Headers:
    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    EXIDX          0x065874 0x00075874 0x00075874 0x00570 0x00570 R   0x4
    PHDR           0x0a3000 0x00900000 0x00900000 0x00160 0x00160 R   0x1000
    LOAD           0x0a3000 0x00900000 0x00900000 0x00160 0x00160 R   0x1000
    LOAD           0x000000 0x00010000 0x00010000 0x65de8 0x65de8 R E 0x10000
    LOAD           0x066b7c 0x00086b7c 0x00086b7c 0x02384 0x02384 RW  0x10000
    NOTE           0x000114 0x00010114 0x00010114 0x00044 0x00044 R   0x4
    TLS            0x066b7c 0x00086b7c 0x00086b7c 0x00010 0x00030 R   0x4
    GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x8
    GNU_RELRO      0x066b7c 0x00086b7c 0x00086b7c 0x00484 0x00484 R   0x1
    LOAD           0x07e000 0x00089000 0x00089000 0x03f44 0x03f44 R E 0x1000
    LOAD           0x098000 0x00030000 0x00030000 0x01000 0x01000 RW  0x1000

If I build the Linux kernel with the following patch applied to the very end of create_elf_tables in fs/binfmt_elf.c:

  /* Put the elf_info on the stack in the right place.  */
  elf_addr_t *my_auxv = (elf_addr_t *) mm->saved_auxv;
  int i;
  for (i = 0; i < 15; i++) {
    printk("0x%x = 0x%x", my_auxv[2*i], my_auxv[(2*i)+ 1]);
  }
  if (copy_to_user(sp, mm->saved_auxv, ei_index * sizeof(elf_addr_t)))
      return -EFAULT;
  return 0;

and run it like this:

  qemu-system-arm \
    -M versatilepb \
    -nographic \
    -dtb ./dts/versatile-pb.dtb \
    -kernel zImage \
    -M versatilepb \
    -m 128M \
    -append "earlyprintk=vga,keep" \
    -initrd initramfs

after I've built the kernel initramfs like this (where "init" is the binary in question):

  make ARCH=arm versatile_defconfig
  make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- all -j10
  cp "$1" arch/arm/boot/init
  cd arch/arm/boot
  echo init | cpio -o --format=newc > initramfs

then I get the following output. This is the kernel's view of the aux vector for this binary:

  0x10 = 0x1d7
  0x6 = 0x1000
  0x11 = 0x64
  0x3 = 0x900000
  0x4 = 0x20
  0x5 = 0xb
  0x7 = 0x0
  0x8 = 0x0
  0x9 = 0x101b8
  0xb = 0x0
  0xc = 0x0
  0xd = 0x0
  0xe = 0x0
  0x17 = 0x0
  0x19 = 0xbec62fb5

However, if I run "qemu-arm -g 12345 binary" and use GDB to peek at the aux vector at the beginning of __libc_start_init (for example, using this Python GDB API script: https://gist.github.com/langston-barrett/5573d64ae0c9953e2fa0fe26847a5e1e), then I see the following values:

  AT_PHDR = 0xae000
  AT_PHENT = 0x20
  AT_PHNUM = 0xb
  AT_PAGESZ = 0x1000
  AT_BASE = 0x0
  AT_FLAGS = 0x0
  AT_ENTRY = 0x10230
  AT_UID = 0x3e9
  AT_EUID = 0x3e9
  AT_GID = 0x3e9
  AT_EGID = 0x3e9
  AT_HWCAP = 0x1fb8d7
  AT_CLKTCK = 0x64
  AT_RANDOM = -0x103c0
  AT_HWCAP2 = 0x1f
  AT_NULL = 0x0

The crucial difference is in AT_PHDR (0x3), which is indeed the virtual address of the PHDR segment when the kernel calculates it, but is not when QEMU calculates it.
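The kernel dump above prints raw tag numbers, while the GDB script prints names. As a reading aid, here is a minimal Python sketch that decodes the kernel dump into names so the two listings can be compared entry by entry. The tag numbers are the standard values from the Linux auxvec.h header; the helper name decode_auxv and the variable names are made up for this illustration.

```python
# Tag numbers as defined in the Linux auxvec.h header.
AT_NAMES = {
    0: "AT_NULL", 3: "AT_PHDR", 4: "AT_PHENT", 5: "AT_PHNUM",
    6: "AT_PAGESZ", 7: "AT_BASE", 8: "AT_FLAGS", 9: "AT_ENTRY",
    11: "AT_UID", 12: "AT_EUID", 13: "AT_GID", 14: "AT_EGID",
    16: "AT_HWCAP", 17: "AT_CLKTCK", 23: "AT_SECURE",
    25: "AT_RANDOM", 26: "AT_HWCAP2",
}

# The printk output from the patched kernel, copied from the report.
KERNEL_DUMP = """\
0x10 = 0x1d7
0x6 = 0x1000
0x11 = 0x64
0x3 = 0x900000
0x4 = 0x20
0x5 = 0xb
0x7 = 0x0
0x8 = 0x0
0x9 = 0x101b8
0xb = 0x0
0xc = 0x0
0xd = 0x0
0xe = 0x0
0x17 = 0x0
0x19 = 0xbec62fb5
"""


def decode_auxv(dump):
    """Hypothetical helper: map 'tag = value' lines to {name: value}."""
    entries = {}
    for line in dump.splitlines():
        tag_text, value_text = line.split("=")
        tag = int(tag_text, 16)
        name = AT_NAMES.get(tag, f"AT_<{tag}>")
        entries[name] = int(value_text, 16)
    return entries


kernel_auxv = decode_auxv(KERNEL_DUMP)
for name, value in kernel_auxv.items():
    print(f"{name} = {value:#x}")
```

In the decoded output, the kernel's AT_PHDR is 0x900000, against 0xae000 in the QEMU listing.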

qemu-arm --version
qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.26)

I just confirmed that this is still a problem on git tag v5.0.0, where I applied the following:

  diff --git a/linux-user/elfload.c b/linux-user/elfload.c
  index 619c054cc4..093656d059 100644
  --- a/linux-user/elfload.c
  +++ b/linux-user/elfload.c
  @@ -2016,6 +2016,7 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
      /* There must be exactly DLINFO_ITEMS entries here, or the assert
        * on info->auxv_len will trigger.
        */
  +    printf("PHDR: %x\n", (abi_ulong)(info->load_addr + exec->e_phoff));
      NEW_AUX_ENT(AT_PHDR, (abi_ulong)(info->load_addr + exec->e_phoff));
      NEW_AUX_ENT(AT_PHENT, (abi_ulong)(sizeof (struct elf_phdr)));
      NEW_AUX_ENT(AT_PHNUM, (abi_ulong)(exec->e_phnum));

and saw:

  PHDR: ae000

Taking a peek at how Linux and QEMU calculate AT_PHDR for static binaries reveals the following. Both involve the program headers' offset (e_phoff) added to a value I'll call load_addr (as in the kernel).

In the kernel, load_addr is

  elf_ppnt->p_vaddr - elf_ppnt->p_offset

where elf_ppnt is the program header entry of the first segment with type LOAD: https://github.com/torvalds/linux/blob/242b23319809e05170b3cc0d44d3b4bd202bb073/fs/binfmt_elf.c#L1120

In QEMU, load_addr is set to an earlier value loaddr, which is set to

  min_i(phdr[i].p_vaddr - phdr[i].p_offset)

where min_i is the minimum over the indices "i" of the LOAD segments: https://github.com/qemu/qemu/blob/9e7f1469b9994d910fc1b185c657778bde51639c/linux-user/elfload.c#L2407

If you perform this calculation by hand for the program headers posted at the beginning of this thread, you'll get ae000, as expected.

The problem here is that QEMU takes a minimum where Linux just takes the first value. Presumably, changing QEMU's behavior to match that of the kernel wouldn't break anything that wouldn't be broken if it really ran on Linux. Unfortunately, Linux's ELF loader is much more picky than the ELF standard, but that's a whole other story...
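The hand calculation can be reproduced with a short Python sketch. It is only an illustration of the two formulas described above, not the actual kernel or QEMU code. It assumes that e_phoff equals the file offset of the PHDR segment (0x0a3000), and that the subtraction wraps as an unsigned 32-bit value, as it would for a 32-bit ARM guest. The function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit unsigned guest addresses

# (p_offset, p_vaddr) of each LOAD segment, in program header order,
# copied from the readelf output at the top of this report.
LOAD_SEGMENTS = [
    (0x0A3000, 0x00900000),
    (0x000000, 0x00010000),
    (0x066B7C, 0x00086B7C),
    (0x07E000, 0x00089000),
    (0x098000, 0x00030000),
]

# Assumption: e_phoff is the file offset of the PHDR segment.
E_PHOFF = 0x0A3000


def load_biases(segments):
    """p_vaddr - p_offset for each LOAD segment, with 32-bit wraparound."""
    return [(vaddr - offset) & MASK32 for offset, vaddr in segments]


def at_phdr_first_load(segments, e_phoff):
    """Kernel-style: use the first LOAD segment only."""
    return (load_biases(segments)[0] + e_phoff) & MASK32


def at_phdr_min_load(segments, e_phoff):
    """QEMU-style (as described above): minimum over all LOAD segments."""
    return (min(load_biases(segments)) + e_phoff) & MASK32


print("first LOAD:", hex(at_phdr_first_load(LOAD_SEGMENTS, E_PHOFF)))  # 0x900000
print("minimum:   ", hex(at_phdr_min_load(LOAD_SEGMENTS, E_PHOFF)))    # 0xae000
```

The minimum comes from the LOAD segment at 0x00089000 (0x89000 - 0x7e000 = 0xb000), which is why the two approaches disagree for this binary: its first LOAD segment is not the one with the smallest p_vaddr - p_offset.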

@langston0 Thanks for the detailed explanation. I hit the same problem with qemu-s390x.


To reproduce (Linux kernel >= 4.8, for example on Ubuntu 18.04):
# Register qemu binfmt_misc handlers
$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

$ cat Dockerfile.s390x 
FROM s390x/ubuntu
RUN apt-get update && \
    apt-get install -y \
    gcc make libpcre3-dev libreadline-dev

RUN cd /home && git clone https://github.com/nginx/njs

RUN cd /home/njs && ./configure --cc-opt='-O0 -static -lm -lrt -pthread -Wl,--whole-archive -lpthread -ltinfo -Wl,--no-whole-archive' && make njs

$ docker build -t njs/390x -f Dockerfile.s390x .

# check the binary (WORKS!)
# inside docker s390 binaries are executed using qemu-s390-static from the host
$ docker run  -t njs/390x /home/njs/build/njs -c 'console.log("hello")'
hello

# copy binary to host
$ docker run  -v `pwd`:/m -ti njs/390x cp /home/njs/build/njs /m/njs-s390

# deregister binfmt handler
$ sudo bash -c "echo -1 > /proc/sys/fs/binfmt_misc/qemu-s390x"

# run qemu gdb
$ qemu-s390x  -g 12345 ./njs-s390

# in a separate terminal
$ gdb-multiarch ./njs-s390 -ex 'target remote localhost:12345'
0x0000000001000520 in _start ()
(gdb) si
0x0000000001000524 in _start ()
(gdb) si
0x000000000100052a in _start ()
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x00000000011a418c in _dl_aux_init ()
(gdb) bt
#0  0x00000000011a418c in _dl_aux_init ()
#1  0x00000000011663f0 in __libc_start_main ()
#2  0x0000000001000564 in _start ()

qemu-s390x --version
qemu-s390x version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.28)




BTW, before deregistering the binfmt handler (sudo bash -c "echo -1 > /proc/sys/fs/binfmt_misc/qemu-s390x"), njs-s390 also works on the host:

$ ./njs-s390 -c 'console.log("hello")'
hello

$ file njs-s390
njs-s390: ELF 64-bit MSB executable, IBM S/390, version 1 (GNU/Linux), statically linked, BuildID[sha1]=e37618578fb0a8c60f426826167a800e4f314ef3, for GNU/Linux 3.2.0, with debug_info, not stripped

> runs just fine on top of the Linux kernel in QEMU full-system emulation, but crashes before main in user-mode emulation

So it seems that system vs. user-mode is not the issue here. It is probably related to the GDB mode of user-mode QEMU.

@Dimitry To confirm that this is really the same issue (and not an unrelated crash in the same function), could you post:

 1. the ELF headers ("readelf -h"),
 2. the program headers ("readelf -l"), and
 3. the output (the AUX VECTOR section) from this GDB script (suitably modified for your program), when connecting to QEMU's GDB server? https://gist.github.com/langston-barrett/5573d64ae0c9953e2fa0fe26847a5e1e

@Langston Will do tomorrow. The s390x ABI requires heavy changes to the Python script.

When I switch to armv7, the issue goes away:

$ cat Dockerfile.armv7 
FROM arm32v7/ubuntu
RUN apt-get update && \
    apt-get install -y \
    gcc make libpcre3-dev libreadline-dev git

RUN cd /home && git clone https://github.com/nginx/njs

RUN cd /home/njs && ./configure --cc-opt='-O0 -static -lm -lrt -pthread -Wl,--whole-archive -lpthread -ltinfo -Wl,--no-whole-archive' && make njs

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
$ docker build -t njs/armv7 -f Dockerfile.armv7 .
$ docker run -v `pwd`:/m -ti njs/armv7 cp /home/njs/build/njs /m/njs-armv7

$ readelf -l ./njs-armv7

Elf file type is EXEC (Executable file)
Entry point 0x12fb9
There are 7 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  EXIDX          0x1be338 0x001ce338 0x001ce338 0x009b8 0x009b8 R   0x4
  LOAD           0x000000 0x00010000 0x00010000 0x1becf4 0x1becf4 R E 0x10000
  LOAD           0x1bedfc 0x001dedfc 0x001dedfc 0x17674 0x1c2cc RW  0x10000
  NOTE           0x000114 0x00010114 0x00010114 0x00044 0x00044 R   0x4
  TLS            0x1bedfc 0x001dedfc 0x001dedfc 0x00038 0x00060 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x1bedfc 0x001dedfc 0x001dedfc 0x0e204 0x0e204 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     .ARM.exidx 
   01     .note.ABI-tag .note.gnu.build-id .rel.dyn .init .iplt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata .stapsdt.base __libc_subfreeres __libc_IO_vtables __libc_atexit __libc_thread_subfreeres .ARM.extab .ARM.exidx .eh_frame 
   02     .tdata .init_array .fini_array .data.rel.ro .got .data .bss __libc_freeres_ptrs 
   03     .note.ABI-tag .note.gnu.build-id 
   04     .tdata .tbss 
   05     
   06     .tdata .init_array .fini_array .data.rel.ro 

$ readelf -h ./njs-armv7
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 03 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x12fb9
  Start of program headers:          52 (bytes into file)
  Start of section headers:          5696248 (bytes into file)
  Flags:                             0x5000400, Version5 EABI, hard-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         7
  Size of section headers:           40 (bytes)
  Number of section headers:         42
  Section header string table index: 41

$ qemu-arm -g 12345 ./njs-armv7 -c 'console.log("HH")'

$ gdb-multiarch ./njs-armv7 -ex 'source showstack.py'
ARGUMENTS
---------
argc = 3
arg 0 = ./njs-armv7
arg 1 = -c
arg 2 = console.log("HH")

...

AUX VECTOR
----------
AT_PHDR = 10034
AT_PHENT = 20
AT_PHNUM = 7
AT_PAGESZ = 1000
AT_BASE = 0
AT_FLAGS = 0
AT_ENTRY = 12fb9
AT_UID = 3e9
AT_EUID = 3e9
AT_GID = 3e9
AT_EGID = 3e9
AT_HWCAP = 1fb8d7
AT_CLKTCK = 64
AT_RANDOM = -104a0
AT_HWCAP2 = 1f
AT_NULL = 0

$ qemu-arm --version
qemu-arm version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.28)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
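For comparison, the same two AT_PHDR formulas from the original report can be applied to the njs-armv7 program headers above. This is a minimal Python sketch with made-up variable names, using e_phoff = 52 from the "readelf -h" output. It suggests why this binary is not affected: the first LOAD segment also has the smallest p_vaddr - p_offset, so both formulas agree.

```python
# (p_offset, p_vaddr) of the two LOAD segments of njs-armv7,
# copied from the "readelf -l" output above.
ARMV7_LOADS = [
    (0x000000, 0x00010000),
    (0x1BEDFC, 0x001DEDFC),
]

# "Start of program headers: 52 (bytes into file)" from "readelf -h".
ARMV7_E_PHOFF = 52

# p_vaddr - p_offset for each LOAD segment.
armv7_biases = [vaddr - offset for offset, vaddr in ARMV7_LOADS]

# Kernel-style: first LOAD segment. QEMU-style (per the report): minimum.
armv7_first = armv7_biases[0] + ARMV7_E_PHOFF
armv7_min = min(armv7_biases) + ARMV7_E_PHOFF

print("first LOAD:", hex(armv7_first))
print("minimum:   ", hex(armv7_min))
```

Both values come out as 0x10034, which matches the AT_PHDR = 10034 entry in the aux vector dump above.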

I built the latest QEMU, and the issue goes away:


$ bin/debug/native/s390x-linux-user/qemu-s390x --version
qemu-s390x version 5.0.50 (v5.0.0-2358-g6c87d9f311-dirty)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers

$ bin/debug/native/s390x-linux-user/qemu-s390x ../njs/njs-s390 -c 'console.log("HI")'
HI

So my issue seems unrelated, sorry for bothering.

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket on
Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket opened, then please switch
the state back to "New" within the next 60 days (otherwise it will get
closed as "Expired"). We will then eventually migrate the ticket
automatically to the new system (but you won't be the reporter of the bug
in the new system and thus won't get notified on changes anymore).

Thank you and sorry for the inconvenience.



This is an automated cleanup. This bug report has been moved to QEMU's
new bug tracker on gitlab.com and thus gets marked as 'expired' now.
Please continue with the discussion here:

 https://gitlab.com/qemu-project/qemu/-/issues/275