summary refs log tree commit diff stats
path: root/results/scraper/launchpad/1594394
blob: b984e544d3e201906db04a13a546de8063d1f8f1 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
Using setreuid / setegid crashes x86_64 user-mode target

When setreuid() or setegid() are called from x86_64 target code in user mode, qemu crashes inside the NPTL signal handlers.  x86 targets do not directly use a syscall to handle setreuid() / setegid(); instead the x86 NPTL implementation sets up a temporary data region in memory (__xidcmd) and issues a signal (SIGRT1) to all threads, allowing the handler for that signal to issue the syscall.  Under qemu, __xidcmd remains null (see variable display below backtrace).

Backtrace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3fff85c74fc0 (LWP 74517)]
0x000000006017491c in sighandler_setxid (sig=33, si=0x3fff85c72d08, ctx=0x3fff85c71f90) at nptl-init.c:263
263     nptl-init.c: No such file or directory.
(gdb) thread apply all bt

Thread 3 (Thread 0x3fff87e8efc0 (LWP 74515)):
#0  0x00000000601cc430 in syscall ()
#1  0x0000000060109080 in futex_wait (val=<optimized out>, ev=<optimized out>) at /build/qemu/util/qemu-thread-posix.c:292
#2  qemu_event_wait (ev=0x62367bb0 <rcu_call_ready_event>) at /build/qemu/util/qemu-thread-posix.c:399
#3  0x000000006010f73c in call_rcu_thread (opaque=<optimized out>) at /build/qemu/util/rcu.c:250
#4  0x0000000060176f8c in start_thread (arg=0x3fff87e8efc0) at pthread_create.c:336
#5  0x00000000601cebf4 in clone ()

Thread 2 (Thread 0x3fff85c74fc0 (LWP 74517)):
#0  0x000000006017491c in sighandler_setxid (sig=33, si=0x3fff85c72d08, ctx=0x3fff85c71f90) at nptl-init.c:263
#1  <signal handler called>
#2  0x00000000601cc42c in syscall ()
#3  0x0000000060044b08 in safe_futex (val3=<optimized out>, uaddr2=0x0, timeout=<optimized out>, val=<optimized out>, op=128, uaddr=<optimized out>) at /build/qemu/linux-user/syscall.c:748
#4  do_futex (val3=<optimized out>, uaddr2=275186650880, timeout=0, val=1129, op=128, uaddr=275186651116) at /build/qemu/linux-user/syscall.c:6201
#5  do_syscall (cpu_env=0x1000abfd350, num=<optimized out>, arg1=275186651116, arg2=<optimized out>, arg3=1129, arg4=0, arg5=275186650880, arg6=<optimized out>, arg7=0, arg8=0)
    at /build/qemu/linux-user/syscall.c:10651
#6  0x00000000600347b8 in cpu_loop (env=0x1000abfd350) at /build/qemu/linux-user/main.c:317
#7  0x0000000060036ae0 in clone_func (arg=0x3fffc4c2ca38) at /build/qemu/linux-user/syscall.c:5445
#8  0x0000000060176f8c in start_thread (arg=0x3fff85c74fc0) at pthread_create.c:336
#9  0x00000000601cebf4 in clone ()

Thread 1 (Thread 0x1000aa05000 (LWP 74511)):
#0  0x00000000601cc430 in syscall ()
#1  0x0000000060044b08 in safe_futex (val3=<optimized out>, uaddr2=0x0, timeout=<optimized out>, val=<optimized out>, op=128, uaddr=<optimized out>) at /build/qemu/linux-user/syscall.c:748
#2  do_futex (val3=<optimized out>, uaddr2=1, timeout=0, val=1, op=128, uaddr=275078324992) at /build/qemu/linux-user/syscall.c:6201
#3  do_syscall (cpu_env=0x1000aa23890, num=<optimized out>, arg1=275078324992, arg2=<optimized out>, arg3=1, arg4=0, arg5=1, arg6=<optimized out>, arg7=0, arg8=0) at /build/qemu/linux-user/syscall.c:10651
#4  0x00000000600347b8 in cpu_loop (env=0x1000aa23890) at /build/qemu/linux-user/main.c:317
#5  0x00000000600020e4 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /build/qemu/linux-user/main.c:4779
(gdb) p __xidcmd
$1 = (struct xid_command *) 0x0

https://patches.linaro.org/patch/63313/ may be relevant here.


Whoops, I meant http://patchwork.ozlabs.org/patch/590640/.


Sounds very relevant, yes.  Thanks for the link!

What's the status of this? This is causing aptitude in Debian chroots to reliably segfault under qemu-arm-user.

And to confirm, while somewhat of a hack, that patch does indeed fix aptitude as expected (as well as my minimal test case I wrote when debugging this before stumbling upon this bug report).

I guess this is happening to my attempts to use pbuilder for cross-building armel / mipsel packages

Did anything ever happen here? Trying to upgrade Ubuntu ARM container images using qemu-user on x86-64 from bionic to focal..

Steve, could you describe your problem with more details?
What is the version of qemu you are using?

Sorry, lost your reply in amongst the chaos of my life! OK, quick test case (type at command line, don't run as script!), host arch is x86-64, you need qemu-user-static installed..

wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-armhf-root.tar.xz
sudo -s
mkdir armcont
cd armcont
tar xf ../bionic-server-cloudimg-armhf-root.tar.xz
cp /usr/bin/qemu-arm-static armcont/usr/bin/
rm armcont/etc/resolv.conf; cp /etc/resolv.conf armcont/etc/
systemd-nspawn -D armcont/
do-release-upgrade -d # may need to drop the "-d" once 20.04.1 is released

Yields:

qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x601540af


(Need a "cd .." after the tar, doh.)

Actually, this is possibly not the same bug. I will add to the list to investigate further..

Sorry, heat is definitely getting to me - just realized this is an upstream bug, not an Ubuntu one! Ubuntu package version is 1:2.11+dfsg-1ubuntu7.26. At some point I will look at building qemu from source, but won't be an immediate thing. Apologies for noise .. though if anyone knows off-hand that a fix for this did or didn't get merged that would useful ..

OK, messing around with https://hub.docker.com/r/multiarch/ubuntu-core for quick testing, it looks like modern versions of qemu do not have this bug - I used the test case from Bug #1815911 (which I think is probably a duplicate) and cannot reproduce with:

REPOSITORY                            TAG           IMAGE ID      CREATED         SIZE
docker.io/multiarch/ubuntu-core       armhf-bionic  78064abebdab  41 minutes ago  52 MB

Which seems to use "qemu-arm version 5.0.0 (qemu-5.0.0-2.fc33)"

If copy in the ancient qemu-arm-static that ships with Bionic by default the error returns.

So I guess this can be closed, but would be nice to know which commit fixed this, for cherrypicking ..

Possibly https://github.com/qemu/qemu/commit/6bc024e713fd35eb5fddbe16acd8dc92d27872a9#diff-6389d258c2de2f974953be12cab45851 ?

Bionic's QEMU is 2.11. There were a lot of fixes to the linux-user handling and in particular to various race conditions in its handling of multi-threaded guest programs and also to the guest signal handling code -- I'm not sure it'd be feasible to identify and cherry-pick them all at this point... My stock advice for "any linux-user guest bug" plus "QEMU prior to 4.0" is "try again with a newer QEMU".


The QEMU project is currently considering to move its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.

If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.

Thank you and sorry for the inconvenience.

[Expired for QEMU because there has been no activity for 60 days.]