results/classifier/zero-shot/105/graphic/1404690



graphic: 0.890
instruction: 0.858
other: 0.855
semantic: 0.848
assembly: 0.767
KVM: 0.765
vnc: 0.761
mistranslation: 0.755
boot: 0.737
device: 0.737
network: 0.708
socket: 0.692

Qemu crashes with chrooted m68k

I'm using qemu-m68k 2.2.0 to chroot into an m68k Coldfire Linux system, which works fine on the real Coldfire machine.

I've been able to set up binfmt_misc, and I use the wrapper below to run qemu with strace:

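#include <unistd.h>
#include <string.h>

/* binfmt_misc wrapper: re-exec qemu-m68k with "-cpu cfv4e -strace"
 * inserted ahead of the original arguments. */
int main(int argc, char **argv, char **envp) {
        /* 4 fixed slots + (argc - 1) original arguments + NULL terminator */
        char *newargv[argc + 4];

        newargv[0] = argv[0];
        newargv[1] = "-cpu";
        newargv[2] = "cfv4e";
        newargv[3] = "-strace";

        /* Copy the original arguments (everything after argv[0]). */
        memcpy(&newargv[4], &argv[1], sizeof(*argv) * (argc - 1));
        newargv[argc + 3] = NULL;
        /* execve only returns on failure, with -1. */
        return execve("/usr/bin/qemu-m68k", newargv, envp);
}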

Everything works fine. I can run bash, busybox, ash, but when I try to run ls or just type an invalid command, I get the attached sequence of messages, which ends like so:

11351 waitpid(-1,0xf6fffa00,0x3) = -1 errno=10 (No child processes)
qemu: fatal: Illegal instruction: 0000 @ f6fffa30
D0 = ffffffff   A0 = f67dcf50   F0 = 0000000000000000 (           0)
D1 = 0000000a   A1 = f66e0898   F1 = 0000000000000000 (           0)
D2 = f6fffaa8   A2 = f67df268   F2 = 0000000000000000 (           0)
D3 = 00000000   A3 = 00000000   F3 = 0000000000000000 (           0)
D4 = 00000008   A4 = 800026c4   F4 = 0000000000000000 (           0)
D5 = 00000000   A5 = f67d98e0   F5 = 0000000000000000 (           0)
D6 = f6fffaa8   A6 = f6fffa7c   F6 = 0000000000000000 (           0)
D7 = 00000002   A7 = f6fffa24   F7 = 0000000000000000 (           0)
PC = f6fffa30   SR = 0000 ----- FPRESULT =            0
Aborted

How can I debug it further to try to figure out if this is a qemu issue or not? Thanks


On 21 December 2014 at 16:21, Michel Boaventura
<email address hidden> wrote:
> Everything works fine. I can run bash, busybox, ash, but when I try to
> run ls or just type an invalid command, I get the attached sequence of
> messages, which ends like so:
>
> 11351 waitpid(-1,0xf6fffa00,0x3) = -1 errno=10 (No child processes)
> qemu: fatal: Illegal instruction: 0000 @ f6fffa30
> D0 = ffffffff   A0 = f67dcf50   F0 = 0000000000000000 (           0)
> D1 = 0000000a   A1 = f66e0898   F1 = 0000000000000000 (           0)
> D2 = f6fffaa8   A2 = f67df268   F2 = 0000000000000000 (           0)
> D3 = 00000000   A3 = 00000000   F3 = 0000000000000000 (           0)
> D4 = 00000008   A4 = 800026c4   F4 = 0000000000000000 (           0)
> D5 = 00000000   A5 = f67d98e0   F5 = 0000000000000000 (           0)
> D6 = f6fffaa8   A6 = f6fffa7c   F6 = 0000000000000000 (           0)
> D7 = 00000002   A7 = f6fffa24   F7 = 0000000000000000 (           0)
> PC = f6fffa30   SR = 0000 ----- FPRESULT =            0
> Aborted
>
> How can I debug it further to try to figure out if this is a qemu issue
> or not? Thanks

This is almost certainly a QEMU bug -- m68k/Coldfire is
unmaintained upstream right now.

I would start debugging this with QEMU's -d and -D options:
you can use "-d in_asm,exec,cpu -D logfile.out" to write
a lot of logging information about the guest code QEMU executes.
(This can be a huge volume, so best to use the shortest possible
reproducing command. Take out 'exec' if it's really too slow.)
Then you can see what the code the guest executed was and figure
out what happened. Alternatively you can try using qemu's
debug stub: pass QEMU "-g 1234" and then connect a coldfire gdb
using gdb's "target remote :1234" command. Either way, what you
want to find out is what exactly happened in the instructions
between the waitpid and the crash.

My current guess is that either we've messed up the waitpid
(seems unlikely, it's not very complicated), or we're not
getting signal handling right (much more complicated and
likely to have bugs): your crash happens at about the point
where the shell is due to receive a SIGCHLD for the child
process it spawned. If you send the shell a signal in some
other way (eg type ctrl-C at it, or send it a SIGINT from
outside QEMU) does it die in a similar way?

PS: an easier way to enable strace for linux-user in a binfmt-misc
chroot is to use the QEMU_STRACE environment variable. (All the
linux-user command line options have environment variable versions;
check --help.)

You can also just use a command like
  'chroot my-chroot /usr/bin/qemu-m68k-static -strace /bin/sh -c /bin/ls'
for a one-off command run.

-- PMM


Oh, if you're able to put your chroot up for download somewhere so I can reproduce the problem, I can have a look at it for you. (I don't otherwise have a coldfire chroot or toolchain.)


Hi Peter,

Thanks for your time! I've attached my mini chroot environment. As you can see, it is very minimal, but enough to be chrooted into and to test this. I will try your suggestions in a couple of days, but if you could try it before then, I would really appreciate it.

Michel

I have a fix for this (our code for setting up the signal return trampoline used the wrong types and only worked on 32 bit hosts). 

I notice that /bin/ls can't ls directories (it seems ok with single files) but that's a different bug.


Patch fixing this: https://patchwork.ozlabs.org/patch/423460/


I've identified the cause of "ls" not returning any output, but I don't think we can fix it in QEMU.

This happens if the host fs is ext3 or ext4 on a 64 bit system. Here the "d_off" entry in a linux_dirent64 is actually a hashtable hash, and so can be a full 64 bits. Unfortunately the guest binary here is trying to convert getdents64() syscall return information into a dirent with only a 32 bit offset field, and so it (guest libc, I think) just ignores dirents with offsets >4GB, which is all of them.

Sadly although ext3/4 support an f_mode bit for "make hash offsets fit in 32 bit", this is only for the benefit of kernel internal APIs (it's used by NFS) and AFAICT can't be set by userspace. So I can't at the moment think of any way for QEMU to deal with this...


Hi Peter,

Thank you very much for your help, I really appreciate it. I've tested both your patch and your workaround to make ls work (I created an XFS partition to hold my image) and everything works great.

Merry Xmas.

Michel

Peter's patch has been included here:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=1669add752d9f2928
==> Fix released