1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
|
debug: 0.974
peripherals: 0.972
architecture: 0.970
TCG: 0.966
device: 0.965
arm: 0.965
mistranslation: 0.965
boot: 0.961
risc-v: 0.959
PID: 0.957
user-level: 0.956
permissions: 0.956
VMM: 0.943
performance: 0.941
hypervisor: 0.930
register: 0.929
ppc: 0.924
semantic: 0.907
files: 0.897
vnc: 0.896
graphic: 0.858
kernel: 0.812
assembly: 0.806
KVM: 0.803
socket: 0.800
network: 0.765
x86: 0.686
virtual: 0.686
i386: 0.666
Arm64 fails to run a binary which runs OK on real hardware
This binary:
http://oirase.annexia.org/tmp/test.gz
runs OK on real aarch64 hardware. It is a statically linked Linux binary which (if successful) will print "hello, world" and exit cleanly.
On qemu-arm64 userspace emulator it doesn't print anything and loops forever using 100% CPU.
----
The following section is only if you wish to compile this binary from source, otherwise you can ignore it.
First compile OCaml from:
https://github.com/ocaml/ocaml
(note you have to compile it on aarch64 or in qemu, it's not possible to cross-compile). You will have to apply the one-line patch from:
https://sympa.inria.fr/sympa/arc/caml-list/2013-12/msg00179.html
./configure
make -j1 world.opt
Then do:
echo 'print_endline "hello, world"' > test.ml
./boot/ocamlrun ./ocamlopt -I stdlib stdlib.cmxa test.ml -o test
./test
On 23 December 2013 18:38, Richard Jones <email address hidden> wrote:
> This binary:
>
> http://oirase.annexia.org/tmp/test.gz
>
> runs OK on real aarch64 hardware. It is a statically linked Linux
> binary which (if successful) will print "hello, world" and exit cleanly.
>
> On qemu-arm64 userspace emulator it doesn't print anything and loops
> forever using 100% CPU.
Does the equivalent binary run OK in 32 bit ARM QEMU?
Does the binary use multiple threads?
If you have the time to investigate more closely what the binary
is actually doing when it loops (eg by running under a host gdb,
or using the debug log tracing of TCG input and output code and
execution) that would be helpful. Otherwise it's likely to be quite a
long time before I get round to looking at this kind of thing, because
"runs complex binaries/runtimes like ocaml" is not very high up the
priority list, I'm afraid.
thanks
-- PMM
It's an Aarch64 binary so it won't run on 32 bit ARM at all. However I guess you meant does the equivalent program run on 32 bit ARM, and the answer is yes, but that doesn't tell us much because OCaml uses separate code generators for 32 and 64 bit ARM.
The binary is single threaded.
I enabled tracing on qemu and got this:
http://oirase.annexia.org/tmp/arm64-call-trace.txt
The associate disassembly of the binary is here:
http://oirase.annexia.org/tmp/arm64-disassembly.txt
I'm not exactly sure which instruction fails to be emulated properly, but it looks like one of the ones in the caml_c_call function.
One thing I notice is that caml_c_call is the only function that uses the instruction "ret xM" (in all other places the code uses the default "ret" with implicit x30). Hmmm .. do we emulate "ret xM"?
The attached patch fixes the ret xM variant of ret. I verified that it fixes the bug.
On 23 December 2013 21:27, Richard Jones <email address hidden> wrote:
> It's an Aarch64 binary so it won't run on 32 bit ARM at all. However I
> guess you meant does the equivalent program run on 32 bit ARM, and the
> answer is yes, but that doesn't tell us much because OCaml uses separate
> code generators for 32 and 64 bit ARM.
Yes, that's why I said "equivalent binary". It's a useful check because it
can tell us whether the program is using things our linux-user emulation
doesn't get right at all (examples: multiple threads; some interactions of
signals and blocking syscalls); so it divides the bug into "probably in
linux-user" vs "probably a target-arm bug".
I see you've tracked the issue down in this case, though.
thanks
-- PMM
>> runs OK on real aarch64 hardware.
May I know which hardware you are talking about. Is there an aarch64 hardware target available ?
The (re)implementation of this instruction for mainline never had this bug.
|