hypervisor: 0.818
mistranslation: 0.800
graphic: 0.786
virtual: 0.771
TCG: 0.757
user-level: 0.757
ppc: 0.750
vnc: 0.747
peripherals: 0.737
debug: 0.728
risc-v: 0.722
assembly: 0.716
x86: 0.714
device: 0.714
VMM: 0.713
KVM: 0.707
architecture: 0.705
i386: 0.697
register: 0.695
performance: 0.680
arm: 0.678
semantic: 0.675
PID: 0.661
network: 0.649
permissions: 0.638
files: 0.586
socket: 0.570
boot: 0.545
kernel: 0.538

Atomic test-and-set instruction does not work on qemu-user

I am trying to compile and run the PostgreSQL/Greenplum database inside a Docker container under qemu-aarch64-static:
```
 host: CentOS7 x86_64
 container: centos:centos7.9.2009 --platform linux/arm64/v8
 qemu-user-static: https://github.com/multiarch/qemu-user-static/releases/
```

However, the GP/PG spinlock always gets stuck and reports PANIC errors, so something
seems to be wrong with its spinlock implementation:
```
https://github.com/greenplum-db/gpdb/blob/master/src/include/storage/s_lock.h
https://github.com/greenplum-db/gpdb/blob/master/src/backend/storage/lmgr/s_lock.c
```

So I extracted its spinlock implementation into a single C test file (see the attached file)
and reproduced the problem:

```
$ gcc spinlock_qemu.c
$ ./a.out 
C -- slock inited, lock value is: 0
parent 139642, child 139645
P -- slock lock before, lock value is: 0
P -- slock locked, lock value is: 1
P -- slock unlock after, lock value is: 0
C -- slock lock before, lock value is: 1
P -- slock lock before, lock value is: 1
C -- slock locked, lock value is: 1
C -- slock unlock after, lock value is: 0
C -- slock lock before, lock value is: 1
P -- slock locked, lock value is: 1
P -- slock unlock after, lock value is: 0
P -- slock lock before, lock value is: 1
C -- slock locked, lock value is: 1
C -- slock unlock after, lock value is: 0
P -- slock locked, lock value is: 1
C -- slock lock before, lock value is: 1
P -- slock unlock after, lock value is: 0
C -- slock locked, lock value is: 1
P -- slock lock before, lock value is: 1
C -- slock unlock after, lock value is: 0
P -- slock locked, lock value is: 1
C -- slock lock before, lock value is: 1
P -- slock unlock after, lock value is: 0
C -- slock locked, lock value is: 1
P -- slock lock before, lock value is: 1
C -- slock unlock after, lock value is: 0
P -- slock locked, lock value is: 1
C -- slock lock before, lock value is: 1
P -- slock unlock after, lock value is: 0
P -- slock lock before, lock value is: 1
spin timeout, lock value is 1 (pid 139642)
spin timeout, lock value is 1 (pid 139645)
spin timeout, lock value is 1 (pid 139645)
spin timeout, lock value is 1 (pid 139642)
spin timeout, lock value is 1 (pid 139645)
spin timeout, lock value is 1 (pid 139642)
...
...
...
```

NOTE: this code always works on a physical ARM64 server.
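
The attached test file is not reproduced in this report. As a rough illustration only, a two-process test of this kind could look like the sketch below. It assumes a lock word in anonymous shared memory and a counter protected by it; the names `struct shared`, `spin_lock`, `spin_unlock`, `run_spinlock_test` and `spinlock_demo` are hypothetical and are not taken from the attachment or from s_lock.c. Unlike the original test, it has no spin timeout.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state: the lock word and a counter it protects. */
struct shared {
    volatile int lock;
    volatile long counter;
};

/* Spin until the atomic exchange reports that the previous value was 0. */
void spin_lock(volatile int *lock)
{
    while (__sync_lock_test_and_set(lock, 1) != 0)
        ;   /* busy-wait, no timeout in this sketch */
}

/* Release the lock by storing 0 with release semantics. */
void spin_unlock(volatile int *lock)
{
    __sync_lock_release(lock);
}

/*
 * Fork one child; parent and child each perform `iters` locked increments.
 * Returns the final counter value, or -1 if mmap() or fork() fails.
 */
long run_spinlock_test(long iters)
{
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    s->lock = 0;
    s->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, sizeof *s);
        return -1;
    }

    for (long i = 0; i < iters; i++) {
        spin_lock(&s->lock);
        s->counter++;            /* critical section */
        spin_unlock(&s->lock);
    }

    if (pid == 0)
        _exit(0);                /* child is done */

    waitpid(pid, NULL, 0);
    long result = s->counter;
    munmap(s, sizeof *s);
    return result;
}

/* With a working test-and-set, no increment is lost. */
void spinlock_demo(void)
{
    assert(run_spinlock_test(1000) == 2000);
}
```

If mutual exclusion holds, the final counter equals twice the iteration count. A lock word left at 1 with no owner, as in the log above, would make this sketch spin forever instead of printing a timeout.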



Interestingly, the spinlock test works after I change the tas() implementation
FROM
  __sync_lock_test_and_set(lock, 1);
TO
  __sync_val_compare_and_swap(lock, 0, 1);
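
For reference, the two variants can be written side by side as in the sketch below. The names `slock_t`-style `int` lock, `tas_swap`, `tas_cas`, `s_unlock` and `tas_demo` are illustrative and not copied from s_lock.h. Both variants return the previous lock value, so 0 means the lock was acquired. The difference is that the exchange always stores 1, while the compare-and-swap stores 1 only when the lock currently holds 0. According to the GCC documentation, `__sync_lock_test_and_set` is an acquire barrier rather than a full barrier, whereas `__sync_val_compare_and_swap` is a full barrier.

```c
#include <assert.h>

/* Variant 1: unconditional atomic exchange (the original tas()). */
int tas_swap(volatile int *lock)
{
    return __sync_lock_test_and_set(lock, 1);
}

/* Variant 2: compare-and-swap; writes 1 only if the lock holds 0. */
int tas_cas(volatile int *lock)
{
    return __sync_val_compare_and_swap(lock, 0, 1);
}

/* Release: store 0 with release semantics. */
void s_unlock(volatile int *lock)
{
    __sync_lock_release(lock);
}

/* Single-threaded check that both variants agree on the return value. */
void tas_demo(void)
{
    volatile int lock = 0;

    assert(tas_swap(&lock) == 0);   /* was free, now held */
    assert(tas_swap(&lock) == 1);   /* already held */
    s_unlock(&lock);

    assert(tas_cas(&lock) == 0);    /* was free, now held */
    assert(tas_cas(&lock) == 1);    /* already held, value unchanged */
    s_unlock(&lock);
    assert(lock == 0);
}
```

In a single thread the two variants are indistinguishable, so this check says nothing about the emulation problem itself; it only shows that swapping one for the other does not change the contract of tas().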

## gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)

====
__sync_lock_test_and_set(lock, 1) disassembly:

```
objdump -S a.out

000000000040073c <tas>:
  40073c:	d10043ff 	sub	sp, sp, #0x10
  400740:	f90007e0 	str	x0, [sp, #8]
  400744:	f94007e0 	ldr	x0, [sp, #8]
  400748:	52800021 	mov	w1, #0x1                   	// #1
  40074c:	885f7c02 	ldxr	w2, [x0]
  400750:	88037c01 	stxr	w3, w1, [x0]
  400754:	35ffffc3 	cbnz	w3, 40074c <tas+0x10>
  400758:	d5033bbf 	dmb	ish
  40075c:	2a0203e0 	mov	w0, w2
  400760:	910043ff 	add	sp, sp, #0x10
  400764:	d65f03c0 	ret
```

====
__sync_val_compare_and_swap(lock, 0, 1); disassembly:

```
objdump -S a.out

000000000040073c <tas>:
  40073c:	d10043ff 	sub	sp, sp, #0x10
  400740:	f90007e0 	str	x0, [sp, #8]
  400744:	f94007e0 	ldr	x0, [sp, #8]
  400748:	52800021 	mov	w1, #0x1                   	// #1
  40074c:	885f7c02 	ldxr	w2, [x0]
  400750:	35000062 	cbnz	w2, 40075c <tas+0x20>
  400754:	8803fc01 	stlxr	w3, w1, [x0]
  400758:	35ffffa3 	cbnz	w3, 40074c <tas+0x10>
  40075c:	7100005f 	cmp	w2, #0x0
  400760:	d5033bbf 	dmb	ish
  400764:	2a0203e0 	mov	w0, w2
  400768:	910043ff 	add	sp, sp, #0x10
  40076c:	d65f03c0 	ret
```

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket on
Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket opened, then please switch
the state back to "New" or "Confirmed" within the next 60 days (otherwise
it will get closed as "Expired"). We will then eventually migrate
the ticket automatically to the new system (but you won't be the reporter
of the bug in the new system and thus you won't get notified on changes
anymore).

Thank you and sorry for the inconvenience.


[Expired for QEMU because there has been no activity for 60 days.]


This is an automated cleanup. This bug report has been moved to QEMU's
new bug tracker on gitlab.com and thus gets marked as 'expired' now.
Please continue with the discussion here:

 https://gitlab.com/qemu-project/qemu/-/issues/509


Hi taos! Could you please check whether this has already been fixed in QEMU v6.1.0-rc1?

Thanks. Tested, the problem is gone.