id = 2632
title = "tcg optimization breaking memory access ordering"
state = "opened"
created_at = "2024-10-21T10:36:01.084Z"
closed_at = "n/a"
labels = ["accel: TCG"]
url = "https://gitlab.com/qemu-project/qemu/-/issues/2632"
host-os = "Linux"
host-arch = "ppc64le"
qemu-version = "v9.0.1"
guest-os = "Bare-metal"
guest-arch = "aarch64"
description = """The following code creates a register dependency between two loads, which forces the first load to complete before the second:
```
movz\tw0, #0x2
str\tw0, [x1]
ldr\tw2, [x1]
eor\tw3, w2, w2
ldr\tw4, [x5, w3, sxtw]
```

The translation to TCG IR preserves this dependency.
However, the TCG optimizer then folds the sequence generated for `eor\tw3, w2, w2` at `0000000000000144` into `mov_i64 x3,$0x0`, which removes the dependency between the loads.
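
As a rough illustration only (a toy Python model, not QEMU's optimizer; the op tuples and the `fold` and `reaching_regs` helpers are made up for this sketch), folding `xor r,r` into a constant drops the first load's result from the set of registers that the second load's address depends on:

```python
# Toy model: each op is (name, dest, sources). String sources are registers,
# integer sources are constants. None of this is QEMU's real TCG API.
OPS = [
    ("ld", "x2", ["x1"]),            # ldr w2, [x1]
    ("xor", "x3", ["x2", "x2"]),     # eor w3, w2, w2
    ("add", "addr", ["x5", "x3"]),   # address of the second load
    ("ld", "x4", ["addr"]),          # ldr w4, [x5, w3, sxtw]
]


def fold(ops):
    # Rewrite xor r,r into a constant move and propagate known constants.
    consts, out = {}, []
    for name, dest, srcs in ops:
        srcs = [consts.get(s, s) for s in srcs]
        if name == "xor" and srcs[0] == srcs[1]:
            name, srcs = "mov", [0]
        if name == "mov" and isinstance(srcs[0], int):
            consts[dest] = srcs[0]
        else:
            consts.pop(dest, None)
        out.append((name, dest, srcs))
    return out


def reaching_regs(ops):
    # Map each destination to the registers it transitively depends on.
    deps = {}
    for _name, dest, srcs in ops:
        regs = set()
        for s in srcs:
            if isinstance(s, str):
                regs |= {s} | deps.get(s, set())
        deps[dest] = regs
    return deps


print("x2" in reaching_regs(OPS)["x4"])        # dependency present before folding
print("x2" in reaching_regs(fold(OPS))["x4"])  # dependency gone after folding
```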

This results in incorrect behavior in a multi-threaded program, since the host can then reorder the two loads."""
reproduce = """1.
2.
3."""
additional = """```
OP:
 ld_i32 loc0,env,$0xfffffffffffffff0
 brcond_i32 loc0,$0x0,lt,$L0
 st8_i32 $0x0,env,$0xfffffffffffffff4

 ---- 0000000000000134 0000000000000000 0000000000000000
 add_i64 x28,x28,$0x2

 ---- 0000000000000138 0000000000000000 0000000000000000
 mov_i64 x0,$0x2

 ---- 000000000000013c 0000000000000000 0000000000001c00
 mov_i64 loc3,x1
 mov_i64 loc4,loc3
 qemu_st_a64_i64 x0,loc4,w16+un+leul,2

 ---- 0000000000000140 0000000000000000 0000000000001c10
 mov_i64 loc5,x1
 mov_i64 loc6,loc5
 qemu_ld_a64_i64 x2,loc6,w16+un+leul,2

 ---- 0000000000000144 0000000000000000 0000000000000000
 and_i64 loc7,x2,$0xffffffff
 xor_i64 x3,x2,loc7
 and_i64 x3,x3,$0xffffffff

 ---- 0000000000000148 0000000000000000 0000000000001c20
 mov_i64 loc9,x5
 mov_i64 loc10,x3
 ext32s_i64 loc10,loc10
 add_i64 loc9,loc9,loc10
 mov_i64 loc11,loc9
 qemu_ld_a64_i64 x4,loc11,w16+un+leul,2
 st8_i32 $0x1,env,$0xfffffffffffffff4
```


```
OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xfffffffffffffff0      pref=0xffffffff
 brcond_i32 tmp0,$0x0,lt,$L0              dead: 0
 st8_i32 $0x0,env,$0xfffffffffffffff4     dead: 0

 ---- 0000000000000134 0000000000000000 0000000000000000
 add_i64 x28,x28,$0x2                     sync: 0  dead: 0 1  pref=0xffffffff

 ---- 0000000000000138 0000000000000000 0000000000000000
 mov_i64 x0,$0x2                          sync: 0  dead: 0  pref=0xffffffff

 ---- 000000000000013c 0000000000000000 0000000000001c00
 qemu_st_a64_i64 $0x2,x1,w16+un+leul,2    dead: 0

 ---- 0000000000000140 0000000000000000 0000000000001c10
 qemu_ld_a64_i64 x2,x1,w16+un+leul,2      sync: 0  dead: 0 1  pref=0xffffffff

 ---- 0000000000000144 0000000000000000 0000000000000000
 mov_i64 x3,$0x0                          sync: 0  dead: 0 1  pref=0xffffffff

 ---- 0000000000000148 0000000000000000 0000000000001c20
 qemu_ld_a64_i64 x4,x5,w16+un+leul,2      sync: 0  dead: 0 1  pref=0xffffffff
 st8_i32 $0x1,env,$0xfffffffffffffff4     dead: 0
```"""