blob: 58a5523b121b92c187ff84d8853923771054ac3b (
plain) (
blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
|
architecture: 0.965
boot: 0.910
performance: 0.902
kernel: 0.883
graphic: 0.880
ppc: 0.834
device: 0.822
KVM: 0.796
arm: 0.763
semantic: 0.659
vnc: 0.585
socket: 0.584
hypervisor: 0.536
x86: 0.529
PID: 0.528
register: 0.526
user-level: 0.434
permissions: 0.433
risc-v: 0.420
network: 0.399
debug: 0.381
files: 0.370
TCG: 0.367
VMM: 0.354
virtual: 0.346
mistranslation: 0.337
assembly: 0.316
peripherals: 0.253
i386: 0.173
--------------------
boot: 0.972
arm: 0.970
TCG: 0.940
debug: 0.925
KVM: 0.912
kernel: 0.884
PID: 0.880
virtual: 0.856
VMM: 0.750
vnc: 0.714
risc-v: 0.641
device: 0.634
socket: 0.528
hypervisor: 0.516
files: 0.294
register: 0.238
semantic: 0.150
architecture: 0.033
assembly: 0.013
performance: 0.011
permissions: 0.009
user-level: 0.007
network: 0.002
graphic: 0.001
peripherals: 0.001
mistranslation: 0.001
x86: 0.000
i386: 0.000
ppc: 0.000
CPU boot sometimes fails on big.LITTLE CPUs with varying cache sizes
Description of problem:
The RK3588 SoC has three core clusters; one with A55 cores, and the other two have A76 cores. The big cores have more L2 cache than the little cores, so the value of `CCSIDR` depends on the core that it is read from.
In `write_list_to_kvmstate`, QEMU attempts to use `KVM_SET_ONE_REG` with an ID for `KVM_REG_ARM_DEMUX_ID_CCSIDR`, trying to set `CCSIDR` to a previously read value.
Normally, that works fine, but if the host kernel has moved QEMU from one core cluster to the other, then the value will be different and `demux_c15_set` will return `EINVAL`, causing the entire `arm_set_cpu_on` to fail, and the guest kernel to print an error.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kvm/sys_regs.c?h=v6.2#n2827
I tried changing the condition for the `ok = false` line in `write_list_to_kvmstate` to `ret && r.id >> 8 != 0x60200000001100`. This causes all CPUs to initialize correctly in the guest, but obviously that's a hack.
I assume that `CCSIDR` not being uniform across all CPUs means that the guest's copy of `CCSIDR` may be wrong, and so cache maintenance operations may not act on the entire cache. I do not know whether that could actually cause problems. Will QEMU need to find the maximum cache size across all CPUs and present that to guests?
Steps to reproduce:
On a SoC where big and little cores have different cache sizes (e.g. RK3588):
```text
$ qemu-system-aarch64 -M virt -accel kvm -cpu host -smp 4 -nographic -kernel arch/arm64/boot/Image -append quiet
[ 0.001399][ T1] psci: failed to boot CPU1 (-22)
[ 0.001407][ T1] CPU1: failed to boot: -22
[ 0.001685][ T1] psci: failed to boot CPU2 (-22)
[ 0.001691][ T1] CPU2: failed to boot: -22
[ 0.001809][ T1] psci: failed to boot CPU3 (-22)
[ 0.001814][ T1] CPU3: failed to boot: -22
```
The error is not always printed, because it depends on which core cluster the processes are scheduled on.
Using `taskset -c 0-3` or `taskset -c 4-7` to force QEMU to stick to the little or big cores respectively makes the bug not reproduce.
|