summary refs log tree commit diff stats
path: root/results/classifier/zero-shot/105/mistranslation/1896263
blob: 75fb78307ed627cef6319e35d34e551d75e0c1cf (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
mistranslation: 0.685
other: 0.651
vnc: 0.634
KVM: 0.608
device: 0.588
instruction: 0.562
semantic: 0.553
graphic: 0.545
assembly: 0.542
socket: 0.536
boot: 0.520
network: 0.517

The bios-tables-test test causes QEMU to crash (Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed) on AMD processors

QEMU release version: Any recent version (5.0.0, 5.1.0, git master)
Host CPU: AMD Ryzen 3900X

The following backtrace is from commit e883b492c221241d28aaa322c61536436090538a.

QTEST_QEMU_BINARY=./build/qemu-system-x86_64 gdb ./build/tests/qtest/bios-tables-test
GNU gdb (GDB) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./build/tests/qtest/bios-tables-test...
(gdb) run
Starting program: /home/mcournoyer/src/qemu/build/tests/qtest/bios-tables-test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libthread_db.so.1".
[New Thread 0x7ffff7af6700 (LWP 18955)]
# random seed: R02S5106b7afa2fd84a0353605795c04ab7d
1..19
# Start of x86_64 tests
# Start of acpi tests
# starting QEMU: exec ./build/qemu-system-x86_64 -qtest unix:/tmp/qtest-18951.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-18951.qmp,id=char0 -mon chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -accel kvm -accel tcg -net none -display none  -drive id=hd0,if=none,file=tests/acpi-test-disk-R3kbyc,format=raw -device ide-hd,drive=hd0  -accel qtest
[Attaching after Thread 0x7ffff7af7900 (LWP 18951) fork to child process 18956]
[New inferior 2 (process 18956)]
[Detaching after fork from parent process 18951]
[Inferior 1 (process 18951) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libthread_db.so.1".
[New Thread 0x7ffff7af6700 (LWP 18957)]
[Thread 0x7ffff7af6700 (LWP 18957) exited]
process 18956 is executing new program: /gnu/store/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16/bin/bash
process 18956 is executing new program: /home/mcournoyer/src/qemu/build/qemu-system-x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libthread_db.so.1".
[New Thread 0x7ffff48ed700 (LWP 18958)]
[New Thread 0x7fffeffff700 (LWP 18960)]
[New Thread 0x7fffef61c700 (LWP 18961)]
[New Thread 0x7fffed5ff700 (LWP 18962)]
qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

Thread 2.5 "qemu-system-x86" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffef61c700 (LWP 18961)]
0x00007ffff65dbaba in raise () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
(gdb) taas bt

Thread 2.6 (Thread 0x7fffed5ff700 (LWP 18962)):
#0  0x00007ffff6770c4d in pthread_cond_timedwait@@GLIBC_2.3.2 () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#1  0x0000555555cc8a0e in qemu_sem_timedwait (sem=sem@entry=0x55555662f758, ms=ms@entry=10000) at ../util/qemu-thread-posix.c:282
#2  0x0000555555cd91b5 in worker_thread (opaque=opaque@entry=0x55555662f6e0) at ../util/thread-pool.c:91
#3  0x0000555555cc7e86 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:521
#4  0x00007ffff6769f64 in start_thread () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#5  0x00007ffff669b9af in clone () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6

Thread 2.5 (Thread 0x7fffef61c700 (LWP 18961)):
#0  0x00007ffff65dbaba in raise () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#1  0x00007ffff65dcbf5 in abort () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#2  0x00007ffff65d470a in __assert_fail_base () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#3  0x00007ffff65d4782 in __assert_fail () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#4  0x0000555555a3e979 in kvm_buf_set_msrs (cpu=0x555556688a20) at ../target/i386/kvm.c:2714
#5  0x0000555555a438cc in kvm_put_msrs (level=3, cpu=0x555556688a20) at ../target/i386/kvm.c:3005
#6  kvm_arch_put_registers (cpu=cpu@entry=0x555556688a20, level=level@entry=3) at ../target/i386/kvm.c:3989
#7  0x0000555555af7b0e in do_kvm_cpu_synchronize_post_init (cpu=0x555556688a20, arg=...) at ../accel/kvm/kvm-all.c:2355
#8  0x00005555558ef8e2 in process_queued_cpu_work (cpu=cpu@entry=0x555556688a20) at ../cpus-common.c:343
#9  0x0000555555b6ac25 in qemu_wait_io_event_common (cpu=cpu@entry=0x555556688a20) at ../softmmu/cpus.c:1117
#10 0x0000555555b6ac84 in qemu_wait_io_event (cpu=cpu@entry=0x555556688a20) at ../softmmu/cpus.c:1157
#11 0x0000555555b6aec8 in qemu_kvm_cpu_thread_fn (arg=arg@entry=0x555556688a20) at ../softmmu/cpus.c:1193
#12 0x0000555555cc7e86 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:521
#13 0x00007ffff6769f64 in start_thread () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#14 0x00007ffff669b9af in clone () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6

Thread 2.4 (Thread 0x7fffeffff700 (LWP 18960)):
#0  0x00007ffff66919d9 in poll () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#1  0x00007ffff78f0051 in g_main_context_iterate.isra () from /gnu/store/n1mx1dp0hcrzm1akf8qdqa9gmybzazs2-profile/lib/libglib-2.0.so.0
#2  0x00007ffff78f0392 in g_main_loop_run () from /gnu/store/n1mx1dp0hcrzm1akf8qdqa9gmybzazs2-profile/lib/libglib-2.0.so.0
#3  0x000055555584b5a1 in iothread_run (opaque=opaque@entry=0x555556557720) at ../iothread.c:80
#4  0x0000555555cc7e86 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:521
#5  0x00007ffff6769f64 in start_thread () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#6  0x00007ffff669b9af in clone () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6

Thread 2.3 (Thread 0x7ffff48ed700 (LWP 18958)):
#0  0x00007ffff66657a1 in clock_nanosleep@GLIBC_2.2.5 () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#1  0x00007ffff666ac03 in nanosleep () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#2  0x00007ffff7919cdf in g_usleep () from /gnu/store/n1mx1dp0hcrzm1akf8qdqa9gmybzazs2-profile/lib/libglib-2.0.so.0
#3  0x0000555555cb3b04 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:250
#4  0x0000555555cc7e86 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:521
#5  0x00007ffff6769f64 in start_thread () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#6  0x00007ffff669b9af in clone () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6

Thread 2.1 (Thread 0x7ffff48f2c80 (LWP 18956)):
#0  0x00007ffff677094c in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpthread.so.0
#1  0x0000555555cc854f in qemu_cond_wait_impl (cond=0x5555563b0020 <qemu_work_cond>, mutex=0x5555563cd620 <qemu_global_mutex>, file=0x555555dbad06 "../cpus-common.c", line=154) at ../util/qemu-thread-posix.c:174
#2  0x00005555558ef484 in do_run_on_cpu (cpu=cpu@entry=0x555556688a20, func=func@entry=0x555555af7b00 <do_kvm_cpu_synchronize_post_init>, data=..., mutex=mutex@entry=0x5555563cd620 <qemu_global_mutex>) at ../cpus-common.c:154
#3  0x0000555555b6aa7c in run_on_cpu (cpu=cpu@entry=0x555556688a20, func=func@entry=0x555555af7b00 <do_kvm_cpu_synchronize_post_init>, data=..., data@entry=...) at ../softmmu/cpus.c:1085
#4  0x0000555555af8d4e in kvm_cpu_synchronize_post_init (cpu=cpu@entry=0x555556688a20) at ../accel/kvm/kvm-all.c:2361
#5  0x0000555555b6a94a in cpu_synchronize_post_init (cpu=0x555556688a20) at /home/mcournoyer/src/qemu/include/sysemu/hw_accel.h:55
#6  cpu_synchronize_all_post_init () at ../softmmu/cpus.c:953
#7  0x0000555555b0dca7 in qemu_init (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/vl.c:4387
#8  0x0000555555840609 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:49

On Fri, Sep 18, 2020 at 6:18 PM Apteryx <email address hidden> wrote:
> Host CPU: AMD Ryzen 3900X

I also hit this test failure.

Host CPU: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
Host kernel: Linux 5.8.6-201.fc32.x86_64

qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs:
Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.


I've reproduced the problem with HEAD of master, qemu-4.2.0 and qemu-4.0.0.

I think the problem comes from the the kernel.

Host CPU: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Host kernel: 5.8.4-200.fc32.x86_64


Le 21/09/2020 à 15:50, Laurent Vivier a écrit :
> I've reproduced the problem with HEAD of master, qemu-4.2.0 and
> qemu-4.0.0.
> 
> I think the problem comes from the the kernel.
> 

This is reproducible with vanilla kernel v5.8.0 but fixed in v5.9.0-rc5+.


Le 22/09/2020 à 11:05, Laurent Vivier a écrit :
> Le 21/09/2020 à 15:50, Laurent Vivier a écrit :
>> I've reproduced the problem with HEAD of master, qemu-4.2.0 and
>> qemu-4.0.0.
>>
>> I think the problem comes from the the kernel.
>>
> 
> This is reproducible with vanilla kernel v5.8.0 but fixed in
> v5.9.0-rc5+.
> 

It seems to be fixed in 5.9 kernel by:

commit d831de177217cd494bfb99f2c849a0d40c2a7890
Author: Vitaly Kuznetsov <email address hidden>
Date:   Fri Sep 11 11:31:47 2020 +0200

    KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN

    Even without in-kernel LAPIC we should allow writing '0' to
    MSR_KVM_ASYNC_PF_EN as we're not enabling the mechanism. In
    particular, QEMU with 'kernel-irqchip=off' fails to start
    a guest with

    qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0

    Fixes: 9d3c447c72fb2 ("KVM: X86: Fix async pf caused null-ptr-deref")
    Reported-by: Dr. David Alan Gilbert <email address hidden>
    Signed-off-by: Vitaly Kuznetsov <email address hidden>
    Message-Id: <email address hidden>
    [Actually commit the version proposed by Sean Christopherson. - Paolo]
    Signed-off-by: Paolo Bonzini <email address hidden>


This addresses the following crash when running Linux v5.8
with kernel-irqchip=off:

  qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
  qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

There is a kernel-side fix for the issue too (kernel commit
d831de177217 "KVM: x86: always allow writing '0' to
MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger
the bug if running an older kernel.

Fixes: https://bugs.launchpad.net/bugs/1896263
Signed-off-by: Eduardo Habkost <email address hidden>
---
 target/i386/kvm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9efb07e7c83..1492f41349f 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc);
         kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr);
         kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
-        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) {
+        /*
+         * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set
+         * at all if kernel-irqchip=off, so don't try to set it in that case.
+         */
+        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) &&
+            kvm_irqchip_in_kernel()) {
             kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr);
         }
         if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) {
-- 
2.26.2



Eduardo Habkost <email address hidden> writes:

> This addresses the following crash when running Linux v5.8
> with kernel-irqchip=off:
>
>   qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
>   qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
>
> There is a kernel-side fix for the issue too (kernel commit
> d831de177217 "KVM: x86: always allow writing '0' to
> MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger
> the bug if running an older kernel.
>
> Fixes: https://bugs.launchpad.net/bugs/1896263
> Signed-off-by: Eduardo Habkost <email address hidden>
> ---
>  target/i386/kvm.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 9efb07e7c83..1492f41349f 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>          kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc);
>          kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr);
>          kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
> -        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) {
> +        /*
> +         * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set
> +         * at all if kernel-irqchip=off, so don't try to set it in that case.
> +         */
> +        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) &&
> +            kvm_irqchip_in_kernel()) {
>              kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr);
>          }
>          if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) {

I'm not sure kvm_irqchip_in_kernel() was required before we switched to
interrupt-based APF (as we were always injecting #PF) but with
kernel-5.8+ this should work. We'll need to merge this with

https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg02963.html
(queued by Paolo) and
https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06196.html
which fixes a bug in it.

as kvm_irqchip_in_kernel() should go around both KVM_FEATURE_ASYNC_PF
and KVM_FEATURE_ASYNC_PF_INT I believe.

-- 
Vitaly



On Tue, Sep 22, 2020 at 05:38:12PM +0200, Vitaly Kuznetsov wrote:
> Eduardo Habkost <email address hidden> writes:
> 
> > This addresses the following crash when running Linux v5.8
> > with kernel-irqchip=off:
> >
> >   qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
> >   qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
> >
> > There is a kernel-side fix for the issue too (kernel commit
> > d831de177217 "KVM: x86: always allow writing '0' to
> > MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger
> > the bug if running an older kernel.
> >
> > Fixes: https://bugs.launchpad.net/bugs/1896263
> > Signed-off-by: Eduardo Habkost <email address hidden>
> > ---
> >  target/i386/kvm.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > index 9efb07e7c83..1492f41349f 100644
> > --- a/target/i386/kvm.c
> > +++ b/target/i386/kvm.c
> > @@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> >          kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc);
> >          kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr);
> >          kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
> > -        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) {
> > +        /*
> > +         * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set
> > +         * at all if kernel-irqchip=off, so don't try to set it in that case.
> > +         */
> > +        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) &&
> > +            kvm_irqchip_in_kernel()) {
> >              kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr);
> >          }
> >          if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) {
> 
> I'm not sure kvm_irqchip_in_kernel() was required before we switched to
> interrupt-based APF (as we were always injecting #PF) but with
> kernel-5.8+ this should work. [...]

Were guests able to set MSR_KVM_ASYNC_PF_EN to non-zero with
kernel-irqchip=off on hosts running Linux <= 5.7?  I am assuming
kvm-asyncpf never worked with kernel-irqchip=off (and enabling it
by default with kernel-irqchip=off was a mistake).


>                         [...] We'll need to merge this with
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg02963.html
> (queued by Paolo) and
> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06196.html
> which fixes a bug in it.
> 
> as kvm_irqchip_in_kernel() should go around both KVM_FEATURE_ASYNC_PF
> and KVM_FEATURE_ASYNC_PF_INT I believe.

Shouldn't we just disallow kvm-asyncpf-int=on if kernel-irqchip=off?

-- 
Eduardo



Eduardo Habkost <email address hidden> writes:

> On Tue, Sep 22, 2020 at 05:38:12PM +0200, Vitaly Kuznetsov wrote:
>> Eduardo Habkost <email address hidden> writes:
>> 
>> > This addresses the following crash when running Linux v5.8
>> > with kernel-irqchip=off:
>> >
>> >   qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
>> >   qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
>> >
>> > There is a kernel-side fix for the issue too (kernel commit
>> > d831de177217 "KVM: x86: always allow writing '0' to
>> > MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger
>> > the bug if running an older kernel.
>> >
>> > Fixes: https://bugs.launchpad.net/bugs/1896263
>> > Signed-off-by: Eduardo Habkost <email address hidden>
>> > ---
>> >  target/i386/kvm.c | 7 ++++++-
>> >  1 file changed, 6 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
>> > index 9efb07e7c83..1492f41349f 100644
>> > --- a/target/i386/kvm.c
>> > +++ b/target/i386/kvm.c
>> > @@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>> >          kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc);
>> >          kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr);
>> >          kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
>> > -        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) {
>> > +        /*
>> > +         * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set
>> > +         * at all if kernel-irqchip=off, so don't try to set it in that case.
>> > +         */
>> > +        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) &&
>> > +            kvm_irqchip_in_kernel()) {
>> >              kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr);
>> >          }
>> >          if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) {
>> 
>> I'm not sure kvm_irqchip_in_kernel() was required before we switched to
>> interrupt-based APF (as we were always injecting #PF) but with
>> kernel-5.8+ this should work. [...]
>
> Were guests able to set MSR_KVM_ASYNC_PF_EN to non-zero with
> kernel-irqchip=off on hosts running Linux <= 5.7? 

lapic_in_kernel() check appeared in kernel with the following commit:

commit 9d3c447c72fb2337ca39f245c6ae89f2369de216
Author: Wanpeng Li <email address hidden>
Date:   Mon Jun 29 18:26:31 2020 +0800

    KVM: X86: Fix async pf caused null-ptr-deref

which was post-interrupt-based-APF. I *think* it was OK to enable APF
with !lapic_in_kernel() before (at least I don't see what would not
allow that).

> I am assuming
> kvm-asyncpf never worked with kernel-irqchip=off (and enabling it
> by default with kernel-irqchip=off was a mistake).
>
>
>>                         [...] We'll need to merge this with
>> 
>> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg02963.html
>> (queued by Paolo) and
>> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06196.html
>> which fixes a bug in it.
>> 
>> as kvm_irqchip_in_kernel() should go around both KVM_FEATURE_ASYNC_PF
>> and KVM_FEATURE_ASYNC_PF_INT I believe.
>
> Shouldn't we just disallow kvm-asyncpf-int=on if kernel-irqchip=off?

(Sarcasm: if disallowing 'kernel-irqchip=off' is not an option, then)
yes, we probably can, but kvm-asyncpf-int=on is the default we have so
we can't just implicitly change it underneath or migration will break...

-- 
Vitaly



On Tue, Sep 22, 2020 at 06:42:17PM +0200, Vitaly Kuznetsov wrote:
> Eduardo Habkost <email address hidden> writes:
> 
> > On Tue, Sep 22, 2020 at 05:38:12PM +0200, Vitaly Kuznetsov wrote:
> >> Eduardo Habkost <email address hidden> writes:
> >> 
> >> > This addresses the following crash when running Linux v5.8
> >> > with kernel-irqchip=off:
> >> >
> >> >   qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
> >> >   qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
> >> >
> >> > There is a kernel-side fix for the issue too (kernel commit
> >> > d831de177217 "KVM: x86: always allow writing '0' to
> >> > MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger
> >> > the bug if running an older kernel.
> >> >
> >> > Fixes: https://bugs.launchpad.net/bugs/1896263
> >> > Signed-off-by: Eduardo Habkost <email address hidden>
> >> > ---
> >> >  target/i386/kvm.c | 7 ++++++-
> >> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> >> > index 9efb07e7c83..1492f41349f 100644
> >> > --- a/target/i386/kvm.c
> >> > +++ b/target/i386/kvm.c
> >> > @@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> >> >          kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc);
> >> >          kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr);
> >> >          kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
> >> > -        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) {
> >> > +        /*
> >> > +         * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set
> >> > +         * at all if kernel-irqchip=off, so don't try to set it in that case.
> >> > +         */
> >> > +        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) &&
> >> > +            kvm_irqchip_in_kernel()) {
> >> >              kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr);
> >> >          }
> >> >          if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) {
> >> 
> >> I'm not sure kvm_irqchip_in_kernel() was required before we switched to
> >> interrupt-based APF (as we were always injecting #PF) but with
> >> kernel-5.8+ this should work. [...]
> >
> > Were guests able to set MSR_KVM_ASYNC_PF_EN to non-zero with
> > kernel-irqchip=off on hosts running Linux <= 5.7? 
> 
> lapic_in_kernel() check appeared in kernel with the following commit:
> 
> commit 9d3c447c72fb2337ca39f245c6ae89f2369de216
> Author: Wanpeng Li <email address hidden>
> Date:   Mon Jun 29 18:26:31 2020 +0800
> 
>     KVM: X86: Fix async pf caused null-ptr-deref
> 
> which was post-interrupt-based-APF. I *think* it was OK to enable APF
> with !lapic_in_kernel() before (at least I don't see what would not
> allow that).

If it was possible, did KVM break live migration of
kernel-irqchip=off guests that enabled APF?  This would mean my
patch is replacing a crash with a silent migration bug.  Not nice
either way.

> 
> > I am assuming
> > kvm-asyncpf never worked with kernel-irqchip=off (and enabling it
> > by default with kernel-irqchip=off was a mistake).
> >
> >
> >>                         [...] We'll need to merge this with
> >> 
> >> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg02963.html
> >> (queued by Paolo) and
> >> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06196.html
> >> which fixes a bug in it.
> >> 
> >> as kvm_irqchip_in_kernel() should go around both KVM_FEATURE_ASYNC_PF
> >> and KVM_FEATURE_ASYNC_PF_INT I believe.
> >
> > Shouldn't we just disallow kvm-asyncpf-int=on if kernel-irqchip=off?
> 
> (Sarcasm: if disallowing 'kernel-irqchip=off' is not an option, then)

I'm working on it.  :-)

> yes, we probably can, but kvm-asyncpf-int=on is the default we have so
> we can't just implicitly change it underneath or migration will break...

kvm-asyncpf-int wasn't merged yet, was it?  This means we don't
have compatibility issues to care about.

-- 
Eduardo



On Tue, Sep 22, 2020 at 07:26:42PM +0200, Paolo Bonzini wrote:
> On 22/09/20 19:22, Eduardo Habkost wrote:
> > If it was possible, did KVM break live migration of
> > kernel-irqchip=off guests that enabled APF?  This would mean my
> > patch is replacing a crash with a silent migration bug.  Not nice
> > either way.
> 
> Let's drop kernel-irqchip=off completely so migration is not broken. :)
> 
> I'm actually serious, I don't think we need a deprecation period even.

I wasn't sure about this, but then I've noticed the man page says
"disabling the in-kernel irqchip completely is not recommended
except for debugging purposes."

Does this note apply to all architectures?

-- 
Eduardo



We don't need to use kernel-irqchip=off for irq0 override if IRQ
routing is supported by the host, which is the case since 2009
(IRQ routing was added to KVM in Linux v2.6.30).

This is a more straightforward fix for Launchpad bug #1896263, as
it doesn't require increasing the complexity of the MSR code.
kernel-irqchip=off is for debugging only and there's no need to
increase the complexity of the code just to work around an issue
that was already fixed in the kernel.

Fixes: https://bugs.launchpad.net/bugs/1896263
Signed-off-by: Eduardo Habkost <email address hidden>
---
 tests/qtest/bios-tables-test.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index a9c8d478aee..27e8f0a1cb7 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -663,8 +663,7 @@ static void test_acpi_one(const char *params, test_data *data)
             data->uefi_fl1, data->uefi_fl2, data->cd, params ? params : "");
 
     } else {
-        /* Disable kernel irqchip to be able to override apic irq0. */
-        args = g_strdup_printf("-machine %s,kernel-irqchip=off %s -accel tcg "
+        args = g_strdup_printf("-machine %s %s -accel tcg "
             "-net none -display none %s "
             "-drive id=hd0,if=none,file=%s,format=raw "
             "-device %s,drive=hd0 ",
-- 
2.26.2



https://git.qemu.org/?p=qemu.git;a=commitdiff;h=d1e2d46467b95b03279

Released with QEMU v5.2.0.