diff options
Diffstat (limited to 'classification_output/05/boot/51610399')
| -rw-r--r-- | classification_output/05/boot/51610399 | 316 |
1 files changed, 0 insertions, 316 deletions
diff --git a/classification_output/05/boot/51610399 b/classification_output/05/boot/51610399 deleted file mode 100644 index 6bcd4697f..000000000 --- a/classification_output/05/boot/51610399 +++ /dev/null @@ -1,316 +0,0 @@ -boot: 0.986 -graphic: 0.986 -assembly: 0.985 -instruction: 0.985 -other: 0.985 -semantic: 0.984 -device: 0.984 -mistranslation: 0.983 -socket: 0.978 -KVM: 0.975 -vnc: 0.974 -network: 0.973 - -[BUG][powerpc] KVM Guest Boot Failure – Hangs at "Booting Linux via __start()” - -Bug Description: -Encountering a boot failure when launching a KVM guest with -qemu-system-ppc64. The guest hangs at boot, and the QEMU monitor -crashes. -Reproduction Steps: -# qemu-system-ppc64 --version -QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f) -Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers -# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine -pseries,accel=kvm \ --m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \ - -device virtio-scsi-pci,id=scsi \ --drive -file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2 -\ --device scsi-hd,drive=drive0,bus=scsi.0 \ - -netdev bridge,id=net0,br=virbr0 \ - -device virtio-net-pci,netdev=net0 \ - -serial pty \ - -device virtio-balloon-pci \ - -cpu host -QEMU 9.2.50 monitor - type 'help' for more information -char device redirected to /dev/pts/2 (label serial0) -(qemu) -(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but -unavailable: IRQ_XIVE capability must be present for KVM -Falling back to kernel-irqchip=off -** Qemu Hang - -(In another ssh session) -# screen /dev/pts/2 -Preparing to boot Linux version 6.10.4-200.fc40.ppc64le -(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801 -(Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11 -15:20:17 UTC 2024 -Detected machine type: 0000000000000101 -command line: -BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le -root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M -Max number of cores passed to firmware: 2048 (NR_CPUS = 2048) -Calling ibm,client-architecture-support... done -memory layout at init: - memory_limit : 0000000000000000 (16 MB aligned) - alloc_bottom : 0000000008200000 - alloc_top : 0000000030000000 - alloc_top_hi : 0000000800000000 - rmo_top : 0000000030000000 - ram_top : 0000000800000000 -instantiating rtas at 0x000000002fff0000... done -prom_hold_cpus: skipped -copying OF device tree... -Building dt strings... -Building dt structure... -Device tree strings 0x0000000008210000 -> 0x0000000008210bd0 -Device tree struct 0x0000000008220000 -> 0x0000000008230000 -Quiescing Open Firmware ... -Booting Linux via __start() @ 0x0000000000440000 ... -** Guest Console Hang - - -Git Bisect: -Performing git bisect points to the following patch: -# git bisect bad -e8291ec16da80566c121c68d9112be458954d90b is the first bad commit -commit e8291ec16da80566c121c68d9112be458954d90b (HEAD) -Author: Nicholas Piggin <npiggin@gmail.com> -Date: Thu Dec 19 13:40:31 2024 +1000 - - target/ppc: fix timebase register reset state -(H)DEC and PURR get reset before icount does, which causes them to -be -skewed and not match the init state. This can cause replay to not -match the recorded trace exactly. For DEC and HDEC this is usually -not -noticable since they tend to get programmed before affecting the - target machine. PURR has been observed to cause replay bugs when - running Linux. - - Fix this by resetting using a time of 0. - - Message-ID: <20241219034035.1826173-2-npiggin@gmail.com> - Signed-off-by: Nicholas Piggin <npiggin@gmail.com> - - hw/ppc/ppc.c | 11 ++++++++--- - 1 file changed, 8 insertions(+), 3 deletions(-) - - -Reverting the patch helps boot the guest. -Thanks, -Misbah Anjum N - -Thanks for the report. - -Tricky problem. A secondary CPU is hanging before it is started by the -primary via rtas call. - -That secondary keeps calling kvm_cpu_exec(), which keeps exiting out -early with EXCP_HLT because kvm_arch_process_async_events() returns -true because that cpu has ->halted=1. That just goes around he run -loop because there is an interrupt pending (DEC). - -So it never runs. It also never releases the BQL, and another CPU, -the primary which is actually supposed to be running, is stuck in -spapr_set_all_lpcrs() in run_on_cpu() waiting for the BQL. - -This patch just exposes the bug I think, by causing the interrupt. -although I'm not quite sure why it's okay previously (-ve decrementer -values should be causing a timer exception too). The timer exception -should not be taken as an interrupt by those secondary CPUs, and it -doesn't because it is masked, until set_all_lpcrs sets an LPCR value -that enables powersave wakeup on decrementer interrupt. - -The start_powered_off sate just sets ->halted, which makes it look -like a powersaving state. Logically I think it's not the same thing -as far as spapr goes. I don't know why start_powered_off only sets -->halted, and not ->stop/stopped as well. - -Not sure how best to solve it cleanly. I'll send a revert if I can't -get something working soon. - -Thanks, -Nick - -On Tue Mar 18, 2025 at 7:09 AM AEST, misanjum wrote: -> -Bug Description: -> -Encountering a boot failure when launching a KVM guest with -> -qemu-system-ppc64. The guest hangs at boot, and the QEMU monitor -> -crashes. -> -> -> -Reproduction Steps: -> -# qemu-system-ppc64 --version -> -QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f) -> -Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers -> -> -# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine -> -pseries,accel=kvm \ -> --m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \ -> --device virtio-scsi-pci,id=scsi \ -> --drive -> -file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2 -> -> -\ -> --device scsi-hd,drive=drive0,bus=scsi.0 \ -> --netdev bridge,id=net0,br=virbr0 \ -> --device virtio-net-pci,netdev=net0 \ -> --serial pty \ -> --device virtio-balloon-pci \ -> --cpu host -> -QEMU 9.2.50 monitor - type 'help' for more information -> -char device redirected to /dev/pts/2 (label serial0) -> -(qemu) -> -(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but -> -unavailable: IRQ_XIVE capability must be present for KVM -> -Falling back to kernel-irqchip=off -> -** Qemu Hang -> -> -(In another ssh session) -> -# screen /dev/pts/2 -> -Preparing to boot Linux version 6.10.4-200.fc40.ppc64le -> -(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801 -> -(Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11 -> -15:20:17 UTC 2024 -> -Detected machine type: 0000000000000101 -> -command line: -> -BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le -> -root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M -> -Max number of cores passed to firmware: 2048 (NR_CPUS = 2048) -> -Calling ibm,client-architecture-support... done -> -memory layout at init: -> -memory_limit : 0000000000000000 (16 MB aligned) -> -alloc_bottom : 0000000008200000 -> -alloc_top : 0000000030000000 -> -alloc_top_hi : 0000000800000000 -> -rmo_top : 0000000030000000 -> -ram_top : 0000000800000000 -> -instantiating rtas at 0x000000002fff0000... done -> -prom_hold_cpus: skipped -> -copying OF device tree... -> -Building dt strings... -> -Building dt structure... -> -Device tree strings 0x0000000008210000 -> 0x0000000008210bd0 -> -Device tree struct 0x0000000008220000 -> 0x0000000008230000 -> -Quiescing Open Firmware ... -> -Booting Linux via __start() @ 0x0000000000440000 ... -> -** Guest Console Hang -> -> -> -Git Bisect: -> -Performing git bisect points to the following patch: -> -# git bisect bad -> -e8291ec16da80566c121c68d9112be458954d90b is the first bad commit -> -commit e8291ec16da80566c121c68d9112be458954d90b (HEAD) -> -Author: Nicholas Piggin <npiggin@gmail.com> -> -Date: Thu Dec 19 13:40:31 2024 +1000 -> -> -target/ppc: fix timebase register reset state -> -> -(H)DEC and PURR get reset before icount does, which causes them to -> -be -> -skewed and not match the init state. This can cause replay to not -> -match the recorded trace exactly. For DEC and HDEC this is usually -> -not -> -noticable since they tend to get programmed before affecting the -> -target machine. PURR has been observed to cause replay bugs when -> -running Linux. -> -> -Fix this by resetting using a time of 0. -> -> -Message-ID: <20241219034035.1826173-2-npiggin@gmail.com> -> -Signed-off-by: Nicholas Piggin <npiggin@gmail.com> -> -> -hw/ppc/ppc.c | 11 ++++++++--- -> -1 file changed, 8 insertions(+), 3 deletions(-) -> -> -> -Reverting the patch helps boot the guest. -> -Thanks, -> -Misbah Anjum N - |