Qemu PPC64 freezes with multi-core CPU I installed Debian 10 on a Qemu PPC64 VM running with the following flags: qemu-system-ppc64 \ -nographic -nodefaults -monitor pty -serial stdio \ -M pseries -cpu POWER9 -smp cores=4,threads=1 -m 4G \ -drive file=debian-ppc64el-qemu.qcow2,format=qcow2,if=virtio \ -netdev user,id=network01,$ports -device rtl8139,netdev=network01 \ Within a couple minutes on any operation (could be a Go application or simply changing the hostname with hostnamectl, the VM freezes and prints this on the console: ``` root@debian:~# [ 950.428255] rcu: INFO: rcu_sched self-detected stall on CPU [ 950.428453] rcu: 3-....: (5318 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=2544 [ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] Message from syslogd@debian at Mar 17 11:35:24 ... kernel:[ 976.244481] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [zsh:462] [ 980.110018] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-... } 5276 jiffies s: 93 root: 0x8/. [ 980.111177] rcu: blocking rcu_node structures: [ 1013.442268] rcu: INFO: rcu_sched self-detected stall on CPU [ 1013.442365] rcu: 3-....: (21071 ticks this GP) idle=8e2/1/0x4000000000000004 softirq=5957/5960 fqs=9342 ``` If I change to 1 core on the command line, I haven't seen these freezes. Is this with KVM or with TCG? What is your hardware configuration? It's soft emulation, running Qemu 4.2.50 (from master branch) on MacOS Mojave. Do you have the problem with 4.2.0? Can you identify the commit introducing the problem? I just reverted to 4.2.0 and it works fine. No freezes for the past hour. ❯ qemu-system-ppc64 --version QEMU emulator version 4.2.0 Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers Couldn't bisect to find the bad commit. Carlos Thank you for the test. I'm going to try to reproduce the problem and bisect. I'm not able to reproduce (kernel 4.19.0-8-powerpc64le, qemu id d649689a8ecb) What is the kernel version in the guest? What is the QEMU commit id you used to test with 4.2.50? Hi Laurent, I'm on a MacOS Mojave running Qemu installed by homebrew from master branch on the day I've opened the bug. The option to install was: `brew install --HEAD qemu -s --verbose`. Maybe it's a Mac related problem? Hi, any news about this? Can I provide any additional info since it might be a Mac issue. Thanks I just built from latest master and got the kernel trace below. ❯ qemu-system-ppc64 --version QEMU emulator version 4.2.90 (v4.2.0-2811-g83019e81d1-dirty) Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers qemu-system-ppc64 \ -nographic -nodefaults -monitor pty -serial stdio \ -M pseries -cpu POWER9 -smp cores=4,threads=1 -m 4G \ -drive file=debian-ppc64el-qemu.qcow2,format=qcow2,if=virtio \ -netdev user,id=network01,hostfwd=tcp::$LocalSSHPort-:22 -device rtl8139,netdev=network01 \ [ 376.219450] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [swapper/3:0] [ 376.226712] Modules linked in: ctr(E) vmx_crypto(E) gf128mul(E) sunrpc(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) virtio_blk(E) 8139too(E) virtio_pci(E) virtio_ring(E) 8139cp(E) virtio(E) mii(E) [ 376.235692] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E 5.5.0-rc5-powerpc64le #1 Debian 5.5~rc5-1~exp1 [ 376.236245] NIP: c00000000000af8c LR: c000000000019664 CTR: c000000000af2c80 [ 376.236365] REGS: c0000000fffcf920 TRAP: 0901 Tainted: G E (5.5.0-rc5-powerpc64le Debian 5.5~rc5-1~exp1) [ 376.236376] MSR: 8000000000009033 CR: 44002248 XER: 00000000 [ 376.236479] CFAR: c000000000af2ce0 IRQMASK: 0 GPR00: c000000000af2e38 c0000000fffcfbb0 c000000001365700 0000000000000500 GPR04: 00000000fef90000 0000002be1f69c00 0000002beaa729fa c0000000fffec880 GPR08: 0000000400000000 00000000000080ff 0000000000000001 c0080000004c6ff0 GPR12: 0000000000002000 c0000000fffec880 [ 376.238452] NIP [c00000000000af8c] replay_interrupt_return+0x0/0x4 [ 376.238488] LR [c000000000019664] arch_local_irq_restore.part.0+0x54/0x70 [ 376.238984] Call Trace: [ 376.240707] [c0000000fffcfbb0] [c0000000008ce910] napi_gro_receive+0x1e0/0x210 (unreliable) [ 376.240824] [c0000000fffcfbd0] [c000000000af2e38] _raw_spin_unlock_irqrestore+0x98/0xb0 [ 376.242114] [c0000000fffcfbf0] [c0080000004c5588] cp_rx_poll+0x580/0x610 [8139cp] [ 376.242131] [c0000000fffcfcf0] [c0000000008cf6c8] net_rx_action+0x1f8/0x550 [ 376.242139] [c0000000fffcfe10] [c000000000af3a8c] __do_softirq+0x16c/0x3d8 [ 376.242172] [c0000000fffcff30] [c0000000001329e8] irq_exit+0xd8/0x120 [ 376.242181] [c0000000fffcff60] [c000000000019fb4] __do_irq+0x84/0x1c0 [ 376.242193] [c0000000fffcff90] [c00000000002cbec] call_do_irq+0x14/0x24 [ 376.242201] [c0000000fd4b7980] [c00000000001a178] do_IRQ+0x88/0xf0 [ 376.242209] [c0000000fd4b79c0] [c000000000008d98] hardware_interrupt_common+0x158/0x160 [ 376.242243] --- interrupt: 501 at plpar_hcall_norets+0x1c/0x28 LR = check_and_cede_processor+0x48/0x60 [ 376.243892] [c0000000fd4b7cc0] [c0000000fd4b7cf0] 0xc0000000fd4b7cf0 (unreliable) [ 376.243922] [c0000000fd4b7d20] [c00000000086c710] shared_cede_loop+0x50/0x160 [ 376.243942] [c0000000fd4b7d50] [c000000000868844] cpuidle_enter_state+0xa4/0x590 [ 376.243953] [c0000000fd4b7dd0] [c000000000868dcc] cpuidle_enter+0x4c/0x70 [ 376.243983] [c0000000fd4b7e10] [c000000000177d4c] call_cpuidle+0x4c/0x90 [ 376.243991] [c0000000fd4b7e30] [c000000000178358] do_idle+0x2f8/0x400 [ 376.243998] [c0000000fd4b7ed0] [c0000000001786a8] cpu_startup_entry+0x38/0x40 [ 376.244011] [c0000000fd4b7f00] [c00000000004e910] start_secondary+0x640/0x670 [ 376.244020] [c0000000fd4b7f90] [c00000000000b354] start_secondary_prolog+0x10/0x14 [ 376.244093] Instruction dump: [ 376.244751] 7d200026 618c8000 2c030900 4182e348 2c030500 4182dcd0 2c030f00 4182f318 [ 376.244797] 2c030a00 4182ffc8 60000000 60000000 <4e800020> 7c781b78 480003d9 480003f1 Could you try to change the network card, with something like "-device e1000e,netdev=network01" or "-device virtio-net-pci,netdev=network01" or "-device spapr-vlan,netdev=network01"? Hi Laurent, confirm that after changing the network adapter to the e1000e it worked flawlessly for hours with 4 cores on Macbook Pro. Thanks! The QEMU project is currently moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or please mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. [Expired for QEMU because there has been no activity for 60 days.]