Linux kernel fails to boot on powernv machines with nvme device on s390x hosts

Description of problem:

When running a powernv guest with an nvme device on an s390x host, the guest Linux kernel fails to boot with the following panic:

```
nvme nvme0: pci function 0002:01:00.0
nvme 0002:01:00.0: enabling device (0100 -> 0102)
nvme nvme0: 1/0/0 default/read/poll queues
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
nvme nvme0: invalid id 0 completed on queue 0
BUG: Kernel NULL pointer dereference on read at 0x00000008
Faulting instruction address: 0xc0000000000c02ec
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: nvme powernv_flash(+) rtc_opal ibmpowernv mtd nvme_core
CPU: 0 PID: 100 Comm: pb-console Not tainted 5.10.50-openpower1 #2
NIP: c0000000000c02ec LR: c00000000050d5dc CTR: c00000000024a2d0
REGS: c00000003ffdfa00 TRAP: 0300 Not tainted (5.10.50-openpower1)
MSR: 9000000000009033 CR: 24000402 XER: 20000000
CFAR: c00000000000f3c0 DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 1
GPR00: c0000000000ba058 c00000003ffdfc90 c00000000180db00 0000000000000008
GPR04: 000000000000000a 0000000000000000 c000000002740210 c00000000274e218
GPR08: c00000000183be00 0000000080000000 0000000000000000 c0080000003ba798
GPR12: c00000000024a2d0 c000000001a30000 0000000000000000 0000000000000100
GPR16: 0000000000000004 0000000000000020 0000000000000100 c00000000bbe8080
GPR20: 0000000000000028 c000000001830100 0000000000000001 0000000000000000
GPR24: c000000001831a00 c000000001410c00 00000000ffff9097 0000000000400040
GPR28: 000000000000000a 000000000000000a 0000000000000008 0000000000000000
NIP [c0000000000c02ec] __arch_spin_trylock+0x4/0x24
LR [c00000000050d5dc] _raw_spin_lock_irqsave+0x2c/0x78
Call Trace:
[c00000003ffdfc90] [c00000003ffdfcc0] 0xc00000003ffdfcc0 (unreliable)
[c00000003ffdfcd0] [c0000000000ba058] complete+0x24/0x64
[c00000003ffdfd10] [c00000000024a2f8] blk_end_sync_rq+0x28/0x3c
[c00000003ffdfd30] [c00000000024f44c] __blk_mq_end_request+0x134/0x160
[c00000003ffdfd70] [c0080000003b481c] nvme_complete_rq+0xcc/0x13c [nvme_core]
[c00000003ffdfda0] [c0080000000a1078] nvme_pci_complete_rq+0x78/0x108 [nvme]
[c00000003ffdfdd0] [c00000000024de38] blk_done_softirq+0xc0/0xd0
[c00000003ffdfe30] [c00000000050da20] __do_softirq+0x238/0x28c
[c00000003ffdff20] [c0000000000875d4] __irq_exit_rcu+0x80/0xc8
[c00000003ffdff50] [c000000000087844] irq_exit+0x18/0x30
[c00000003ffdff70] [c000000000011c4c] __do_irq+0x80/0xa0
[c00000003ffdff90] [c00000000001d7a4] call_do_irq+0x14/0x24
[c00000000bff3960] [c000000000011d20] do_IRQ+0xb4/0xbc
[c00000000bff39f0] [c000000000008fac] hardware_interrupt_common_virt+0x1ac/0x1b0
--- interrupt: 500 at arch_local_irq_restore+0xac/0xe8
LR = __raw_spin_unlock_irq+0x34/0x40
[c00000000bff3cf0] [0000000000000000] 0x0 (unreliable)
[c00000000bff3d20] [c0000000000a8344] __raw_spin_unlock_irq+0x34/0x40
[c00000000bff3d50] [c0000000000a84b0] finish_task_switch+0x160/0x228
[c00000000bff3df0] [c0000000000aa3d0] schedule_tail+0x20/0x8c
[c00000000bff3e20] [c00000000000cb50] ret_from_fork+0x4/0x54
Instruction dump:
a14d0b7a 7da96b78 2f8a0000 419e0010 39400000 b14d0b7a 7c0004ac a1490b78
394affff b1490b78 4e800020 812d0000 <7d401829> 2c0a0000 40c20010 7d20192d
---[ end trace 6b7a11c45e4fc465 ]---
Kernel panic - not syncing: Fatal exception
Rebooting in 30 seconds..
```

The issue was noticed while running the avocado tests on an s390x host:

```
make check-venv
./tests/venv/bin/avocado run tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv8
```

It can also be reproduced manually.

Steps to reproduce:

1. wget https://github.com/open-power/op-build/releases/download/v2.7/zImage.epapr
2. ./qemu-system-ppc64 -nographic -M powernv8 -kernel zImage.epapr -append "console=tty0 console=hvc0" -device pcie-pci-bridge,id=bridge1,bus=pcie.1,addr=0x0 -device nvme,bus=pcie.2,addr=0x0,serial=1234
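
Additional info:

The repeated "invalid id 0 completed on queue 0" messages mean the guest's nvme driver is receiving completions whose 16-bit command identifier does not match any outstanding request. Since the failure only shows up on a big-endian (s390x) host, one plausible but unconfirmed explanation is a missing byte swap somewhere on the emulated controller's completion path. The standalone C sketch below is illustrative only; the struct layout and helper names are simplified stand-ins, not QEMU or kernel code. It shows why a command id stored in host byte order on s390x would be misread by the little-endian guest:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for an NVMe completion queue entry; every
 * multi-byte field is little-endian on the wire per the NVMe spec. */
struct nvme_cqe {
    uint32_t result;
    uint32_t rsvd;
    uint16_t sq_head;
    uint16_t sq_id;
    uint16_t cid;      /* command identifier the guest driver matches
                        * against its outstanding requests */
    uint16_t status;
};

/* Store a 16-bit value in little-endian byte order regardless of the
 * host's endianness (what a big-endian device model must do). */
static uint16_t cpu_to_le16_(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    uint16_t le;
    memcpy(&le, b, sizeof(le));
    return le;
}

/* What the little-endian guest effectively does when it reads the CQE. */
static uint16_t le16_to_guest_(uint16_t v)
{
    uint8_t b[2];
    memcpy(b, &v, sizeof(v));
    return (uint16_t)(b[0] | (b[1] << 8));
}

int main(void)
{
    struct nvme_cqe cqe = { 0 };
    uint16_t cid = 0x0042;        /* hypothetical command identifier */

    /* Correct: explicit little-endian store; the guest reads 0x0042 on
     * any host.  A plain host-endian store on s390x would leave the
     * bytes as 00 42 in memory, which the guest would read back as
     * 0x4200 and reject as an invalid command id. */
    cqe.cid = cpu_to_le16_(cid);

    printf("guest driver sees cid 0x%04x\n", le16_to_guest_(cqe.cid));
    return 0;
}
```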