QEMU coroutines fail with LTO on non-x86_64 architectures

I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days ago broke it. Now when I launch it, it hits an assertion:

OpenSBI v0.6
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
618 bytes read in 2 ms (301.8 KiB/s)
RISC-V Qemu Boot Options
1: Linux kernel-5.5.0-dirty
2: Linux kernel-5.5.0-dirty (recovery mode)
Enter choice: 1: Linux kernel-5.5.0-dirty
Retrieving file: /boot/initrd.img-5.5.0-dirty
qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
./run.sh: line 31: 1604 Aborted (core dumped) qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin -device virtio-blk-device,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -device virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports

Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04 (fully updated).

Think you have everything already, but just in case:

$ lsb_release -rd
Description: Ubuntu Hirsute Hippo (development branch)
Release: 21.04

$ uname -a
Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
(note this is a VM running on macOS/M1)

$ apt-cache policy qemu
qemu:
  Installed: 1:5.2+dfsg-9ubuntu1
  Candidate: 1:5.2+dfsg-9ubuntu1
  Version table:
 *** 1:5.2+dfsg-9ubuntu1 500
        500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64 Packages
        100 /var/lib/dpkg/status

ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: qemu 1:5.2+dfsg-9ubuntu1
ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
Uname: Linux 5.11.0-11-generic aarch64
ApportVersion: 2.20.11-0ubuntu61
Architecture: arm64
CasperMD5CheckResult: unknown
CurrentDmesg:
 Error: command ['pkexec', 'dmesg'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
 Error executing command as another user: Not authorized

 This incident has been reported.
Date: Mon Mar 29 02:33:25 2021
Dependencies:

KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
Lspci-vt:
 -[0000:00]-+-00.0  Apple Inc. Device f020
            +-01.0  Red Hat, Inc. Virtio network device
            +-05.0  Red Hat, Inc. Virtio console
            +-06.0  Red Hat, Inc. Virtio block device
            \-07.0  Red Hat, Inc. Virtio RNG
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:

Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: console=hvc0 root=/dev/vda
SourcePackage: qemu
UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
acpidump:
 Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
 Error executing command as another user: Not authorized

 This incident has been reported.

FWIW, I just now built qemu-system-riscv64 from git ToT and that works fine.

Hi Tommy,
you reported that against "1:5.2+dfsg-9ubuntu1" which is odd.
The only recent changes were around
a) package dependencies
b) CVEs not touching your use-case IMHO

Was the formerly working version 1:5.2+dfsg-6ubuntu2, as I'm assuming, or did you upgrade from a different one?

Could you also add the full commandline you use to start your qemu test case?
If there are any images or such involved, please share, as far as you can, where one could fetch them.

And to be clear on your report: with the same 1:5.2+dfsg-9ubuntu1 @amd64 it works fine for you.
Just the emulation of riscv64 on arm64 HW is what now fails for you, correct?

It also is interesting that you built qemu from git to have it work.
Did you build tag v5.2.0 or the latest commit?
If you built v5.2.0 it might be something in the Ubuntu Delta that I have to look for.
If you've built the latest HEAD of qemu git then most likely the solution is a commit since v5.2.0. In that case, would you be willing and able to bisect v5.2.0..HEAD to find what the fix was?

0. Repro:

$ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
$ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
$ ./run_riscvVM.sh
(wait ~ 20 s)
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Reached target Local File Systems.
Starting udev Kernel Device Manager...
qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.

(root password is "riscv" fwiw)

1. "Was the formerly working version 1:5.2+dfsg-6ubuntu2?"

I'm afraid I don't know, but I update a few times a week.

If you can tell me how to try individual versions, I'll do that

2. "full commandline you use to start your qemu test case?"

Probably the repo above is more useful, but FWIW:

qemu-system-riscv64 \
 -machine virt \
 -nographic \
 -smp 4 \
 -m 4G \
 -bios fw_payload.bin \
 -device virtio-blk-device,drive=hd0 \
 -object rng-random,filename=/dev/urandom,id=rng0 \
 -device virtio-rng-device,rng=rng0 \
 -drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 \
 -device virtio-net-device,netdev=usernet \
 -netdev user,id=usernet,$ports

3. "the same 1:5.2+dfsg-9ubuntu1 @amd64 it works fine for you? Just the emulation of riscv64 on arm64 HW is what now fails for you correct?"

Yes x 2, confirmed with the above repro.

$ apt-cache policy qemu
qemu:
  Installed: 1:5.2+dfsg-9ubuntu1
  Candidate: 1:5.2+dfsg-9ubuntu1
  Version table:
 *** 1:5.2+dfsg-9ubuntu1 500
        500 http://us.archive.ubuntu.com/ubuntu hirsute/universe amd64 Packages
        100 /var/lib/dpkg/status

4. "It also is interesting that you built qemu from git to have it work. Did you build tag v5.2.0 or the latest commit?"

latest.

Rebuilding from the "commit" tagged with v5.2.0 ...

Self-built v5.2.0 qemu-system-riscv64 does _not_ produce the bug.

0. Repro:
> ...
> $ ./run_riscvVM.sh
> ...

Thanks, I was not able to reproduce with that using the most recent qemu 1:5.2+dfsg-9ubuntu1 on amd64 (just like you).
Trying the same on armhf was slower and a bit odd.
- I first got:
  qemu-system-riscv64: at most 2047 MB RAM can be simulated
  Reducing the memory to 2047M started up the system.
- Then I let it boot, which took quite a while, and it eventually hung at
  [ 13.017716] mousedev: PS/2 mouse device common for all mice
  [ 13.065889] usbcore: registered new interface driver usbhid
  [ 13.070209] usbhid: USB HID core driver
  [ 13.092671] NET: Registered protocol family 10

So it hung on armhf, while working on an amd64 host. That isn't good, but there was no crash to be seen :-/

Maybe it depends on what arm platform (as there are often subtle differences) or which storage (as the assert is about storage) you run on.
My CPU is an X-Gene and my Storage is a ZFS (that backs my container running hirsute and Hirsute's qemu).
What is it for you?

I've waited more, but no failure other than the hang was showing up.
Is this failing 100% of the times for you, or just sometimes and maybe racy?

---

1. "Was the formerly working version 1:5.2+dfsg-6ubuntu2?"
> I'm afraid I don't know, but I update a few times a week.

A hint which versions to look at can be derived from
$ grep -- qemu-system-misc /var/log/dpkg.log

> If you can tell me how to try individual versions, I'll do that

You can go to https://launchpad.net/ubuntu/+source/qemu/+publishinghistory
There you'll see every version of the package that existed. If you click on a version, it allows you to download the debs, which you can install with "dpkg -i ....deb"

---

2. "full commandline you use to start your qemu test case?"
> Probably the repo above is more useful, but FWIW:

Indeed, thanks!

3. "the same 1:5.2+dfsg-9ubuntu1 @amd64 it works fine for you? Just the emulation of riscv64 on arm64 HW is what now fails for you correct?"
> Yes x 2, confirmed with the above repro.

Thanks for the confirmation

---

4. "It also is interesting that you built qemu from git to have it work. Did you build tag v5.2.0 or the latest commit?"
> Rebuilding from the "commit" tagged with v5.2.0 ...

Very interesting. This shortly after a release, the delta is mostly a few CVEs and integration of e.g. Ubuntu/Debian specific paths.
Still, chances are that you used a different toolchain than the packaging builds.
Could you rebuild what you get with "apt source qemu". That will be 5.2 plus the Delta we have...
If that doesn't fail then your build-env differs from our builds, and therein is the solution.
If it fails we need to check which delta it is.

Furthermore, if indeed that fails while v5.2.0 worked, I've pushed all our delta as one commit at a time to https://code.launchpad.net/~paelzer/ubuntu/+source/qemu/+git/qemu/+ref/hirsute-delta-as-commits-lp1921664 so you could maybe bisect that. But to be sure, build from the first commit in there and verify that it works. If this fails as well, we have to look at what differs in those builds.

FYI my qemu is still busy
 1913 root 20 0 2833396 237768 7640 S 100.7 5.9 25:54.13 qemu-system-ris

And after about 1000 seconds the guest moved a bit forward, now reaching
[ 13.070209] usbhid: USB HID core driver
[ 13.092671] NET: Registered protocol family 10
[ 1003.282387] Segment Routing with IPv6
[ 1004.790268] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 1009.002716] NET: Registered protocol family 17
[ 1012.612965] 9pnet: Installing 9P2000 support
[ 1012.915223] Key type dns_resolver registered
[ 1015.022864] registered taskstats version 1
[ 1015.324660] Loading compiled-in X.509 certificates
[ 1036.408956] Freeing unused kernel memory: 264K
[ 1036.410322] This architecture does not have kernel memory protection.
[ 1036.710012] Run /init as init process
Loading, please wait...

I'll keep it running to check if I'll hit the assert later ....

> Maybe it depends on what arm platform (as there are often subtle differences) or which storage (as
> the assert is about storage) you run on.
> My CPU is an X-Gene and my Storage is a ZFS (that backs my container running hirsute and Hirsute's
> qemu).
> What is it for you?

Sorry, I thought I had already reported that, but it's not clear. My setup is special in a couple of ways:
- I'm running Ubuntu/Arm64 (21.04 beta, fully up-to-date except kernel), but ...
- it's a virtual machine on a macOS/Mac Mini M1 (fully up-to-date)
- It's running the 5.8.0-36-generic kernel, which isn't the latest (for complicated reasons)

I'll try to bring my Raspberry Pi 4 back up on Ubuntu and see if I can reproduce it there.

> Is this failing 100% of the times for you, or just sometimes and maybe racy?

100% consistently reproducible with the official packages. 0% reproducible with my own build.

> A hint which versions to look at can be derived from
> $ grep -- qemu-system-misc /var/log/dpkg.log

Alas, I had critical space issues and /var/log was among the casualties.

> Could you rebuild what you get with "apt source qemu". That will be 5.2 plus the Delta we have...

TIL. I tried `apt source --compile qemu` but it complains

dpkg-checkbuilddeps: error: Unmet build dependencies: gcc-alpha-linux-gnu gcc-powerpc64-linux-gnu

but these packages are not available [anymore?]. I don't currently have the time to figure this out.

> FYI my qemu is still busy

It's hung. The boot takes ~ 20 seconds on my host. Multi-minutes is not normal.

If I can reproduce this on a Raspberry Pi 4, then I'll proceed with your suggestions above, otherwise I'll pause this until I can run Ubuntu natively on the Mac Mini.

Ok, thanks for all the further details.
Let us chase this further down once you got to that test & bisect.
I'll set the state to incomplete until then.

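The version hunt discussed above (reading dpkg.log for past upgrades, then testing and bisecting builds) could be supported by two small shell helpers. This is only a sketch under stated assumptions: the function names `pkg_upgrade_history` and `classify_qemu_log` are hypothetical, the first assumes the usual dpkg.log line layout, and the second assumes the console output of one VM run was captured to a file and that the run was left going long enough for the assertion to show up.

```shell
# Hypothetical helpers (names are illustrative, not from the thread).

# List the version transitions dpkg logged for one package.
# Assumes the usual dpkg.log layout:
#   DATE TIME ACTION PACKAGE[:ARCH] OLD-VERSION NEW-VERSION
pkg_upgrade_history() {
    pkg="$1"
    log="${2:-/var/log/dpkg.log}"
    awk -v pkg="$pkg" '
        $3 == "install" || $3 == "upgrade" {
            name = $4
            sub(/:.*/, "", name)        # drop the ":arch" suffix
            if (name == pkg) print $1, $5, "->", $6
        }
    ' "$log"
}

# Map the captured console output of one VM run to a "git bisect run"
# exit status: 0 = good, 1 = bad, 125 = skip this commit.
classify_qemu_log() {
    log="$1"
    [ -s "$log" ] || return 125         # missing or empty log: cannot judge
    if grep -q "Assertion .* failed" "$log"; then
        return 1                        # a QEMU assertion fired
    fi
    return 0                            # no assertion seen in this run
}
```

Calling `pkg_upgrade_history qemu-system-misc` reads /var/log/dpkg.log by default; older entries may sit in rotated log files, which can be passed as the second argument. The exit codes of `classify_qemu_log` follow the convention `git bisect run` expects from its test script.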
On my 4 GB Raspberry Pi 4

QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-3ubuntu1)

worked as expected, but

QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-9ubuntu1)

*did* reproduce the issue, though it took slightly longer to hit it (a few minutes):

```
...
[ OK ] Started Serial Getty on ttyS0.
[ OK ] Reached target Login Prompts.

Ubuntu 20.04 LTS Ubuntu-riscv64 ttyS0

Ubuntu-riscv64 login: qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
./run_riscvVM.sh: line 31: 2304 Aborted (core dumped) qemu-system-riscv64 -machine virt -nographic -smp 4 -m 3G -bios fw_payload.bin -device virtio-blk-device,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -device virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports
```

Christian, I think I need some help. Like I said, I couldn't build with apt source --compile qemu. I proceeded with

$ git clone -b hirsute-delta-as-commits-lp1921664 git+ssh://