qemu-system-x86_64 crashed with SIGABRT when using option -vga qxl

When qemu-system-x86_64 is run with the option -vga qxl, it crashes. The easiest way to trigger the crash is to change the guest's resolution, but the VM may also crash at random, without any resolution change. Here is the terminal output of one of these random crashes:

--------
$ qemu-system-x86_64 -hda /dev/sdb -m 2048 -enable-kvm -cpu host -vga qxl -nodefaults -netdev user,id=hostnet0 -device virtio-net-pci,id=net0,netdev=hostnet0
WARNING: Image format was not specified for '/dev/sdb' and probing guessed raw.
         Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
         Specify the 'raw' format explicitly to remove the restrictions.

(process:21313): Spice-WARNING **: 16:01:45.759: display-channel.c:2431:display_channel_validate_surface: canvas address is 0x7f8eb948ab18 for 0 (and is NULL)

(process:21313): Spice-WARNING **: 16:01:45.759: display-channel.c:2432:display_channel_validate_surface: failed on 0

(process:21313): Spice-CRITICAL **: 16:01:45.759: display-channel.c:2035:display_channel_update: condition `display_channel_validate_surface(display, surface_id)' failed
Aborted (core dumped)
--------

I was running QEMU as a normal user who is in the kvm and disk groups. Initially I assumed the problem was caused by running QEMU as root, but it happens as a normal user too.

I have tested guests with different Ubuntu versions: 18.04, 17.10 and 16.04. It happens with all of them.
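The WARNING at the top of that output comes from QEMU probing the image format. Below is a minimal sketch of the same invocation with the format stated explicitly, using -drive in place of -hda. It is only expected to silence the warning; nothing in this report suggests it affects the crash. The DISK variable and the qxl_cmdline helper are illustrative names, not part of the report.

```shell
#!/bin/sh
# Sketch: the reported options, with the disk format given explicitly so QEMU
# does not have to probe it. DISK defaults to the device used in the report;
# point it at your own device or image.
DISK=${DISK:-/dev/sdb}

# qxl_cmdline is a hypothetical helper. It prints the command instead of
# running it, so the options can be reviewed before starting a VM.
qxl_cmdline() {
    printf '%s\n' "qemu-system-x86_64 -drive file=$1,format=raw,index=0,media=disk -m 2048 -enable-kvm -cpu host -vga qxl -nodefaults -netdev user,id=hostnet0 -device virtio-net-pci,id=net0,netdev=hostnet0"
}

qxl_cmdline "$DISK"
```

The printed line can be pasted into a terminal once the disk path has been checked.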
ProblemType: Crash
DistroRelease: Ubuntu 18.04
Package: qemu-system-x86 1:2.11+dfsg-1ubuntu4
ProcVersionSignature: Ubuntu 4.15.0-10.11-generic 4.15.3
Uname: Linux 4.15.0-10-generic x86_64
ApportVersion: 2.20.8-0ubuntu10
Architecture: amd64
CurrentDesktop: XFCE
Date: Wed Mar 14 17:13:52 2018
ExecutablePath: /usr/bin/qemu-system-x86_64
InstallationDate: Installed on 2017-06-13 (273 days ago)
InstallationMedia: Xubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
MachineType: LENOVO 80UG
ProcCmdline: qemu-system-x86_64 -hda /dev/sdb -smp cpus=2 -m 512 -enable-kvm -cpu host -vga qxl -nodefaults -netdev user,id=hostnet0 -device virtio-net-pci,id=net0,netdev=hostnet0
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-10-generic.efi.signed root=UUID=6b4ae5c0-c78c-49a6-a1ba-029192618a7a ro quiet
Signal: 6
SourcePackage: qemu
StacktraceTop:
 () at /usr/lib/x86_64-linux-gnu/libspice-server.so.1
 () at /usr/lib/x86_64-linux-gnu/libspice-server.so.1
 () at /usr/lib/x86_64-linux-gnu/libspice-server.so.1
 () at /usr/lib/x86_64-linux-gnu/libspice-server.so.1
 () at /usr/lib/x86_64-linux-gnu/libspice-server.so.1
Title: qemu-system-x86_64 crashed with SIGABRT
UpgradeStatus: Upgraded to bionic on 2017-10-20 (145 days ago)
UserGroups: adm bluetooth cdrom dialout dip disk kvm libvirt lpadmin netdev plugdev sambashare sudo
dmi.bios.date: 07/10/2017
dmi.bios.vendor: LENOVO
dmi.bios.version: 0XCN43WW
dmi.board.asset.tag: NO Asset Tag
dmi.board.name: Toronto 4A2
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40679 WIN
dmi.chassis.asset.tag: NO Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Lenovo ideapad 310-14ISK
dmi.modalias: dmi:bvnLENOVO:bvr0XCN43WW:bd07/10/2017:svnLENOVO:pn80UG:pvrLenovoideapad310-14ISK:rvnLENOVO:rnToronto4A2:rvrSDK0J40679WIN:cvnLENOVO:ct10:cvrLenovoideapad310-14ISK:
dmi.product.family: IDEAPAD
dmi.product.name: 80UG
dmi.product.version: Lenovo ideapad 310-14ISK
dmi.sys.vendor: LENOVO

StacktraceTop:
 spice_logv (log_domain=0x7fb001524195 "Spice", args=0x7fafbf9fe600, format=0x7fb001525015 "condition `%s' failed", function=0x7fb001527ef0 <__func__.47520> "display_channel_update", strloc=0x7fb001527c0f "display-channel.c:2035", log_level=G_LOG_LEVEL_CRITICAL) at log.c:183
 spice_log (log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, strloc=strloc@entry=0x7fb001527c0f "display-channel.c:2035", function=function@entry=0x7fb001527ef0 <__func__.47520> "display_channel_update", format=format@entry=0x7fb001525015 "condition `%s' failed") at log.c:196
 display_channel_update (display=0x56421590aa30, surface_id=0, area=area@entry=0x56421590ee1c, clear_dirty=1, qxl_dirty_rects=qxl_dirty_rects@entry=0x7fafbf9fe770, num_dirty_rects=num_dirty_rects@entry=0x7fafbf9fe76c) at display-channel.c:2035
 handle_dev_update_async (opaque=0x56421590ebe0, payload=0x56421590ee10) at red-worker.c:428
 dispatcher_handle_single_read (dispatcher=0x56421590e080) at dispatcher.c:284

Subscribed Christian Ehrhardt, who might have an idea.

Hmm, I have no good idea, unfortunately. I tried it a few times (18.04 desktop guest, 8 resolution changes) and saw no issue. Does this depend on the type of guest that you run?

I drive it through libvirt, which adds quite a few parameters in the new format:
 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2
You might experiment to see whether setting some of them mitigates your issue.

Looking at the warnings: they come from display_channel_validate_surface, but they are "safe", meaning the function checks, detects that the canvas is NULL and returns early. But maybe something else later crashes on the same canvas being NULL?

Are warnings like
 Spice-WARNING **: 16:01:45.759: display-channel.c:2431:display_channel_validate_surface: canvas address is 0x7f8eb948ab18 for 0 (and is NULL)
appearing right before the crash or when you start?
The crash itself seems to be:
 display_channel_update
 -> display_channel_validate_surface (which emits the warnings)
 -> spice_warning
 -> spice_log
 -> spice_logv
spice_logv aborts when the message reaches a certain log level, and the output above triggers that (the backtrace shows G_LOG_LEVEL_CRITICAL).

So the question is: why is the canvas NULL?
2429     if (!display->priv->surfaces[surface_id].context.canvas) {
2430         spice_warning("canvas address is %p for %d (and is NULL)\n",
2431                    &(display->priv->surfaces[surface_id].context.canvas), surface_id);
2432         spice_warning("failed on %d", surface_id);
2433         return FALSE;
2434     }

handle_dev_update_async gets that value indirectly via
 RedWorker *worker = opaque;
 ...
 display_channel_update(worker->display_channel,

So the update that kills the pointer could be anywhere along these async paths. I'm not a subject matter expert on this async UI updating :-/

If this really affects all releases, from old ones up to the latest, we should try the latest upstream code built from source. If that is affected as well, we should also open the bug against upstream, since the fix or discussion needs to happen there first.

Can you compile qemu from source yourself for this check, or do you need help with that?

I was able to build it from source. On the first try I was unable to test because the build did not have the spice protocol enabled. The interface this QEMU uses is different, as it has a menu bar with "Machine" and "View" and some options, but I could test and I could reproduce the crash.

To clarify, I have recorded the VMs crashing. Please note that I have the notebook screen and an external monitor, so the resolution is a bit strange.
With the command:
 ./qemu-system-x86_64 -hda ~/Downloads/xubuntu-bionic-desktop-amd64-2018-04-22.iso -m 1536 -vga qxl -enable-kvm -cpu host -smp cpus=2
 https://mega.nz/#!98ZiHY5b!ZOaNjb1OaZVj0V80GRjkqafOAL2UinVlAEiTP9aazdk

With the command:
 ./qemu-system-x86_64 -hda ~/Downloads/xubuntu-bionic-desktop-amd64-2018-04-22.iso -m 1536 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x3 -enable-kvm -cpu host -smp cpus=2
 https://mega.nz/#!lwRDRA7I!no-6S6cWxmRf8Q8cjSWfN269PM9DISjLmN7QM6LYBC4

The examples use live ISOs. Using -cdrom (the more correct option) instead of -hda still produces the same crashes. I can also reproduce the crash with an installed VM and with a physical system booted in QEMU.

Additional effects I've seen are guests freezing instead of crashing, and a significant peak in RAM usage when starting the VM through libvirt. If a large amount of memory is in use at the moment the VM is started, approximately 128 MB goes to swap and is never freed. The only way to free it is to disable the swap partition (I tested until swap held 658 MB).

What I meant is that with a Bionic host, the guests crash regardless of what the guest is. So far I've tested only Ubuntu and its derivatives, from 12.04 to 18.04, and they consistently show the issue. I'm downloading CentOS, but the download is slow, so it will take some time. I will add the CentOS guest results after I test it.

The CentOS 7 download finished, and I later downloaded Fedora Workstation 27 as well. With both as guests the VM still crashes, so it is not an issue exclusive to Ubuntu guests. The crash may happen after 8 resolution changes (CentOS), on the first one (Fedora 27), or suddenly while the system is simply running (Xubuntu 17.10 on an external HDD).
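When going through many guests like this, it can help to tell a QEMU abort apart from a normal shutdown. A minimal sketch, assuming a POSIX shell, where a process killed by signal N is reported with exit status 128+N (SIGABRT is signal 6, matching the Signal: 6 field collected by apport). classify_exit is a hypothetical helper name, not something from QEMU or this report.

```shell
#!/bin/sh
# Sketch: turn the exit status of the last command into a short description.
# Assumes the shell convention of reporting "killed by signal N" as 128+N.
classify_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean exit"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Intended use, right after the QEMU command returns:
#   ./qemu-system-x86_64 ... ; classify_exit $?
```

A frozen guest never reaches this point, since QEMU keeps running, so a hang still has to be spotted by hand.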
Thanks Leonardo, with that confirmed:
- not dependent on the guest distribution
- affecting latest upstream
- good logs on the crash

The videos are not needed but are nice to prove your case (just as my pre-analysis of the code path is nice but likely not useful to a developer who regularly works on that code).

Overall this should really be ready for upstream's attention. I added a QEMU task, which will automatically mirror this to the mailing list.

The bug traces so far contained no private information, so I made the bug visible to everyone.

QEMU from git is apparently fixed, but Ubuntu's version is still problematic.

Using an Xubuntu 18.04 guest, it's possible to reproduce the crash with:

while true ; do xrandr --output Virtual-0 --mode 640x480 ; sleep 1 ; xrandr --output Virtual-0 --mode 1280x720 ; sleep 1 ; xrandr --output Virtual-0 --mode 1920x1080 ; sleep 1 ; done

In less than 20 seconds the guest crashes with:

(process:16447): Spice-CRITICAL **: 15:34:52.047: display-channel.c:2035:display_channel_update: condition `display_channel_validate_surface(display, surface_id)' failed
Aborted (core dumped)

Very interesting. Still not triggering for me :-/

Could you check whether the PPA in [1] (with qemu 2.12, planned for Cosmic) already fixes it for you?

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3306/+packages

The link is better as https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3306

Unfortunately it did not work. The error is still the same, although it used the GTK interface this time. The command I used was:

$ qemu-system-x86_64 -vga qxl -enable-kvm -cpu host -smp cores=2,threads=2 -m 2048 -cdrom xubuntu-18.04-desktop-amd64.iso

I have noticed that neither Ubuntu's QEMU nor this QEMU has the OpenGL option enabled. Could this be related to the issue?

Yeah, we switched to gtk. Which OpenGL do you mean: the virgl-based support or some other config option?
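The xrandr loop quoted above runs until the VM dies. For anyone who wants a run that also ends on its own when the crash does not happen, here is a minimal sketch of a bounded variant. The output name and the three modes are the ones from the report; ROUNDS, DELAY and the cycle_modes function name are assumptions added here.

```shell
#!/bin/sh
# Sketch: cycle an X11 output through the reported modes a fixed number of
# times, to be run inside the guest. ROUNDS and DELAY are assumed defaults.
OUTPUT=${OUTPUT:-Virtual-0}
ROUNDS=${ROUNDS:-20}
DELAY=${DELAY:-1}

cycle_modes() {
    round=1
    while [ "$round" -le "$ROUNDS" ]; do
        for mode in 640x480 1280x720 1920x1080; do
            # Stop early if xrandr itself reports an error.
            xrandr --output "$OUTPUT" --mode "$mode" || return 1
            sleep "$DELAY"
        done
        round=$((round + 1))
    done
    echo "completed $ROUNDS rounds without xrandr failing"
}

# Intended use inside the guest session:
#   cycle_modes
```

If the VM aborts, the script dies with the guest, so the final message is only printed when every round completed.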
It still fails to reproduce for me, but you already have a self-built qemu that is good. Based on that, could you build 2.12 from source as well and confirm that it fails? If it does, you could bisect from 2.12 to master to find what fixed it, so that we can consider that patch. Otherwise, if 2.12 built from source in your own build works, let's take a look at the build options that you mentioned.

I meant the option --enable-opengl and, for old versions, --enable-gtk-gl. I know it is required for virtual machines that use Intel GVT-g with dma-buf, and it is an option strangely absent from the QEMU configuration of Ubuntu's build. Without it, virgl fails too, which leaves virt-manager with an option that does not work (under Spice Display, the OpenGL option does not work).

The 2.12 build does crash when tested. That while loop is a pretty efficient way to trigger it. The git version can keep running the loop for hours straight with no problem, while the problematic versions crash in seconds.

As the QEMU configuration is a mix of packages found automatically and options set manually by me, here is the configure command and its results from my QEMU build:

https://paste.ubuntu.com/p/z9vnFdTnkD/

Please note that I had no need to build other targets for my use, so the configuration is much smaller than the one used by Ubuntu's QEMU.

The bisect is done and this is the result:

5bd5c27c7d284d01477c5cc022ce22438c46bf9f is the first new commit
commit 5bd5c27c7d284d01477c5cc022ce22438c46bf9f
Author: Gerd Hoffmann