diff options
Diffstat (limited to 'results/classifier/105/semantic/1670377')
| -rw-r--r-- | results/classifier/105/semantic/1670377 | 112 |
1 files changed, 112 insertions, 0 deletions
diff --git a/results/classifier/105/semantic/1670377 b/results/classifier/105/semantic/1670377 new file mode 100644 index 000000000..b9d41afaf --- /dev/null +++ b/results/classifier/105/semantic/1670377 @@ -0,0 +1,112 @@ +semantic: 0.751 +graphic: 0.735 +instruction: 0.719 +assembly: 0.707 +device: 0.683 +network: 0.681 +other: 0.665 +vnc: 0.624 +boot: 0.513 +KVM: 0.505 +socket: 0.486 +mistranslation: 0.437 + + VNC: short read for zlre data/RDR EndOfStream + +In openQA we have a custom VNC client (https://github.com/os-autoinst/os-autoinst/tree/master/consoles), which connects to QEMU guest and from there performs actions (sends keys, handles pointer, ...). We have several backends (https://github.com/os-autoinst/os-autoinst/tree/master/backend). With qemu backend we start QEMU guest *locally* on openQA worker which connects to it via VNC and sends commands. That works fine. + +However, with svirt backend we start QEMU on a KVM or Xen host and then connect to it remotely from openQA worker - the guest and worker are different systems. In this scenario fairly often happens that while system operates in Grub2, QEMU stops sending data via VNC: + +... +15:24:15.5341 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:50 called testapi::send_key +15:24:15.5342 27074 <<< testapi::send_key(key='c') +15:24:15.7361 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:51 called testapi::type_string +15:24:15.7362 27074 <<< testapi::type_string(string='gfxmode=1024x768; terminal_output console; terminal_output gfxterm +', max_interval=250, wait_screen_changes=0) +15:24:22.2243 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:53 called testapi::send_key +15:24:22.2244 27074 <<< testapi::send_key(key='esc') +15:24:22.4255 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:79 called testapi::send_key +15:24:22.4256 27074 <<< testapi::send_key(key='e') +15:24:22.6264 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:22.6265 27074 <<< testapi::send_key(key='down') +15:24:22.8273 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:22.8274 27074 <<< testapi::send_key(key='down') +15:24:23.0282 Debug: /var/lib/openqa/share/tests/sle-12-SP1/tests/installation/bootloader_uefi.pm:81 called testapi::send_key +15:24:23.0283 27074 <<< testapi::send_key(key='down') +DIE short read for zlre data 107132 - 995002 at /usr/lib/os-autoinst/consoles/VNC.pm line 978. + + at /usr/lib/os-autoinst/backend/baseclass.pm line 73. +... + +My observation is that it happens only while in Grub, when resolution happened a short while ago. See attached video and log. + +Prior to QEMU 2.8.0 I was able to reproduce a similar issue with vncviewer. I started QEMU with SLES JeOS image pressed several times a 'down' key in Grub and vncviewer (Tiger VNC 1.6.0 from openSUSE Leap 42.2) crashed with rdr::EndOfStream exception. This does not happen with QEMU 2.8.0, but I am still able to reproduce similar issue via openQA. + +/usr/bin/qemu-system-x86_64 -name guest=openQA-SUT-20,debug-threads=on -S -machine pc-i440fx-2.6,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 87535fc1-e693-41b9-813e-834d6fc4cb5a -no-user-config -nodefaults -rtc base=utc -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/openQA-SUT-20.img,format=qcow2,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev user,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:12:34:56,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device virtio-tablet-pci,id=input0,bus=pci.0,addr=0x6 -device virtio-keyboard-pci,id=input1,bus=pci.0,addr=0x7 -vnc 0.0.0.0:20,share=force-shared -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on -monitor stdio + +Host: openSUSE Leap 42.2 x86_64 KVM or Xen on x86_64 Intel with QEMU 2.6.0. +Guest: Leap 42.2. + +I can't reproduce the problem with QEMU 2.5.0, but I can with any QEMU version from 2.6 RC1 on. + + + + + +It isn't 100% clear from the info provided, but this is almost certainly fixed in 2.9.0 by + +commit 537848ee62195fc06c328b1cd64f4218f404a7f1 +Author: Michael Tokarev <email address hidden> +Date: Fri Feb 3 12:52:29 2017 +0300 + + vnc: do not disconnect on EAGAIN + + When qemu vnc server is trying to send large update to clients, + there might be a situation when system responds with something + like EAGAIN, indicating that there's no system memory to send + that much data (depending on the network speed, client and server + and what is happening). In this case, something like this happens + on qemu side (from strace): + + sendmsg(16, {msg_name(0)=NULL, + msg_iov(1)=[{"\244\"..., 729186}], + msg_controllen=0, msg_flags=0}, 0) = 103950 + sendmsg(16, {msg_name(0)=NULL, + msg_iov(1)=[{"lz\346"..., 1559618}], + msg_controllen=0, msg_flags=0}, 0) = -1 EAGAIN + sendmsg(-1, {msg_name(0)=NULL, + msg_iov(1)=[{"lz\346"..., 1559618}], + msg_controllen=0, msg_flags=0}, 0) = -1 EBADF + + qemu closes the socket before the retry, and obviously it gets EBADF + when trying to send to -1. + + This is because there WAS a special handling for EAGAIN, but now it doesn't + work anymore, after commit 04d2529da27db512dcbd5e99d0e26d333f16efcc, because + now in all error-like cases we initiate vnc disconnect. + + This change were introduced in qemu 2.6, and caused numerous grief for many + people, resulting in their vnc clients reporting sporadic random disconnects + from vnc server. + + Fix that by doing the disconnect only when necessary, i.e. omitting this + very case of EAGAIN. + + Hopefully the existing condition (comparing with QIO_CHANNEL_ERR_BLOCK) + is sufficient, as the original code (before the above commit) were + checking for other errno values too. + + Apparently there's another (semi?)bug exist somewhere here, since the + code tries to write to fd# -1, it probably should check if the connection + is open before. But this isn't important. + + Signed-off-by: Michael Tokarev <email address hidden> + Reviewed-by: Daniel P. Berrange <email address hidden> + Message-id: <email address hidden> + Fixes: 04d2529da27db512dcbd5e99d0e26d333f16efcc + Cc: Daniel P. Berrange <email address hidden> + Cc: Gerd Hoffmann <email address hidden> + Cc: <email address hidden> + Signed-off-by: Gerd Hoffmann <email address hidden> + + |