summary refs log tree commit diff stats
path: root/scripts/tracetool/backend/syslog.py (unfollow)
Commit message (Collapse)AuthorFilesLines
2020-07-28Update version for v5.1.0-rc2 releasePeter Maydell1-1/+1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-28block/nbd: nbd_co_reconnect_loop(): don't sleep if drainedVladimir Sementsov-Ogievskiy1-5/+6
We try to go to wakeable sleep, so that, if drain begins it will break the sleep. But what if nbd_client_co_drain_begin() already called and s->drained is already true? We'll go to sleep, and drain will have to wait for the whole timeout. Let's improve it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28block/nbd: on shutdown terminate connection attemptVladimir Sementsov-Ogievskiy1-4/+11
On shutdown nbd driver may be in a connecting state. We should shutdown it as well, otherwise we may hang in nbd_teardown_connection, waiting for conneciton_co to finish in BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang. Note, that this hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: allow drain during reconnect attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28block/nbd: allow drain during reconnect attemptVladimir Sementsov-Ogievskiy1-0/+14
It should be safe to reenter qio_channel_yield() on io/channel read/write path, so it's safe to reduce in_flight and allow attaching new aio context. And no problem to allow drain itself: connection attempt is not a guest request. Moreover, if remote server is down, we can hang in negotiation, blocking drain section and provoking a dead lock. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang trying to drain, for example, like this: #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at util/aio-posix.c:600 #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427 #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433 #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710 #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498 #8 bdrv_open_inherit (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0, flags=24578, parent=0x0, child_class=0x0, child_role=0, errp=0x7fffba154620) at block.c:3491 #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block.c:3513 #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block/block-backend.c:421 And connection_co stack like this: #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918, action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302 #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193 #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at io/channel.c:472 #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:110 #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:143 #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at io/channel.c:247 #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:365 #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:391 #8 nbd_start_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, structured_reply=true, zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904 #9 nbd_receive_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at nbd/client.c:1032 #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at block/nbd.c:1460 #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287 #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309 #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360 #14 coroutine_trampoline (i0=113190480, i1=22000) at util/coroutine-ucontext.c:173 Note, that the hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: on shutdown terminate connection attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28block/nbd: split nbd_establish_connection out of nbd_client_connectVladimir Sementsov-Ogievskiy2-26/+38
We are going to implement non-blocking version of nbd_establish_connection, which for a while will be used only for nbd_reconnect_attempt, not for nbd_open, so we need to call it separately. Refactor nbd_reconnect_attempt in a way which makes next commit simpler. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28iotests: Test convert to qcow2 compressed to NBDNir Soffer3-0/+159
Add test for "qemu-img convert -O qcow2 -c" to NBD target. The tests     create a OVA file and write compressed qcow2 disk content directly into the OVA file via qemu-nbd. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-5-nsoffer@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28iotests: Add more qemu_img helpersNir Soffer1-0/+6
Add 2 helpers for measuring and checking images: - qemu_img_measure() - qemu_img_check() Both use --output-json and parse the returned json to make easy to use in other tests. I'm going to use them in a new test, and I hope they will be useful in may other tests. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-4-nsoffer@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28iotests: Make qemu_nbd_popen() a contextmanagerNir Soffer3-50/+56
Instead of duplicating the code to wait until the server is ready and remember to terminate the server and wait for it, make it possible to use like this: with qemu_nbd_popen('-k', sock, image): # Access image via qemu-nbd socket... Only test 264 used this helper, but I had to modify the output since it did not consistently when starting and stopping qemu-nbd. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-3-nsoffer@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28block: nbd: Fix convert qcow2 compressed to nbdNir Soffer2-1/+31
When converting to qcow2 compressed format, the last step is a special zero length compressed write, ending in a call to bdrv_co_truncate(). This call always fails for the nbd driver since it does not implement bdrv_co_truncate(). For block devices, which have the same limits, the call succeeds since the file driver implements bdrv_co_truncate(). If the caller asked to truncate to the same or smaller size with exact=false, the truncate succeeds. Implement the same logic for nbd. Example failing without this change: In one shell start qemu-nbd: $ truncate -s 1g test.tar $ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar In another shell convert an image to qcow2 compressed via NBD: $ echo "disk data" > disk.raw $ truncate -s 1g disk.raw $ qemu-img convert -f raw -O qcow2 -c disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $? 1 qemu-img failed, but the conversion was successful: $ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock image: nbd+unix://?socket=/tmp/nbd.sock file format: qcow2 virtual size: 1 GiB (1073741824 bytes) ... $ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock No errors were found on the image. 1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters Image end offset: 393216 $ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock Images are identical. Fixes: https://bugzilla.redhat.com/1860627 Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-2-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: typo fixes] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-28slirp: update to latest stable-4.2 branchMarc-André Lureau1-0/+0
Dr. David Alan Gilbert (1): ip_stripoptions use memmove Jindrich Novy (4): Fix possible infinite loops and use-after-free Use secure string copy to avoid overflow Be sure to initialize sockaddr structure Check lseek() for failure Marc-André Lureau (2): util: do not silently truncate Merge branch 'stable-4.2' into 'stable-4.2' Philippe Mathieu-Daudé (3): Fix win32 builds by using the SLIRP_PACKED definition Fix constness warnings Remove unnecessary break Ralf Haferkamp (2): Drop bogus IPv6 messages Fix MTU check Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2020-07-28test-char: abort on serial test errorMarc-André Lureau1-1/+1
We are having issues debugging and bisecting this issue that happen mostly on patchew. Let's make it abort where it failed to gather some new informations. Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2020-07-28nbd: Fix large trim/zero requestsEric Blake1-5/+23
Although qemu as NBD client limits requests to <2G, the NBD protocol allows clients to send requests almost all the way up to 4G. But because our block layer is not yet 64-bit clean, we accidentally wrap such requests into a negative size, and fail with EIO instead of performing the intended operation. The bug is visible in modern systems with something as simple as: $ qemu-img create -f qcow2 /tmp/image.img 5G $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img $ sudo blkdiscard /dev/nbd0 or with user-space only: $ truncate --size=3G file $ qemu-nbd -f raw file $ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)' Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0 on success, this is also a good time to fix our code to a more robust paradigm that treats all non-negative values as success. Alas, our iotests do not currently make it easy to add external dependencies on blkdiscard or nbdsh, so we have to rely on manual testing for now. This patch can be reverted when we later improve the overall block layer to be 64-bit clean, but for now, a minimal fix was deemed less risky prior to release. CC: qemu-stable@nongnu.org Fixes: 1f4d6d18ed Fixes: 1c6c4bb7f0 Fixes: https://github.com/systemd/systemd/issues/16242 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200722212231.535072-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rework success tests to use >=0]
2020-07-28iotests/197: Fix for non-qcow2 formatsMax Reitz2-4/+6
While 197 is very much a qcow2 test, and it looks like the partial cluster case at the end (introduced in b0ddcbbb36a66a6) is specifically a qcow2 case, the whole test scripts actually marks itself to work with generic formats (and generic protocols, even). Said partial cluster case happened to work with non-qcow2 formats as well (mostly by accident), but 1855536256 broke that, because it sets the compat option, which does not work for non-qcow2 formats. So go the whole way and force IMGFMT=qcow2 and IMGPROTO=file, as done in other places in this test. Fixes: 1855536256eb0a5708b04b85f744de69559ea323 ("iotests/197: Fix for compat=0.10") Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728131134.902519-1-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-28iotests/028: Add test for cross-base-EOF readsMax Reitz2-0/+30
Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728120806.265916-3-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Claudio Fontana <cfontana@suse.de>
2020-07-28block: Fix bdrv_aligned_p*v() for qiov_offset != 0Max Reitz1-4/+6
Since these functions take a @qiov_offset, they must always take it into account when working with @qiov. There are a couple of places where they do not, but they should. Fixes: 65cd4424b9df03bb5195351c33e04cbbecc0705c ("block/io: bdrv_aligned_preadv: use and support qiov_offset") Fixes: 28c4da28695bdbe04b336b2c9c463876cc3aaa6d ("block/io: bdrv_aligned_pwritev: use and support qiov_offset") Reported-by: Claudio Fontana <cfontana@suse.de> Reported-by: Bruce Rogers <brogers@suse.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728120806.265916-2-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Claudio Fontana <cfontana@suse.de> Tested-by: Bruce Rogers <brogers@suse.com>
2020-07-28net: forbid the reentrant RXJason Wang1-0/+3
The memory API allows DMA into NIC's MMIO area. This means the NIC's RX routine must be reentrant. Instead of auditing all the NIC, we can simply detect the reentrancy and return early. The queue->delivering is set and cleared by qemu_net_queue_deliver() for other queue helpers to know whether the delivering in on going (NIC's receive is being called). We can check it and return early in qemu_net_queue_flush() to forbid reentrant RX. Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-28virtio-net: check the existence of peer before accessing vDPA configJason Wang1-11/+19
We try to check whether a peer is VDPA in order to get config from there - with no peer, this leads to a NULL pointer dereference. Add a check before trying to access the peer type. No peer means not VDPA. Fixes: 108a64818e69b ("vhost-vdpa: introduce vhost-vdpa backend") Cc: Cindy Lu <lulu@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-28virtio-pci: fix wrong index in virtio_pci_queue_enabledYuri Benditovich1-1/+1
We should use the index passed by the caller instead of the queue_sel when checking the enablement of a specific virtqueue. This is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1702608 Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method") Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-07-27qga/qapi-schema: Document -1 for invalid PCI address fieldsThomas Huth1-1/+1
The "guest-get-fsinfo" could also be used for non-PCI devices in the future. And the code in GuestPCIAddress() in qga/commands-win32.c seems to be using "-1" for fields that it can not determine already. Thus let's properly document "-1" as value for invalid PCI address fields. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2020-07-27qga-win: fix "guest-get-fsinfo" wrong filesystem typeBasil Salman1-6/+23
This patch handles the case where unmounted volumes exist, where in that case GetVolumePathNamesForVolumeName returns empty path, GetVolumeInformation will use the current working directory instead. This patch fixes the issue by opening a handle to the volumes, and using GetVolumeInformationByHandleW instead. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1746667 Signed-off-by: Basil Salman <bsalman@redhat.com> Signed-off-by: Basil Salman <basil@daynix.com> *fix crash when guest_build_fsinfo() sets errp multiple times *make new error message more distinct from existing ones Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2020-07-27migration: Fix typos in bitmap migration commentsEric Blake1-2/+2
Noticed while reviewing the file for newer patches. Fixes: b35ebdf076 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727203206.134996-1-eblake@redhat.com>
2020-07-27iotests: Adjust which migration tests are quickEric Blake1-6/+6
A quick run of './check -qcow2 -g migration' shows that test 169 is NOT quick, but meanwhile several other tests ARE quick. Let's adjust the test designations accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727195117.132151-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-07-27qemu-iotests/199: add source-killed case to bitmaps postcopyVladimir Sementsov-Ogievskiy2-2/+17
Previous patches fixes behavior of bitmaps migration, so that errors are handled by just removing unfinished bitmaps, and not fail or try to recover postcopy migration. Add corresponding test. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-22-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: add early shutdown case to bitmaps postcopyVladimir Sementsov-Ogievskiy2-2/+26
Previous patches fixed two crashes which may occur on shutdown prior to bitmaps postcopy finished. Check that it works now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-21-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: check persistent bitmapsVladimir Sementsov-Ogievskiy1-1/+15
Check that persistent bitmaps are not stored on source and that bitmaps are persistent on destination. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-20-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: prepare for new test-cases additionVladimir Sementsov-Ogievskiy1-13/+23
Move future common part to start_postcopy() method. Move checking number of bitmaps to check_bitmap(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-19-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/savevm: don't worry if bitmap migration postcopy failedVladimir Sementsov-Ogievskiy1-5/+32
First, if only bitmaps postcopy is enabled (and not ram postcopy) postcopy_pause_incoming crashes on an assertion assert(mis->to_src_file). And anyway, bitmaps postcopy is not prepared to be somehow recovered. The original idea instead is that if bitmaps postcopy failed, we just lose some bitmaps, which is not critical. So, on failure we just need to remove unfinished bitmaps and guest should continue execution on destination. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-18-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: cancel migration on shutdownVladimir Sementsov-Ogievskiy3-0/+31
If target is turned off prior to postcopy finished, target crashes because busy bitmaps are found at shutdown. Canceling incoming migration helps, as it removes all unfinished (and therefore busy) bitmaps. Similarly on source we crash in bdrv_close_all which asserts that all bdrv states are removed, because bdrv states involved into dirty bitmap migration are referenced by it. So, we need to cancel outgoing migration as well. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-17-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: relax error handling in incoming partVladimir Sementsov-Ogievskiy1-37/+127
Bitmaps data is not critical, and we should not fail the migration (or use postcopy recovering) because of dirty-bitmaps migration failure. Instead we should just lose unfinished bitmaps. Still we have to report io stream violation errors, as they affect the whole migration stream. While touching this, tighten code that was previously blindly calling malloc on a size read from the migration stream, as a corrupted stream (perhaps from a malicious user) should not be able to convince us to allocate an inordinate amount of memory. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727194236.19551-16-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: typo fixes, enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: keep bitmap state for all bitmapsVladimir Sementsov-Ogievskiy1-21/+43
Keep bitmap state for disabled bitmaps too. Keep the state until the end of the process. It's needed for the following commit to implement bitmap postcopy canceling. To clean-up the new list the following logic is used: We need two events to consider bitmap migration finished: 1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received 2. dirty_bitmap_mig_before_vm_start should be called These two events may come in any order, so we understand which one is last, and on the last of them we remove bitmap migration state from the list. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-15-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: simplify dirty_bitmap_load_completeVladimir Sementsov-Ogievskiy1-21/+4
bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in postcopy, bitmap successor must be enabled, and reclaim operation will enable the bitmap. So, actually we need just call _reclaim_ in both if branches, and making differences only to add an assertion seems not really good. The logic becomes simple: on load complete we do reclaim and that's all. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-14-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename finish_lock to just lockVladimir Sementsov-Ogievskiy1-6/+6
finish_lock is bad name, as lock used not only on process end. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-13-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: refactor state global variablesVladimir Sementsov-Ogievskiy1-80/+99
Move all state variables into one global struct. Reduce global variable usage, utilizing opaque pointer where possible. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-12-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_initVladimir Sementsov-Ogievskiy3-8/+1
No reasons to keep two public init functions. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200727194236.19551-11-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanupVladimir Sementsov-Ogievskiy1-4/+4
Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to stress that it is on save part. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-10-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename state structure typesVladimir Sementsov-Ogievskiy1-33/+37
Rename types to be symmetrical for load/save part and shorter. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-9-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_startVladimir Sementsov-Ogievskiy1-1/+1
Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking is wrong as we do not already own the mutex. Moreover, the adjacent call to bdrv_dirty_bitmap_enable_successor grabs the mutex. Fixes: 58f72b965e9e1q Cc: qemu-stable@nongnu.org # v3.0 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-8-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: increase postcopy periodVladimir Sementsov-Ogievskiy1-19/+39
The test wants to force a bitmap postcopy. Still, the resulting postcopy period is very small. Let's increase it by adding more bitmaps to migrate. Also, test disabled bitmaps migration. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-7-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: change discard patternsVladimir Sementsov-Ogievskiy1-18/+26
iotest 199 works too long because of many discard operations. At the same time, postcopy period is very short, in spite of all these efforts. So, let's use less discards (and with more interesting patterns) to reduce test timing. In the next commit we'll increase postcopy period. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: improve performance: set bitmap by discardVladimir Sementsov-Ogievskiy1-11/+20
Discard dirties dirty-bitmap as well as write, but works faster. Let's use it instead. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: better catch postcopy timeVladimir Sementsov-Ogievskiy1-15/+57
The test aims to test _postcopy_ migration, and wants to do some write operations during postcopy time. Test considers migrate status=complete event on source as start of postcopy. This is completely wrong, completion is completion of the whole migration process. Let's instead consider destination start as start of postcopy, and use RESUME event for it. Next, as migration finish, let's use migration status=complete event on target, as such method is closer to what libvirt or another user will do, than tracking number of dirty-bitmaps. Finally, add a possibility to dump events for debug. And if set debug to True, we see, that actual postcopy period is very small relatively to the whole test duration time (~0.2 seconds to >40 seconds for me). This means, that test is very inefficient in what it supposed to do. Let's improve it in following commits. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: drop extra constraintsVladimir Sementsov-Ogievskiy1-2/+1
We don't need any specific format constraints here. Still keep qcow2 for two reasons: 1. No extra calls of format-unrelated test 2. Add some check around persistent bitmap in future (require qcow2) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qemu-iotests/199: fix styleVladimir Sementsov-Ogievskiy1-6/+7
Mostly, satisfy pep8 complaints. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-2-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27qcow2: Fix capitalization of header extension constant.Andrey Shinkevich2-2/+2
Make the capitalization of the hexadecimal numbers consistent for the QCOW2 header extension constants in docs/interop/qcow2.txt. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1594973699-781898-2-git-send-email-andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27linux-user: Use getcwd syscall directlyAndreas Schwab1-8/+1
The glibc getcwd function returns different errors than the getcwd syscall, which triggers an assertion failure in the glibc getcwd function when running under the emulation. When the syscall returns ENAMETOOLONG, the glibc wrapper uses a fallback implementation that potentially handles an unlimited path length, and returns with ERANGE if the provided buffer is too small. The qemu emulation cannot distinguish the two cases, and thus always returns ERANGE. This is unexpected by the glibc wrapper. Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <mvmmu3qplvi.fsf@suse.de> [lv: updated description] Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-07-27linux-user: Fix syscall rt_sigtimedwait() implementationFilip Bozuta1-1/+3
Implementation of 'rt_sigtimedwait()' in 'syscall.c' uses the function 'target_to_host_timespec()' to transfer the value of 'struct timespec' from target to host. However, the implementation doesn't check whether this conversion succeeds and thus can cause an unaproppriate error instead of the 'EFAULT (Bad address)' which is supposed to be set if the conversion from target to host fails. This was confirmed with the LTP test for rt_sigtimedwait: "/testcases/kernel/syscalls/rt_sigtimedwait/rt_sigtimedwait01.c" which causes an unapropriate error in test case "test_bad_adress3" which is run with a bad adress for the 'struct timespec' argument: FAIL: test_bad_address3 (349): Unexpected failure: EAGAIN/EWOULDBLOCK (11) The test fails with an unexptected errno 'EAGAIN/EWOULDBLOCK' instead of the expected EFAULT. After the changes from this patch, the test case is executed successfully along with the other LTP test cases for 'rt_sigtimedwait()': PASS: test_bad_address3 (349): Test passed Signed-off-by: Filip Bozuta <Filip.Bozuta@syrmia.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20200724181651.167819-1-Filip.Bozuta@syrmia.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-07-27linux-user: Ensure mmap_min_addr is non-zeroRichard Henderson1-2/+14
When the chroot does not have /proc mounted, we can read neither /proc/sys/vm/mmap_min_addr nor /proc/sys/maps. The enforcement of mmap_min_addr in the host kernel is done by the security module, and so does not apply to processes owned by root. Which leads pgd_find_hole_fallback to succeed in probing a reservation at address 0. Which confuses pgb_reserved_va to believe that guest_base has not actually been initialized. We don't actually want NULL addresses to become accessible, so make sure that mmap_min_addr is initialized with a non-zero value. Buglink: https://bugs.launchpad.net/qemu/+bug/1888728 Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20200724212314.545877-1-richard.henderson@linaro.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-07-27virtio-pci: fix virtio_pci_queue_enabled()Laurent Vivier3-2/+8
In legacy mode, virtio_pci_queue_enabled() falls back to virtio_queue_enabled() to know if the queue is enabled. But virtio_queue_enabled() calls again virtio_pci_queue_enabled() if k->queue_enabled is set. This ends in a crash after a stack overflow. The problem can be reproduced with "-device virtio-net-pci,disable-legacy=off,disable-modern=true -net tap,vhost=on" And a look to the backtrace is very explicit: ... #4 0x000000010029a438 in virtio_queue_enabled () #5 0x0000000100497a9c in virtio_pci_queue_enabled () ... #130902 0x000000010029a460 in virtio_queue_enabled () #130903 0x0000000100497a9c in virtio_pci_queue_enabled () #130904 0x000000010029a460 in virtio_queue_enabled () #130905 0x0000000100454a20 in vhost_net_start () ... This patch fixes the problem by introducing a new function for the legacy case and calls it from virtio_pci_queue_enabled(). It also calls it from virtio_queue_enabled() to avoid code duplication. Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method") Cc: Jason Wang <jasowang@redhat.com> Cc: Cindy Lu <lulu@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20200727153319.43716-1-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-27target/arm: Improve IMPDEF algorithm for IRGRichard Henderson1-7/+30
When GCR_EL1.RRND==1, the choosing of the random value is IMPDEF, and the kernel is not expected to have set RGSR_EL1. Force a non-zero value into SEED, so that we do not continually return the same tag. Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200724163853.504655-4-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-27hw/arm/boot: Fix MTE for EL3 direct kernel bootRichard Henderson1-0/+3
When booting an EL3 cpu with -kernel, we set up EL3 and then drop down to EL2. We need to enable access to v8.5-MemTag tag allocation at EL3 before doing so. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200724163853.504655-3-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>