summary refs log tree commit diff stats
Commit message (Collapse)AuthorAgeFilesLines
* trace/simple: Allow enabling simple traces from command lineJosh DuBois2020-07-291-0/+1
| | | | | | | | | | | | | | The simple trace backend is enabled / disabled with a call to st_set_trace_file_enabled(). When initializing tracing from the command-line, this must be enabled on startup. (Prior to db25d56c014aa1a9, command-line initialization of simple trace worked because every call to st_set_trace_file enabled tracing.) Fixes: db25d56c014aa1a96319c663e0a60346a223b31e Signed-off-by: Josh DuBois <josh@joshdubois.com> Message-id: 20200723053359.256928-1-josh@joshdubois.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Update version for v5.1.0-rc2 releasePeter Maydell2020-07-281-1/+1
| | | | Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-07-28' into ↵Peter Maydell2020-07-2810-89/+342
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging nbd patches for 2020-07-28 - fix NBD handling of trim/zero requests larger than 2G - allow no-op resizes on NBD (in turn fixing qemu-img convert -c into NBD) - several deadlock fixes when using NBD reconnect # gpg: Signature made Tue 28 Jul 2020 15:59:42 BST # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2020-07-28: block/nbd: nbd_co_reconnect_loop(): don't sleep if drained block/nbd: on shutdown terminate connection attempt block/nbd: allow drain during reconnect attempt block/nbd: split nbd_establish_connection out of nbd_client_connect iotests: Test convert to qcow2 compressed to NBD iotests: Add more qemu_img helpers iotests: Make qemu_nbd_popen() a contextmanager block: nbd: Fix convert qcow2 compressed to nbd nbd: Fix large trim/zero requests Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * block/nbd: nbd_co_reconnect_loop(): don't sleep if drainedVladimir Sementsov-Ogievskiy2020-07-281-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | We try to go to wakeable sleep, so that, if drain begins it will break the sleep. But what if nbd_client_co_drain_begin() already called and s->drained is already true? We'll go to sleep, and drain will have to wait for the whole timeout. Let's improve it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * block/nbd: on shutdown terminate connection attemptVladimir Sementsov-Ogievskiy2020-07-281-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On shutdown nbd driver may be in a connecting state. We should shutdown it as well, otherwise we may hang in nbd_teardown_connection, waiting for conneciton_co to finish in BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang. Note, that this hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: allow drain during reconnect attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * block/nbd: allow drain during reconnect attemptVladimir Sementsov-Ogievskiy2020-07-281-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It should be safe to reenter qio_channel_yield() on io/channel read/write path, so it's safe to reduce in_flight and allow attaching new aio context. And no problem to allow drain itself: connection attempt is not a guest request. Moreover, if remote server is down, we can hang in negotiation, blocking drain section and provoking a dead lock. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang trying to drain, for example, like this: #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at util/aio-posix.c:600 #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427 #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433 #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710 #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498 #8 bdrv_open_inherit (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0, flags=24578, parent=0x0, child_class=0x0, child_role=0, errp=0x7fffba154620) at block.c:3491 #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block.c:3513 #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block/block-backend.c:421 And connection_co stack like this: #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918, action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302 #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193 #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at io/channel.c:472 #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:110 #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:143 #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at io/channel.c:247 #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:365 #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:391 #8 nbd_start_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, structured_reply=true, zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904 #9 nbd_receive_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at nbd/client.c:1032 #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at block/nbd.c:1460 #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287 #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309 #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360 #14 coroutine_trampoline (i0=113190480, i1=22000) at util/coroutine-ucontext.c:173 Note, that the hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: on shutdown terminate connection attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * block/nbd: split nbd_establish_connection out of nbd_client_connectVladimir Sementsov-Ogievskiy2020-07-282-26/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are going to implement non-blocking version of nbd_establish_connection, which for a while will be used only for nbd_reconnect_attempt, not for nbd_open, so we need to call it separately. Refactor nbd_reconnect_attempt in a way which makes next commit simpler. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * iotests: Test convert to qcow2 compressed to NBDNir Soffer2020-07-283-0/+159
| | | | | | | | | | | | | | | | | | | | | | | | Add test for "qemu-img convert -O qcow2 -c" to NBD target. The tests     create a OVA file and write compressed qcow2 disk content directly into the OVA file via qemu-nbd. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-5-nsoffer@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * iotests: Add more qemu_img helpersNir Soffer2020-07-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 2 helpers for measuring and checking images: - qemu_img_measure() - qemu_img_check() Both use --output-json and parse the returned json to make easy to use in other tests. I'm going to use them in a new test, and I hope they will be useful in may other tests. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-4-nsoffer@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * iotests: Make qemu_nbd_popen() a contextmanagerNir Soffer2020-07-283-50/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of duplicating the code to wait until the server is ready and remember to terminate the server and wait for it, make it possible to use like this: with qemu_nbd_popen('-k', sock, image): # Access image via qemu-nbd socket... Only test 264 used this helper, but I had to modify the output since it did not consistently when starting and stopping qemu-nbd. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-3-nsoffer@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * block: nbd: Fix convert qcow2 compressed to nbdNir Soffer2020-07-282-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When converting to qcow2 compressed format, the last step is a special zero length compressed write, ending in a call to bdrv_co_truncate(). This call always fails for the nbd driver since it does not implement bdrv_co_truncate(). For block devices, which have the same limits, the call succeeds since the file driver implements bdrv_co_truncate(). If the caller asked to truncate to the same or smaller size with exact=false, the truncate succeeds. Implement the same logic for nbd. Example failing without this change: In one shell start qemu-nbd: $ truncate -s 1g test.tar $ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar In another shell convert an image to qcow2 compressed via NBD: $ echo "disk data" > disk.raw $ truncate -s 1g disk.raw $ qemu-img convert -f raw -O qcow2 -c disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $? 1 qemu-img failed, but the conversion was successful: $ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock image: nbd+unix://?socket=/tmp/nbd.sock file format: qcow2 virtual size: 1 GiB (1073741824 bytes) ... $ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock No errors were found on the image. 1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters Image end offset: 393216 $ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock Images are identical. Fixes: https://bugzilla.redhat.com/1860627 Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-2-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: typo fixes] Signed-off-by: Eric Blake <eblake@redhat.com>
| * nbd: Fix large trim/zero requestsEric Blake2020-07-281-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although qemu as NBD client limits requests to <2G, the NBD protocol allows clients to send requests almost all the way up to 4G. But because our block layer is not yet 64-bit clean, we accidentally wrap such requests into a negative size, and fail with EIO instead of performing the intended operation. The bug is visible in modern systems with something as simple as: $ qemu-img create -f qcow2 /tmp/image.img 5G $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img $ sudo blkdiscard /dev/nbd0 or with user-space only: $ truncate --size=3G file $ qemu-nbd -f raw file $ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)' Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0 on success, this is also a good time to fix our code to a more robust paradigm that treats all non-negative values as success. Alas, our iotests do not currently make it easy to add external dependencies on blkdiscard or nbdsh, so we have to rely on manual testing for now. This patch can be reverted when we later improve the overall block layer to be 64-bit clean, but for now, a minimal fix was deemed less risky prior to release. CC: qemu-stable@nongnu.org Fixes: 1f4d6d18ed Fixes: 1c6c4bb7f0 Fixes: https://github.com/systemd/systemd/issues/16242 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200722212231.535072-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rework success tests to use >=0]
* | Merge remote-tracking branch 'remotes/elmarco/tags/slirp-pull-request' into ↵Peter Maydell2020-07-282-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging slirp: update to latest stable-4.2 branch # gpg: Signature made Tue 28 Jul 2020 15:30:09 BST # gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5 # gpg: issuer "marcandre.lureau@redhat.com" # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full] # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full] # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/slirp-pull-request: slirp: update to latest stable-4.2 branch test-char: abort on serial test error Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | slirp: update to latest stable-4.2 branchMarc-André Lureau2020-07-281-0/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dr. David Alan Gilbert (1): ip_stripoptions use memmove Jindrich Novy (4): Fix possible infinite loops and use-after-free Use secure string copy to avoid overflow Be sure to initialize sockaddr structure Check lseek() for failure Marc-André Lureau (2): util: do not silently truncate Merge branch 'stable-4.2' into 'stable-4.2' Philippe Mathieu-Daudé (3): Fix win32 builds by using the SLIRP_PACKED definition Fix constness warnings Remove unnecessary break Ralf Haferkamp (2): Drop bogus IPv6 messages Fix MTU check Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
| * | test-char: abort on serial test errorMarc-André Lureau2020-07-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | We are having issues debugging and bisecting this issue that happen mostly on patchew. Let's make it abort where it failed to gather some new informations. Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
* | | Merge remote-tracking branch ↵Peter Maydell2020-07-286-31/+54
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/pmaydell/tags/pull-target-arm-20200727' into staging target-arm queue: * ACPI: Assert that we don't run out of the preallocated memory * hw/misc/aspeed_sdmc: Fix incorrect memory size * target/arm: Always pass cacheattr in S1_ptw_translate * docs/system/arm/virt: Document 'mte' machine option * hw/arm/boot: Fix PAUTH, MTE for EL3 direct kernel boot * target/arm: Improve IMPDEF algorithm for IRG # gpg: Signature made Mon 27 Jul 2020 16:18:38 BST # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20200727: target/arm: Improve IMPDEF algorithm for IRG hw/arm/boot: Fix MTE for EL3 direct kernel boot hw/arm/boot: Fix PAUTH for EL3 direct kernel boot docs/system/arm/virt: Document 'mte' machine option target/arm: Always pass cacheattr in S1_ptw_translate hw/misc/aspeed_sdmc: Fix incorrect memory size ACPI: Assert that we don't run out of the preallocated memory Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | target/arm: Improve IMPDEF algorithm for IRGRichard Henderson2020-07-271-7/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When GCR_EL1.RRND==1, the choosing of the random value is IMPDEF, and the kernel is not expected to have set RGSR_EL1. Force a non-zero value into SEED, so that we do not continually return the same tag. Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200724163853.504655-4-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | hw/arm/boot: Fix MTE for EL3 direct kernel bootRichard Henderson2020-07-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booting an EL3 cpu with -kernel, we set up EL3 and then drop down to EL2. We need to enable access to v8.5-MemTag tag allocation at EL3 before doing so. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200724163853.504655-3-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | hw/arm/boot: Fix PAUTH for EL3 direct kernel bootRichard Henderson2020-07-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booting an EL3 cpu with -kernel, we set up EL3 and then drop down to EL2. We need to enable access to v8.3-PAuth keys and instructions at EL3 before doing so. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200724163853.504655-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | docs/system/arm/virt: Document 'mte' machine optionPeter Maydell2020-07-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 6a0b7505f1fd6769c which added documentation of the virt board crossed in the post with commit 6f4e1405b91da0d0 which added a new 'mte' machine option. Update the docs to include the new option. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
| * | | target/arm: Always pass cacheattr in S1_ptw_translateRichard Henderson2020-07-271-13/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we changed the interface of get_phys_addr_lpae to require the cacheattr parameter, this spot was missed. The compiler is unable to detect the use of NULL vs the nonnull attribute here. Fixes: 7e98e21c098 Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Jan Kiszka <jan.kiskza@siemens.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | hw/misc/aspeed_sdmc: Fix incorrect memory sizePhilippe Mathieu-Daudé2020-07-271-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SDRAM Memory Controller has a 32-bit address bus, thus supports up to 4 GiB of DRAM. There is a signed to unsigned conversion error with the AST2600 maximum memory size: (uint64_t)(2048 << 20) = (uint64_t)(-2147483648) = 0xffffffff40000000 = 16 EiB - 2 GiB Fix by using the IEC suffixes which are usually safer, and add an assertion check to verify the memory is valid. This would have caught this bug: $ qemu-system-arm -M ast2600-evb qemu-system-arm: hw/misc/aspeed_sdmc.c:258: aspeed_sdmc_realize: Assertion `asc->max_ram_size < 4 * GiB' failed. Aborted (core dumped) Fixes: 1550d72679 ("aspeed/sdmc: Add AST2600 support") Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | ACPI: Assert that we don't run out of the preallocated memoryDongjiu Geng2020-07-271-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | data_length is a constant value, so we use assert instead of condition check. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Message-id: 20200622113146.33421-1-gengdongjiu@huawei.com Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | | | Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-07-28' ↵Peter Maydell2020-07-285-8/+42
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging Block patches for 5.1.0: - Fix block I/O for split transfers - Fix iotest 197 for non-qcow2 formats # gpg: Signature made Tue 28 Jul 2020 14:45:28 BST # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2020-07-28: iotests/197: Fix for non-qcow2 formats iotests/028: Add test for cross-base-EOF reads block: Fix bdrv_aligned_p*v() for qiov_offset != 0 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | | iotests/197: Fix for non-qcow2 formatsMax Reitz2020-07-282-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While 197 is very much a qcow2 test, and it looks like the partial cluster case at the end (introduced in b0ddcbbb36a66a6) is specifically a qcow2 case, the whole test scripts actually marks itself to work with generic formats (and generic protocols, even). Said partial cluster case happened to work with non-qcow2 formats as well (mostly by accident), but 1855536256 broke that, because it sets the compat option, which does not work for non-qcow2 formats. So go the whole way and force IMGFMT=qcow2 and IMGPROTO=file, as done in other places in this test. Fixes: 1855536256eb0a5708b04b85f744de69559ea323 ("iotests/197: Fix for compat=0.10") Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728131134.902519-1-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
| * | | | iotests/028: Add test for cross-base-EOF readsMax Reitz2020-07-282-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728120806.265916-3-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Claudio Fontana <cfontana@suse.de>
| * | | | block: Fix bdrv_aligned_p*v() for qiov_offset != 0Max Reitz2020-07-281-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since these functions take a @qiov_offset, they must always take it into account when working with @qiov. There are a couple of places where they do not, but they should. Fixes: 65cd4424b9df03bb5195351c33e04cbbecc0705c ("block/io: bdrv_aligned_preadv: use and support qiov_offset") Fixes: 28c4da28695bdbe04b336b2c9c463876cc3aaa6d ("block/io: bdrv_aligned_pwritev: use and support qiov_offset") Reported-by: Claudio Fontana <cfontana@suse.de> Reported-by: Bruce Rogers <brogers@suse.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728120806.265916-2-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Claudio Fontana <cfontana@suse.de> Tested-by: Bruce Rogers <brogers@suse.com>
* | | | | Merge remote-tracking branch ↵Peter Maydell2020-07-282-11/+18
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/vivier2/tags/linux-user-for-5.1-pull-request' into staging linux-user 20200728 Fix "pgb_reserved_va: Assertion `guest_base != 0' failed." error Fix rt_sigtimedwait() errno Fix getcwd() errno # gpg: Signature made Tue 28 Jul 2020 13:34:11 BST # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/linux-user-for-5.1-pull-request: linux-user: Use getcwd syscall directly linux-user: Fix syscall rt_sigtimedwait() implementation linux-user: Ensure mmap_min_addr is non-zero Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | | | linux-user: Use getcwd syscall directlyAndreas Schwab2020-07-271-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The glibc getcwd function returns different errors than the getcwd syscall, which triggers an assertion failure in the glibc getcwd function when running under the emulation. When the syscall returns ENAMETOOLONG, the glibc wrapper uses a fallback implementation that potentially handles an unlimited path length, and returns with ERANGE if the provided buffer is too small. The qemu emulation cannot distinguish the two cases, and thus always returns ERANGE. This is unexpected by the glibc wrapper. Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <mvmmu3qplvi.fsf@suse.de> [lv: updated description] Signed-off-by: Laurent Vivier <laurent@vivier.eu>
| * | | | | linux-user: Fix syscall rt_sigtimedwait() implementationFilip Bozuta2020-07-271-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implementation of 'rt_sigtimedwait()' in 'syscall.c' uses the function 'target_to_host_timespec()' to transfer the value of 'struct timespec' from target to host. However, the implementation doesn't check whether this conversion succeeds and thus can cause an unaproppriate error instead of the 'EFAULT (Bad address)' which is supposed to be set if the conversion from target to host fails. This was confirmed with the LTP test for rt_sigtimedwait: "/testcases/kernel/syscalls/rt_sigtimedwait/rt_sigtimedwait01.c" which causes an unapropriate error in test case "test_bad_adress3" which is run with a bad adress for the 'struct timespec' argument: FAIL: test_bad_address3 (349): Unexpected failure: EAGAIN/EWOULDBLOCK (11) The test fails with an unexptected errno 'EAGAIN/EWOULDBLOCK' instead of the expected EFAULT. After the changes from this patch, the test case is executed successfully along with the other LTP test cases for 'rt_sigtimedwait()': PASS: test_bad_address3 (349): Test passed Signed-off-by: Filip Bozuta <Filip.Bozuta@syrmia.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20200724181651.167819-1-Filip.Bozuta@syrmia.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
| * | | | | linux-user: Ensure mmap_min_addr is non-zeroRichard Henderson2020-07-271-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the chroot does not have /proc mounted, we can read neither /proc/sys/vm/mmap_min_addr nor /proc/sys/maps. The enforcement of mmap_min_addr in the host kernel is done by the security module, and so does not apply to processes owned by root. Which leads pgd_find_hole_fallback to succeed in probing a reservation at address 0. Which confuses pgb_reserved_va to believe that guest_base has not actually been initialized. We don't actually want NULL addresses to become accessible, so make sure that mmap_min_addr is initialized with a non-zero value. Buglink: https://bugs.launchpad.net/qemu/+bug/1888728 Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20200724212314.545877-1-richard.henderson@linaro.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* | | | | | Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell2020-07-283-12/+23
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging Want to send earlier but most patches just come. - fix vhost-vdpa issues when no peer - fix virtio-pci queue enabling index value - forbid reentrant RX Changes from V1: - drop the patch that has been merged # gpg: Signature made Tue 28 Jul 2020 09:59:41 BST # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: net: forbid the reentrant RX virtio-net: check the existence of peer before accessing vDPA config virtio-pci: fix wrong index in virtio_pci_queue_enabled Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | | | | net: forbid the reentrant RXJason Wang2020-07-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memory API allows DMA into NIC's MMIO area. This means the NIC's RX routine must be reentrant. Instead of auditing all the NIC, we can simply detect the reentrancy and return early. The queue->delivering is set and cleared by qemu_net_queue_deliver() for other queue helpers to know whether the delivering in on going (NIC's receive is being called). We can check it and return early in qemu_net_queue_flush() to forbid reentrant RX. Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | | | | | virtio-net: check the existence of peer before accessing vDPA configJason Wang2020-07-281-11/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We try to check whether a peer is VDPA in order to get config from there - with no peer, this leads to a NULL pointer dereference. Add a check before trying to access the peer type. No peer means not VDPA. Fixes: 108a64818e69b ("vhost-vdpa: introduce vhost-vdpa backend") Cc: Cindy Lu <lulu@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
| * | | | | | virtio-pci: fix wrong index in virtio_pci_queue_enabledYuri Benditovich2020-07-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should use the index passed by the caller instead of the queue_sel when checking the enablement of a specific virtqueue. This is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1702608 Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method") Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* | | | | | | Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2020-07-27-tag' ↵Peter Maydell2020-07-282-7/+24
|\ \ \ \ \ \ \ | |_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging qemu-ga patch queue for hard-freeze * document use of -1 when pci_controller field can't be retrieved for guest-get-fsinfo * fix incorrect filesystem type reporting on w32 for guest-get-fsinfo when a volume is not mounted # gpg: Signature made Tue 28 Jul 2020 00:16:50 BST # gpg: using RSA key CEACC9E15534EBABB82D3FA03353C9CEF108B584 # gpg: issuer "mdroth@linux.vnet.ibm.com" # gpg: Good signature from "Michael Roth <flukshun@gmail.com>" [full] # gpg: aka "Michael Roth <mdroth@utexas.edu>" [full] # gpg: aka "Michael Roth <mdroth@linux.vnet.ibm.com>" [full] # Primary key fingerprint: CEAC C9E1 5534 EBAB B82D 3FA0 3353 C9CE F108 B584 * remotes/mdroth/tags/qga-pull-2020-07-27-tag: qga/qapi-schema: Document -1 for invalid PCI address fields qga-win: fix "guest-get-fsinfo" wrong filesystem type Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | | | | qga/qapi-schema: Document -1 for invalid PCI address fieldsThomas Huth2020-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "guest-get-fsinfo" could also be used for non-PCI devices in the future. And the code in GuestPCIAddress() in qga/commands-win32.c seems to be using "-1" for fields that it can not determine already. Thus let's properly document "-1" as value for invalid PCI address fields. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
| * | | | | | qga-win: fix "guest-get-fsinfo" wrong filesystem typeBasil Salman2020-07-271-6/+23
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch handles the case where unmounted volumes exist, where in that case GetVolumePathNamesForVolumeName returns empty path, GetVolumeInformation will use the current working directory instead. This patch fixes the issue by opening a handle to the volumes, and using GetVolumeInformationByHandleW instead. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1746667 Signed-off-by: Basil Salman <bsalman@redhat.com> Signed-off-by: Basil Salman <basil@daynix.com> *fix crash when guest_build_fsinfo() sets errp multiple times *make new error message more distinct from existing ones Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* | | | | | Merge remote-tracking branch 'remotes/ericb/tags/pull-bitmaps-2020-07-27' ↵Peter Maydell2020-07-289-244/+555
|\ \ \ \ \ \ | |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging bitmaps patches for 2020-07-27 - Improve handling of various post-copy bitmap migration scenarios. A lost bitmap should merely mean that the next backup must be full rather than incremental, rather than abruptly breaking the entire guest migration. - Associated iotest improvements # gpg: Signature made Mon 27 Jul 2020 21:46:17 BST # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-bitmaps-2020-07-27: (24 commits) migration: Fix typos in bitmap migration comments iotests: Adjust which migration tests are quick qemu-iotests/199: add source-killed case to bitmaps postcopy qemu-iotests/199: add early shutdown case to bitmaps postcopy qemu-iotests/199: check persistent bitmaps qemu-iotests/199: prepare for new test-cases addition migration/savevm: don't worry if bitmap migration postcopy failed migration/block-dirty-bitmap: cancel migration on shutdown migration/block-dirty-bitmap: relax error handling in incoming part migration/block-dirty-bitmap: keep bitmap state for all bitmaps migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete migration/block-dirty-bitmap: rename finish_lock to just lock migration/block-dirty-bitmap: refactor state global variables migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup migration/block-dirty-bitmap: rename state structure types migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start qemu-iotests/199: increase postcopy period qemu-iotests/199: change discard patterns qemu-iotests/199: improve performance: set bitmap by discard ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | | | migration: Fix typos in bitmap migration commentsEric Blake2020-07-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Noticed while reviewing the file for newer patches. Fixes: b35ebdf076 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727203206.134996-1-eblake@redhat.com>
| * | | | | iotests: Adjust which migration tests are quickEric Blake2020-07-271-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A quick run of './check -qcow2 -g migration' shows that test 169 is NOT quick, but meanwhile several other tests ARE quick. Let's adjust the test designations accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727195117.132151-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
| * | | | | qemu-iotests/199: add source-killed case to bitmaps postcopyVladimir Sementsov-Ogievskiy2020-07-272-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous patches fixes behavior of bitmaps migration, so that errors are handled by just removing unfinished bitmaps, and not fail or try to recover postcopy migration. Add corresponding test. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-22-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | qemu-iotests/199: add early shutdown case to bitmaps postcopyVladimir Sementsov-Ogievskiy2020-07-272-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous patches fixed two crashes which may occur on shutdown prior to bitmaps postcopy finished. Check that it works now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-21-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | qemu-iotests/199: check persistent bitmapsVladimir Sementsov-Ogievskiy2020-07-271-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check that persistent bitmaps are not stored on source and that bitmaps are persistent on destination. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-20-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | qemu-iotests/199: prepare for new test-cases additionVladimir Sementsov-Ogievskiy2020-07-271-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move future common part to start_postcopy() method. Move checking number of bitmaps to check_bitmap(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-19-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | migration/savevm: don't worry if bitmap migration postcopy failedVladimir Sementsov-Ogievskiy2020-07-271-5/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First, if only bitmaps postcopy is enabled (and not ram postcopy) postcopy_pause_incoming crashes on an assertion assert(mis->to_src_file). And anyway, bitmaps postcopy is not prepared to be somehow recovered. The original idea instead is that if bitmaps postcopy failed, we just lose some bitmaps, which is not critical. So, on failure we just need to remove unfinished bitmaps and guest should continue execution on destination. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-18-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | migration/block-dirty-bitmap: cancel migration on shutdownVladimir Sementsov-Ogievskiy2020-07-273-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If target is turned off prior to postcopy finished, target crashes because busy bitmaps are found at shutdown. Canceling incoming migration helps, as it removes all unfinished (and therefore busy) bitmaps. Similarly on source we crash in bdrv_close_all which asserts that all bdrv states are removed, because bdrv states involved into dirty bitmap migration are referenced by it. So, we need to cancel outgoing migration as well. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-17-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | migration/block-dirty-bitmap: relax error handling in incoming partVladimir Sementsov-Ogievskiy2020-07-271-37/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bitmaps data is not critical, and we should not fail the migration (or use postcopy recovering) because of dirty-bitmaps migration failure. Instead we should just lose unfinished bitmaps. Still we have to report io stream violation errors, as they affect the whole migration stream. While touching this, tighten code that was previously blindly calling malloc on a size read from the migration stream, as a corrupted stream (perhaps from a malicious user) should not be able to convince us to allocate an inordinate amount of memory. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727194236.19551-16-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: typo fixes, enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | migration/block-dirty-bitmap: keep bitmap state for all bitmapsVladimir Sementsov-Ogievskiy2020-07-271-21/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep bitmap state for disabled bitmaps too. Keep the state until the end of the process. It's needed for the following commit to implement bitmap postcopy canceling. To clean-up the new list the following logic is used: We need two events to consider bitmap migration finished: 1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received 2. dirty_bitmap_mig_before_vm_start should be called These two events may come in any order, so we understand which one is last, and on the last of them we remove bitmap migration state from the list. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-15-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
| * | | | | migration/block-dirty-bitmap: simplify dirty_bitmap_load_completeVladimir Sementsov-Ogievskiy2020-07-271-21/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in postcopy, bitmap successor must be enabled, and reclaim operation will enable the bitmap. So, actually we need just call _reclaim_ in both if branches, and making differences only to add an assertion seems not really good. The logic becomes simple: on load complete we do reclaim and that's all. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-14-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>