focaccia-qemu

Date	Commit message	Author	Files	Lines
2019-10-15	hw/timer/cmsdk-apb-dualtimer.c: Switch to transaction-based ptimer API	Peter Maydell	1	-3/+11
	Switch the cmsdk-apb-dualtimer code away from bottom-half based ptimers to the new transaction-based ptimer API. This just requires adding begin/commit calls around the various places that modify the ptimer state, and using the new ptimer_init() function to create the timer. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-9-peter.maydell@linaro.org
2019-10-15	hw/timer/arm_mptimer.c: Switch to transaction-based ptimer API	Peter Maydell	1	-3/+11
	Switch the arm_mptimer.c code away from bottom-half based ptimers to the new transaction-based ptimer API. This just requires adding begin/commit calls around the various places that modify the ptimer state, and using the new ptimer_init() function to create the timer. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-8-peter.maydell@linaro.org
2019-10-15	hw/timer/allwinner-a10-pit.c: Switch to transaction-based ptimer API	Peter Maydell	1	-4/+8
	Switch the allwinner-a10-pit code away from bottom-half based ptimers to the new transaction-based ptimer API. This just requires adding begin/commit calls around the various places that modify the ptimer state, and using the new ptimer_init() function to create the timer. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-7-peter.maydell@linaro.org
2019-10-15	hw/arm/musicpal.c: Switch to transaction-based ptimer API	Peter Maydell	1	-6/+10
	Switch the musicpal code away from bottom-half based ptimers to the new transaction-based ptimer API. This just requires adding begin/commit calls around the various places that modify the ptimer state, and using the new ptimer_init() function to create the timer. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-6-peter.maydell@linaro.org
2019-10-15	hw/timer/arm_timer.c: Switch to transaction-based ptimer API	Peter Maydell	1	-5/+11
	Switch the arm_timer.c code away from bottom-half based ptimers to the new transaction-based ptimer API. This just requires adding begin/commit calls around the various arms of arm_timer_write() that modify the ptimer state, and using the new ptimer_init() function to create the timer. Fixes: https://bugs.launchpad.net/qemu/+bug/1777777 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-5-peter.maydell@linaro.org
2019-10-15	tests/ptimer-test: Switch to transaction-based ptimer API	Peter Maydell	1	-22/+84
	Convert the ptimer test cases to the transaction-based ptimer API, by changing to ptimer_init(), dropping the now-unused QEMUBH variables, and surrounding each set of changes to the ptimer state in ptimer_transaction_begin/commit calls. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-4-peter.maydell@linaro.org
2019-10-15	ptimer: Provide new transaction-based API	Peter Maydell	2	-15/+209
	Provide the new transaction-based API. If a ptimer is created using ptimer_init() rather than ptimer_init_with_bh(), then instead of providing a QEMUBH, it provides a pointer to the callback function directly, and has opted into the transaction API. All calls to functions which modify ptimer state (ptimer_set_period(), ptimer_set_freq(), ptimer_set_limit(), ptimer_set_count(), ptimer_run(), ptimer_stop()) must be between matched calls to ptimer_transaction_begin() and ptimer_transaction_commit(). When ptimer_transaction_commit() is called it will evaluate the state of the timer after all the changes in the transaction, and call the callback if necessary. In the old API the individual update functions generally would call ptimer_trigger() immediately, which would schedule the QEMUBH. In the new API the update functions will instead defer the "set s->next_event and call ptimer_reload()" work to ptimer_transaction_commit(). Because ptimer_trigger() can now immediately call into the device code which may then call other ptimer functions that update ptimer_state fields, we must be more careful in ptimer_reload() not to cache fields from ptimer_state across the ptimer_trigger() call. (This was harmless with the QEMUBH mechanism as the BH would not be invoked until much later.) We use assertions to check that: (1) the functions modifying ptimer state are not called outside a transaction block; (2) ptimer_transaction_begin() and _commit() calls are paired; (3) the transaction API is not used with a QEMUBH ptimer. There is some slight repetition of code: (1) most of the set functions have similar looking "if s->bh call ptimer_reload, otherwise set s->need_reload" code; (2) ptimer_init() and ptimer_init_with_bh() have similar code. We deliberately don't try to avoid this repetition, because it will all be deleted when the QEMUBH version of the API is removed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-3-peter.maydell@linaro.org
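The begin/commit discipline described in this commit can be illustrated with a toy model. The sketch below is Python, not QEMU code: the class `ToyTimer`, its fields, and its "fire when enabled with a zero count" rule are all assumptions made for illustration and are far simpler than the real ptimer. It only shows the shape of the idea: state changes are rejected outside a transaction, and the callback is invoked directly from the commit rather than from a bottom half.

```python
class ToyTimer:
    """Toy model of a transaction-based timer (hypothetical, not QEMU's ptimer)."""

    def __init__(self, callback):
        self.callback = callback      # called directly, no bottom half involved
        self.in_transaction = False
        self.need_reload = False
        self.count = 0
        self.limit = 0
        self.enabled = False

    def transaction_begin(self):
        if self.in_transaction:
            raise RuntimeError("begin/commit calls must be paired")
        self.in_transaction = True

    def _modify(self):
        # Every state-changing call must sit inside a transaction.
        if not self.in_transaction:
            raise RuntimeError("timer state changed outside a transaction")
        self.need_reload = True

    def set_limit(self, limit):
        self._modify()
        self.limit = limit

    def set_count(self, count):
        self._modify()
        self.count = count

    def run(self):
        self._modify()
        self.enabled = True

    def stop(self):
        self._modify()
        self.enabled = False

    def transaction_commit(self):
        if not self.in_transaction:
            raise RuntimeError("commit without begin")
        # Evaluate the state once, after all changes in the transaction.
        if self.need_reload and self.enabled and self.count == 0:
            self.count = self.limit
            self.callback()
        self.need_reload = False
        self.in_transaction = False


fired = []
timer = ToyTimer(lambda: fired.append("tick"))

timer.transaction_begin()
timer.set_limit(100)
timer.set_count(0)
timer.run()
fired_mid_transaction = list(fired)   # still empty: nothing fires before commit
timer.transaction_commit()
```

Because the callback runs only at commit time, code that reads the timer's fields between two updates never observes a half-updated state, which is the inconsistency the bottom-half design allowed.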
2019-10-15	ptimer: Rename ptimer_init() to ptimer_init_with_bh()	Peter Maydell	31	-54/+56
	Currently the ptimer design uses a QEMU bottom-half as its mechanism for calling back into the device model using the ptimer when the timer has expired. Unfortunately this design is fatally flawed, because it means that there is a lag between the ptimer updating its own state and the device callback function updating device state, and guest accesses to device registers between the two can return inconsistent device state. We want to replace the bottom-half design with one where the guest device's callback is called either immediately (when the ptimer triggers by timeout) or when the device model code closes a transaction-begin/end section (when the ptimer triggers because the device model changed the ptimer's count value or other state). As the first step, rename ptimer_init() to ptimer_init_with_bh(), to free up the ptimer_init() name for the new API. We can then convert all the ptimer users away from ptimer_init_with_bh() before removing it entirely. (Commit created with git grep -l ptimer_init | xargs sed -i -e 's/ptimer_init/ptimer_init_with_bh/' and three overlong lines folded by hand.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191008171740.9679-2-peter.maydell@linaro.org
2019-10-15	ARM: KVM: Check KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 for smp_cpus > 256	Eric Auger	1	-1/+9
	Host kernels within [4.18, 5.3] report an erroneous KVM_MAX_VCPUS=512 for ARM. The actual capability to instantiate more than 256 vcpus was fixed in 5.4 with the upgrade of the KVM_IRQ_LINE ABI to support a vcpu id encoded on 12 bits instead of 8 and a redistributor consuming a single KVM IO device instead of 2. So let's check this capability when attempting to use more than 256 vcpus within any ARM kvm accelerated machine. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-id: 20191003154640.22451-4-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-15	intc/arm_gic: Support IRQ injection for more than 256 vpus	Eric Auger	4	-11/+19
	Host kernels that expose the KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 capability allow injection of interrupts along with vcpu ids larger than 255. Let's encode the vcpu id on 12 bits according to the upgraded KVM_IRQ_LINE ABI when needed. Given that we have two callsites that need to assemble the value for kvm_set_irq(), a new helper routine, kvm_arm_set_irq, is introduced. Without this patch, QEMU exits with a "kvm_set_irq: Invalid argument" message. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-id: 20191003154640.22451-3-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
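As a rough illustration of what "encode the vcpu id on 12 bits" means, the Python sketch below packs an interrupt type, a vcpu id, and an interrupt number into one 32-bit value. The function name `encode_irq` is hypothetical, and the bit positions (type at bit 24, low 8 vcpu bits at bit 16, upper 4 vcpu bits at bit 28) reflect my reading of the kernel's KVM_IRQ_LINE documentation; treat them as assumptions to check against the headers rather than as a statement of what kvm_arm_set_irq does.

```python
# Assumed field positions; verify against the kernel's KVM API documentation.
IRQ_TYPE_SHIFT = 24
VCPU_SHIFT = 16       # low 8 bits of the vcpu id
VCPU2_SHIFT = 28      # upper 4 bits of the vcpu id (layout 2 only)


def encode_irq(irq_type, vcpu_id, irq_num, layout2_supported):
    """Pack the fields into one value (hypothetical helper for illustration)."""
    if not 0 <= vcpu_id < (1 << 12):
        raise ValueError("vcpu id does not fit in 12 bits")
    if vcpu_id > 255 and not layout2_supported:
        raise ValueError("vcpu ids above 255 need KVM_CAP_ARM_IRQ_LINE_LAYOUT_2")
    value = (irq_type << IRQ_TYPE_SHIFT) | irq_num
    value |= (vcpu_id % 256) << VCPU_SHIFT      # low 8 bits
    value |= (vcpu_id // 256) << VCPU2_SHIFT    # remaining upper 4 bits
    return value


small = encode_irq(1, 3, 32, layout2_supported=False)
large = encode_irq(1, 300, 32, layout2_supported=True)
```

With an 8-bit field only, vcpu 300 cannot be represented at all, which is why the capability has to be checked before more than 256 vcpus are used.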
2019-10-15	linux headers: update against v5.4-rc1	Eric Auger	32	-59/+406
	Update the headers against commit: 0f1a7b3fac05 ("timer-of: don't use conditional expression with mixed 'void' types") Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-id: 20191003154640.22451-2-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-15	trace: avoid "is" with a literal Python 3.8 warnings	Stefan Hajnoczi	1	-2/+2
	The following statement produces a SyntaxWarning with Python 3.8: if len(format) is 0: scripts/tracetool/__init__.py:459: SyntaxWarning: "is" with a literal. Did you mean "=="? Use the conventional len(x) == 0 syntax instead. Reported-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20191010122154.10553-1-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
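A minimal sketch of the conventional form follows. The function name `format_is_empty` is made up for this example and is not the code in scripts/tracetool/__init__.py; it only shows a length being compared by value with `==` instead of by identity with `is`.

```python
def format_is_empty(fmt):
    """Return True when the format string has no characters (hypothetical helper)."""
    # Compare by value. An identity test against a literal is what Python 3.8
    # flags with a SyntaxWarning, because identity of small integers is an
    # implementation detail rather than a language guarantee.
    return len(fmt) == 0


empty_result = format_is_empty("")
nonempty_result = format_is_empty("%d")
```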
2019-10-15	trace: add --group=all to tracing.txt	Stefan Hajnoczi	1	-1/+2
	tracetool needs to know the group name ("all", "root", or a specific subdirectory). Also remove the stdin redirection because tracetool.py needs the path to the trace-events file. Update the documentation. Fixes: 2098c56a9bc5901e145fa5d4759f075808811685 ("trace: move setting of group name into Makefiles") Buglink: https://bugs.launchpad.net/bugs/1844814 Reported-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191009135154.10970-1-stefanha@redhat.com>
2019-10-14	iotests: Test large write request to qcow2 file	Max Reitz	3	-0/+93
	Without HEAD^, the following happens when you attempt a large write request to a qcow2 file such that the number of bytes covered by all clusters involved in a single allocation will exceed INT_MAX: (A) handle_alloc_space() decides to fill the whole area with zeroes and fails because bdrv_co_pwrite_zeroes() fails (the request is too large). (B) If handle_alloc_space() does not do anything, but merge_cow() decides that the requests can be merged, it will create a too long IOV that later cannot be written. (C) Otherwise, all parts will be written separately, so those requests will work. In either B or C, though, qcow2_alloc_cluster_link_l2() will have an overflow: We use an int (i) to iterate over nb_clusters, and then calculate the L2 entry based on "i << s->cluster_bits" -- which will overflow if the range covers more than INT_MAX bytes. This then leads to image corruption because the L2 entry will be wrong (it will be recognized as a compressed cluster). Even if that were not the case, the .cow_end area would be empty (because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so their difference (which is the .cow_end size) will be 0). So this test checks that on such large requests, the image will not be corrupted. Unfortunately, we cannot check whether COW will be handled correctly, because that data is discarded when it is written to null-co (but we have to use null-co, because writing 2 GB of data in a test is not quite reasonable). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	qcow2: Limit total allocation range to INT_MAX	Max Reitz	1	-1/+4
	When the COW areas are included, the size of an allocation can exceed INT_MAX. This is kind of limited by handle_alloc() in that it already caps avail_bytes at INT_MAX, but the number of clusters still reflects the original length. This can have all sorts of effects, ranging from the storage layer write call failing to image corruption. (If there were no image corruption, then I suppose there would be data loss because the .cow_end area is forced to be empty, even though there might be something we need to COW.) Fix all of it by limiting nb_clusters so the equivalent number of bytes will not exceed INT_MAX. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
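The overflow and the clamp can be worked through with small numbers. The Python sketch below emulates a 32-bit signed integer by hand, since Python integers do not overflow; `to_int32` and `limit_clusters` are names invented for this example, and the wrap-around shown is what a typical two's-complement machine produces rather than something the C standard guarantees.

```python
INT_MAX = 2**31 - 1


def to_int32(value):
    """Wrap a Python integer the way a 32-bit two's-complement int would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def limit_clusters(nb_clusters, cluster_bits):
    """Cap the cluster count so the covered byte range fits in INT_MAX."""
    return min(nb_clusters, INT_MAX >> cluster_bits)


cluster_bits = 16        # 64 KiB clusters
nb_clusters = 40000      # covers more than INT_MAX bytes

# Offset of the last cluster, computed as an int would compute "i << cluster_bits".
wrapped_offset = to_int32((nb_clusters - 1) << cluster_bits)

capped = limit_clusters(nb_clusters, cluster_bits)
safe_offset = to_int32((capped - 1) << cluster_bits)
```

Here the uncapped offset wraps to a negative number, while the capped count keeps every offset within range.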
2019-10-14	qemu-nbd: Support help options for --object	Kevin Wolf	1	-1/+8
	Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-nbd, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2019-10-14	qemu-img: Support help options for --object	Kevin Wolf	1	-13/+21
	Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-img, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2019-10-14	qemu-io: Support help options for --object	Kevin Wolf	1	-1/+8
	Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-io, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2019-10-14	vl: Split off user_creatable_print_help()	Kevin Wolf	3	-51/+74
	Printing help for --object is something that we not only want in the system emulator, but also in tools that support --object. Move it into a separate function in qom/object_interfaces.c to make the code accessible for tools. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2019-10-14	iotests/028: Fix for long $TEST_DIRs	Max Reitz	2	-4/+8
	For long test image paths, the order of the "Formatting" line and the "(qemu)" prompt after a drive_backup HMP command may be reversed. In fact, the interaction between the prompt and the line may lead to the "Formatting" line not being greppable at all after "read"-ing it (if the prompt injects an IFS character into the "Formatting" string). So just wait until we get a prompt. At that point, the block job must have been started, so "info block-jobs" will only return "No active jobs" once it is done. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	block: Reject misaligned write requests with BDRV_REQ_NO_FALLBACK	Alberto Garcia	4	-0/+70
	The BDRV_REQ_NO_FALLBACK flag means that an operation should only be performed if it can be offloaded or otherwise performed efficiently. However a misaligned write request requires a RMW so we should return an error and let the caller decide how to proceed. This hits an assertion since commit c8bb23cbdb if the required alignment is larger than the cluster size: qemu-img create -f qcow2 -o cluster_size=2k img.qcow2 4G qemu-io -c "open -o driver=qcow2,file.align=4k blkdebug::img.qcow2" \ -c 'write 0 512' qemu-io: block/io.c:1127: bdrv_driver_pwritev: Assertion `!(flags & BDRV_REQ_NO_FALLBACK)' failed. Aborted The reason is that when writing to an unallocated cluster we try to skip the copy-on-write part and zeroize it using BDRV_REQ_NO_FALLBACK instead, resulting in a write request that is too small (2KB cluster size vs 4KB required alignment). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
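The rule described here, fail early instead of falling back to a read-modify-write, can be sketched as a small check. Everything in the Python below is illustrative: `check_write` is a hypothetical function, the flag's bit value is a placeholder, and returning a negative ENOTSUP is an assumption about the error code rather than something this log states.

```python
import errno

BDRV_REQ_NO_FALLBACK = 1 << 0   # placeholder bit value for this sketch only


def check_write(offset, length, alignment, flags):
    """Return 0 if the request may proceed, or a negative errno if rejected."""
    misaligned = offset % alignment != 0 or length % alignment != 0
    if misaligned and (flags & BDRV_REQ_NO_FALLBACK):
        # A misaligned write needs read-modify-write, which is not "efficient",
        # so report an error and let the caller decide how to proceed.
        return -errno.ENOTSUP
    return 0


# The failing case from the commit message: 512 bytes against a 4k alignment.
rejected = check_write(0, 512, 4096, BDRV_REQ_NO_FALLBACK)
allowed_without_flag = check_write(0, 512, 4096, 0)
allowed_aligned = check_write(0, 4096, 4096, BDRV_REQ_NO_FALLBACK)
```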
2019-10-14	replay: add BH oneshot event for block layer	Pavel Dovgalyuk	13	-16/+59
	Replay is capable of recording normal BH events, but sometimes there are single use callbacks scheduled with aio_bh_schedule_oneshot function. This patch enables recording and replaying such callbacks. Block layer uses these events for calling the completion function. Replaying these calls makes the execution deterministic. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	replay: finish record/replay before closing the disks	Pavel Dovgalyuk	2	-0/+3
	After recent updates block devices cannot be closed on qemu exit. This happens due to the block request polling when replay is not finished. Therefore now we stop execution recording before closing the block devices. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	replay: don't drain/flush bdrv queue while RR is working	Pavel Dovgalyuk	2	-2/+28
	In record/replay mode bdrv queue is controlled by replay mechanism. It does not allow saving or loading the snapshots when bdrv queue is not empty. Stopping the VM is not blocked by nonempty queue, but flushing the queue is still impossible there, because it may cause deadlocks in replay mode. This patch disables bdrv_drain_all and bdrv_flush_all in record/replay mode. Stopping the machine when the IO requests are not finished is needed for the debugging. E.g., breakpoint may be set at the specified step, and forcing the IO requests to finish may break the determinism of the execution. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	replay: update docs for record/replay with block devices	Pavel Dovgalyuk	1	-3/+9
	This patch updates the description of the command lines for using record/replay with attached block devices. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	replay: disable default snapshot for record/replay	Pavel Dovgalyuk	1	-2/+8
	This patch disables setting '-snapshot' option on by default in record/replay mode. This is needed for creating vmstates in record and replay modes. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	block: implement bdrv_snapshot_goto for blkreplay	Pavel Dovgalyuk	1	-0/+8
	This patch enables making snapshots with blkreplay used in block devices. This function is required to make bdrv_snapshot_goto without calling .bdrv_open which is not implemented. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	block/vhdx: add check for truncated image files	Peter Lieven	1	-17/+103
	qemu is currently not able to detect truncated vhdx image files. Add a basic check if all allocated blocks are reachable at open and report all errors during bdrv_co_check. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-14	test-bdrv-drain: fix iothread_join() hang	Stefan Hajnoczi	1	-2/+8
	tests/test-bdrv-drain can hang in tests/iothread.c:iothread_run(): while (!atomic_read(&iothread->stopping)) { aio_poll(iothread->ctx, true); } The iothread_join() function works as follows: void iothread_join(IOThread *iothread) { iothread->stopping = true; aio_notify(iothread->ctx); qemu_thread_join(&iothread->thread); If iothread_run() checks iothread->stopping before the iothread_join() thread sets stopping to true, then aio_notify() may be optimized away and iothread_run() hangs forever in aio_poll(). The correct way to change iothread->stopping is from a BH that executes within iothread_run(). This ensures that iothread->stopping is checked after we set it to true. This was already fixed for ./iothread.c (note this is a different source file!) by commit 2362a28ea11c145e1a13ae79342d76dc118a72a6 ("iothread: fix iothread_stop() race condition"), but not for tests/iothread.c. Fixes: 0c330a734b51c177ab8488932ac3b0c4d63a718a ("aio: introduce aio_co_schedule and aio_co_wake") Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20191003100103.331-1-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
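The fix pattern, setting the stop flag from inside the loop thread rather than from the joining thread, can be sketched in Python. `ToyIOThread` is a made-up class, and Python's `queue.Queue` does not suffer from the optimized-away notification that makes the race possible in QEMU, so this shows only the shape of the solution and does not reproduce the bug.

```python
import queue
import threading


class ToyIOThread:
    """Event-loop thread (hypothetical model, not QEMU's IOThread)."""

    def __init__(self):
        self.stopping = False
        self.events = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while not self.stopping:
            callback = self.events.get()   # blocks, like a blocking poll
            callback()                     # runs inside the loop thread

    def _stop_in_loop(self):
        # Executed by the loop thread itself, so the flag is always
        # re-checked after it has been set.
        self.stopping = True

    def join(self):
        # Schedule the flag change in the loop instead of writing it here.
        self.events.put(self._stop_in_loop)
        self.thread.join()


worker = ToyIOThread()
worker.join()
```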
2019-10-12	Update OpenBIOS images to f28e16f9 built from submodule.	Mark Cave-Ayland	4	-0/+0
	Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2019-10-11	migration: Support gtree migration	Eric Auger	4	-0/+618
	Introduce support for GTree migration. A custom save/restore is implemented. Each item is made of a key and a data. If the key is a pointer to an object, 2 VMSDs are passed into the GTree VMStateField. When putting the items, the tree is traversed in sorted order by g_tree_foreach. On the get() path, gtrees must be allocated using the proper key compare, key destroy and value destroy. This must be handled beforehand, for example in a pre_load method. Tests are added to test save/dump of structs containing gtrees including the virtio-iommu domain/mappings scenario. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20191011121724.433-1-eric.auger@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> uintptr_t fixup for test on 32bit
2019-10-11	migration/multifd: pages->used would be cleared when attach to multifd_send_state	Wei Yang	1	-1/+0
	When we find an available channel in multifd_send_pages(), its pages->used is cleared and it is then attached to multifd_send_state. It is not necessary to clear it twice. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-5-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/multifd: initialize packet->magic/version once at setup stage	Wei Yang	1	-2/+2
	MultiFDPacket_t's magic and version fields never change during migration, so initialize these two fields once, in the setup stage. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-4-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/multifd: use pages->allocated instead of the static max	Wei Yang	1	-2/+1
	multifd_send_fill_packet() prepares metadata for the pages that follow. It is more appropriate to fill in pages->allocated instead of the static max value, especially since we want to support flexible packet sizes. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-3-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()	Wei Yang	1	-1/+1
	Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-2-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING	Wei Yang	1	-1/+2
	Currently, we set PostcopyState blindly to RUNNING, even if we found that the previous state is not LISTENING. This leads to a corner case. First let's look at the code flow: qemu_loadvm_state_main() ret = loadvm_process_command() loadvm_postcopy_handle_run() return -1; if (ret < 0) { if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING) ... } From the above snippet, the corner case is that loadvm_postcopy_handle_run() always sets the state to RUNNING and only then checks the previous state. If the previous state is not LISTENING, it returns -1, but at this moment PostcopyState has already been set to RUNNING. Then ret is checked in qemu_loadvm_state_main(); when it is -1, PostcopyState is checked. The current logic would pause postcopy and retry if PostcopyState is RUNNING. This is not what we expect, because postcopy is not active yet. This patch makes sure the state is set to RUNNING only if the previous state is LISTENING, by checking the state first. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Suggested by: Peter Xu <peterx@redhat.com> Message-Id: <20191010011316.31363-3-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
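The difference between "set, then check" and "check, then set" is easy to see in a toy state machine. The Python below is a sketch under assumptions: `PostcopyState` reuses the state names from this log, but `Incoming` and `handle_run` are hypothetical stand-ins, not the functions in QEMU's migration code.

```python
from enum import Enum, auto


class PostcopyState(Enum):
    ADVISE = auto()
    DISCARD = auto()
    LISTENING = auto()
    RUNNING = auto()
    END = auto()


class Incoming:
    """Holds the incoming side's state (hypothetical container)."""

    def __init__(self, state):
        self.state = state


def handle_run(incoming):
    """Move to RUNNING only from LISTENING; otherwise fail without side effects."""
    if incoming.state is not PostcopyState.LISTENING:
        return -1                       # state is left untouched on failure
    incoming.state = PostcopyState.RUNNING
    return 0


bad = Incoming(PostcopyState.ADVISE)
bad_ret = handle_run(bad)

good = Incoming(PostcopyState.LISTENING)
good_ret = handle_run(good)
```

Because the failing call leaves the state alone, a caller that inspects the state after an error no longer mistakes an inactive postcopy for a running one.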
2019-10-11	migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup	Wei Yang	3	-4/+4
	Functions postcopy_ram_incoming_setup and postcopy_ram_incoming_cleanup are a pair. Rename to make this clear to readers. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191010011316.31363-2-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/postcopy: postpone setting PostcopyState to END	Wei Yang	2	-2/+2
	There are two places that call postcopy_ram_incoming_cleanup(): postcopy_ram_listen_thread, on migration success, and loadvm_postcopy_handle_listen, on setup failure. On success, the vm will never accept another migration. On failure, PostcopyState transitions from LISTENING to END and would be checked in qemu_loadvm_state_main(). If PostcopyState is RUNNING, migration would be paused and retried. Currently PostcopyState is set to END in function postcopy_ram_incoming_cleanup(). With the above analysis, we can take this step out and postpone it until the end of the listen thread, to indicate that the listen thread is done. This is a preparation patch for later cleanup. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up in merge to the 1 parameter postcopy_state_set
2019-10-11	migration/postcopy: mis->have_listen_thread check will never be touched	Wei Yang	1	-5/+0
	If mis->have_listen_thread is true, the current PostcopyState must be LISTENING or RUNNING. However, the check at the beginning of the function makes sure the state transition happens only when the previous PostcopyState is ADVISE or DISCARD. This means the check can never be reached. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: report SaveStateEntry id and name on failure	Wei Yang	1	-0/+2
	This provides helpful information on which entry failed. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-5-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: pass in_postcopy instead of check state again	Wei Yang	1	-2/+1
	Not necessary to do the check again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/postcopy: fix typo in mark_postcopy_blocktime_begin's comment	Wei Yang	1	-3/+5
	Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-3-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/postcopy: map large zero page in postcopy_ram_incoming_setup()	Wei Yang	1	-17/+17
	postcopy_ram_incoming_setup() and postcopy_ram_incoming_cleanup() are counterparts. It is reasonable to map/unmap the large zero page in these two functions respectively. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005135021.21721-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration/postcopy: allocate tmp_page in setup stage	Wei Yang	3	-38/+11
	During migration, a tmp page is allocated so that we can place a whole host page during postcopy. Currently the page is allocated during the load stage, which is a little late. More importantly, if the allocation fails, the error is not checked properly: even if the page is NULL, we would still use it. This patch moves the allocation to the setup stage; if it fails, an error message is printed and the caller is notified. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: Don't try and recover return path in non-postcopy	Dr. David Alan Gilbert	1	-1/+1
	In normal precopy we can't do reconnection recovery - but we also don't need to, since you can just rerun migration. At the moment if the 'return-path' capability is on, we use the return path in precopy to give a positive 'OK' to the end of migration; however if migration fails then we fall into the postcopy recovery path and hang. This fixes it by only running the return path in the postcopy case. Reported-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	rcu: Use automatic rc_read unlock in core memory/exec code	Dr. David Alan Gilbert	3	-151/+118
	Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-6-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: Use automatic rcu_read unlock in rdma.c	Dr. David Alan Gilbert	1	-46/+11
	Use the automatic read unlocker in migration/rdma.c. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-5-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: Use automatic rcu_read unlock in ram.c	Dr. David Alan Gilbert	1	-138/+121
	Use the automatic read unlocker in migration/ram.c Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-4-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	migration: Fix missing rcu_read_unlock	Dr. David Alan Gilbert	1	-14/+13
	Use the automatic rcu_read unlocker to fix a missing unlock. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11	rcu: Add automatically released rcu_read_lock variants	Dr. David Alan Gilbert	2	-0/+41
	RCU_READ_LOCK_GUARD() takes the rcu_read_lock and then uses glib's g_auto infrastructure (and thus whatever the compiler's hooks are) to release it on all exits of the block. WITH_RCU_READ_LOCK_GUARD() is similar but is used as a wrapper for the lock, i.e.: WITH_RCU_READ_LOCK_GUARD() { stuff under lock } Note the 'unused' attribute is needed to work around clang bug: https://bugs.llvm.org/show_bug.cgi?id=43482 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-2-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
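The scope-based release that these macros provide has a close analogue in Python's context managers. The sketch below is only an analogy: `ToyReadLock`, `read_lock_guard`, and `lookup` are invented names, and a nesting counter stands in for the real RCU read-side lock. What it shows is that an early return inside the guarded block still releases the lock, which is the class of bug the later "Fix missing rcu_read_unlock" commit addresses.

```python
from contextlib import contextmanager


class ToyReadLock:
    """Counts nested read-side sections (stand-in for an RCU read lock)."""

    def __init__(self):
        self.depth = 0

    def lock(self):
        self.depth += 1

    def unlock(self):
        self.depth -= 1


@contextmanager
def read_lock_guard(lock):
    """Take the lock and release it on every exit from the block."""
    lock.lock()
    try:
        yield
    finally:
        lock.unlock()


def lookup(lock, table, key):
    with read_lock_guard(lock):
        if key not in table:
            return None          # early exit still releases the lock
        return table[key]


lock = ToyReadLock()
missing = lookup(lock, {"a": 1}, "b")
depth_after_miss = lock.depth
found = lookup(lock, {"a": 1}, "a")
```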