| author | Richard Henderson <richard.henderson@linaro.org> | 2024-06-21 11:19:25 -0700 |
|---|---|---|
| committer | Richard Henderson <richard.henderson@linaro.org> | 2024-06-21 11:19:25 -0700 |
| commit | ffeddb979400b1580ad28acbee09b6f971c3912d (patch) | |
| tree | b6e6752ff6c864edd312b9f6c15b05886861a1d0 /docs/devel/migration | |
| parent | 02d9c38236cf8c9826e5c5be61780c4444cb4ae0 (diff) | |
| parent | 04b09de16d78cf2d163ca65d7c6d161bf2baceb6 (diff) | |
Merge tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu into staging
Migration pull request

- Fabiano's fix for fdset + file migration truncating the migration file
- Fabiano's fdset + direct-io support for mapped-ram
- Peter's various cleanups (multifd sync, thread names, migration states, tests)
- Peter's new migration state postcopy-recover-setup
- Philippe's unused vmstate macro cleanup

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZ1vIsQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnVZTEACdFIsQ/PJw2C9eeLNor5B5MNSEqUjxX0KN
# 6s/uTkJ/dcv+2PI92SzRCZ1dpR5e9AyjTFYbLc9tPRBIROEhlUaoc84iyEy0jCFU
# eJ65/RQbH5QHRpOZwbN5RmGwnapfOWHGTn3bpdrmSQTOAy8R2TPGY4SVYR+gamTn
# bAv1cAsrOOBUfCi8aqvSlmvuliOW0lzJdF4XHa3mAaigLoF14JdwUZdyIMP1mLDp
# /fllbHCKCvJ1vprE9hQmptBR9PzveJZOZamIVt96djJr5+C869+9PMCn3a5vxqNW
# b+/LhOZjac37Ecg5kgbq+cO1E4EXKC3zWOmDTw8kHUwp9oYNi1upwLdpHbAAZaQD
# /JmHKsExx9QuV8mrVyGBXMI92E6RrT54b1Bjcuo63gAP8p9JRRxGT22U3LghNbTm
# 1XcGPR3rswjT1yTgE6qAqAIMR+7X5MrJVWop9ub/lF5DQ1VYIwmlKSNdwDHFDhRq
# 0F1k2+EksNpcZ0BH2+3iFml7qKHLVupLQKTWcLdrlnQnTfSG3+yW7eyA5Mte79Qp
# nJPcHt8qBqUVQ9Uf/4490TM4Lrp+T+m16exIi0tISLaDXSVkFJnlowipSm+tQ7U3
# Sm68JWdWWEsXZVaMqJeBE8nA/hCoQDpo4hVdwftStI+NayXbRX/EgvPqrNAvwh+c
# i4AdHdn6hQ==
# =ZX0p
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 21 Jun 2024 10:46:51 AM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D

* tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu: (28 commits)
  migration: Remove unused VMSTATE_ARRAY_TEST() macro
  tests/migration-tests: Cover postcopy failure on reconnect
  tests/migration-tests: Verify postcopy-recover-setup status
  tests/migration-tests: migration_event_wait()
  tests/migration-tests: Always enable migration events
  tests/migration-tests: Drop most WIN32 ifdefs for postcopy failure tests
  migration/docs: Update postcopy recover session for SETUP phase
  migration/postcopy: Add postcopy-recover-setup phase
  migration: Cleanup incoming migration setup state change
  migration: Use MigrationStatus instead of int
  migration: Rename thread debug names
  migration/multifd: Avoid the final FLUSH in complete()
  tests/qtest/migration: Add a test for mapped-ram with passing of fds
  migration: Add documentation for fdset with multifd + file
  monitor: fdset: Match against O_DIRECT
  tests/qtest/migration: Add tests for file migration with direct-io
  migration/multifd: Add direct-io support
  migration: Add direct-io parameter
  io: Stop using qemu_open_old in channel-file
  monitor: Report errors from monitor_fdset_dup_fd_add
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Diffstat (limited to 'docs/devel/migration')
| -rw-r--r-- | docs/devel/migration/main.rst | 24 |
| -rw-r--r-- | docs/devel/migration/mapped-ram.rst | 6 |
| -rw-r--r-- | docs/devel/migration/postcopy.rst | 31 |
3 files changed, 40 insertions, 21 deletions
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 495cdcb112..784c899dca 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -47,11 +47,25 @@ over any transport.
   QEMU interference. Note that QEMU does not flush cached file
   data/metadata at the end of migration.
 
-In addition, support is included for migration using RDMA, which
-transports the page data using ``RDMA``, where the hardware takes care of
-transporting the pages, and the load on the CPU is much lower. While the
-internals of RDMA migration are a bit different, this isn't really visible
-outside the RAM migration code.
+  The file migration also supports using a file that has already been
+  opened. A set of file descriptors is passed to QEMU via an "fdset"
+  (see add-fd QMP command documentation). This method allows a
+  management application to have control over the migration file
+  opening operation. There are, however, strict requirements to this
+  interface if the multifd capability is enabled:
+
+    - the fdset must contain two file descriptors that are not
+      duplicates between themselves;
+    - if the direct-io capability is to be used, exactly one of the
+      file descriptors must have the O_DIRECT flag set;
+    - the file must be opened with WRONLY on the migration source side
+      and RDONLY on the migration destination side.
+
+- rdma migration: support is included for migration using RDMA, which
+  transports the page data using ``RDMA``, where the hardware takes
+  care of transporting the pages, and the load on the CPU is much
+  lower. While the internals of RDMA migration are a bit different,
+  this isn't really visible outside the RAM migration code.
 
 All these migration protocols use the same infrastructure to
 save/restore state devices. This infrastructure is shared with the
diff --git a/docs/devel/migration/mapped-ram.rst b/docs/devel/migration/mapped-ram.rst
index fa4cefd9fc..d352b546e9 100644
--- a/docs/devel/migration/mapped-ram.rst
+++ b/docs/devel/migration/mapped-ram.rst
@@ -16,7 +16,7 @@ location in the file, rather than constantly being added to a
 sequential stream. Having the pages at fixed offsets also allows the
 usage of O_DIRECT for save/restore of the migration stream as the
 pages are ensured to be written respecting O_DIRECT alignment
-restrictions (direct-io support not yet implemented).
+restrictions.
 
 Usage
 -----
@@ -35,6 +35,10 @@ Use a ``file:`` URL for migration:
 Mapped-ram migration is best done non-live, i.e. by stopping the VM on
 the source side before migrating.
 
+For best performance enable the ``direct-io`` parameter as well:
+
+    ``migrate_set_parameter direct-io on``
+
 Use-cases
 ---------
 
diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
index 6c51e96d79..82e7a848c6 100644
--- a/docs/devel/migration/postcopy.rst
+++ b/docs/devel/migration/postcopy.rst
@@ -99,17 +99,6 @@ ADVISE->DISCARD->LISTEN->RUNNING->END
     (although it can't do the cleanup it would do as it
     finishes a normal migration).
 
- - Paused
-
-    Postcopy can run into a paused state (normally on both sides when
-    happens), where all threads will be temporarily halted mostly due to
-    network errors. When reaching paused state, migration will make sure
-    the qemu binary on both sides maintain the data without corrupting
-    the VM. To continue the migration, the admin needs to fix the
-    migration channel using the QMP command 'migrate-recover' on the
-    destination node, then resume the migration using QMP command 'migrate'
-    again on source node, with resume=true flag set.
-
  - End
 
     The listen thread can now quit, and perform the cleanup of migration
@@ -221,7 +210,8 @@ paused postcopy migration.
 
 The recovery phase normally contains a few steps:
 
-  - When network issue occurs, both QEMU will go into PAUSED state
+  - When network issue occurs, both QEMU will go into **POSTCOPY_PAUSED**
+    migration state.
 
   - When the network is recovered (or a new network is provided), the admin
     can setup the new channel for migration using QMP command
@@ -229,9 +219,20 @@ The recovery phase normally contains a few steps:
 
   - On source host, the admin can continue the interrupted postcopy
     migration using QMP command 'migrate' with resume=true flag set.
-
-  - After the connection is re-established, QEMU will continue the postcopy
-    migration on both sides.
+    Source QEMU will go into **POSTCOPY_RECOVER_SETUP** state trying to
+    re-establish the channels.
+
+  - When both sides of QEMU successfully reconnect using a new or fixed up
+    channel, they will go into **POSTCOPY_RECOVER** state, some handshake
+    procedure will be needed to properly synchronize the VM states between
+    the two QEMUs to continue the postcopy migration. For example, there
+    can be pages sent right during the window when the network is
+    interrupted, then the handshake will guarantee pages lost in-flight
+    will be resent again.
+
+  - After a proper handshake synchronization, QEMU will continue the
+    postcopy migration on both sides and go back to **POSTCOPY_ACTIVE**
+    state. Postcopy migration will continue.
 
 During a paused postcopy migration, the VM can logically still continue
 running, and it will not be impacted from any page access to pages that
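
Reader's note on the postcopy.rst hunks above: the recovery steps they document can be read as a small state machine on the source side. The sketch below is only an illustration of that reading, not QEMU code. The four state names are taken from the documentation text; the event names (`network_failure`, `migrate_resume`, `channels_reconnected`, `handshake_done`) and the helpers `step` and `run` are hypothetical labels invented for this example, and failure paths such as a reconnect attempt that fails are deliberately left out.

```python
from enum import Enum, auto


class State(Enum):
    # State names as they appear in the updated postcopy.rst text.
    POSTCOPY_ACTIVE = auto()
    POSTCOPY_PAUSED = auto()
    POSTCOPY_RECOVER_SETUP = auto()
    POSTCOPY_RECOVER = auto()


# Happy-path transitions only, as read from the documented recovery steps.
# The event strings are hypothetical labels, not QMP commands or QEMU events.
NEXT_STATE = {
    (State.POSTCOPY_ACTIVE, "network_failure"): State.POSTCOPY_PAUSED,
    (State.POSTCOPY_PAUSED, "migrate_resume"): State.POSTCOPY_RECOVER_SETUP,
    (State.POSTCOPY_RECOVER_SETUP, "channels_reconnected"): State.POSTCOPY_RECOVER,
    (State.POSTCOPY_RECOVER, "handshake_done"): State.POSTCOPY_ACTIVE,
}


def step(state, event):
    """Return the next state, or raise ValueError if the table has no entry."""
    try:
        return NEXT_STATE[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None


def run(events, start=State.POSTCOPY_ACTIVE):
    """Apply events in order and return the list of states visited."""
    trace = [start]
    for event in events:
        trace.append(step(trace[-1], event))
    return trace


if __name__ == "__main__":
    path = run(["network_failure", "migrate_resume",
                "channels_reconnected", "handshake_done"])
    print(" -> ".join(s.name for s in path))
```

Keeping the transitions in a lookup table makes it easy to see that the new POSTCOPY_RECOVER_SETUP state sits between POSTCOPY_PAUSED and POSTCOPY_RECOVER, which is the change this pull request documents.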