summary refs log tree commit diff stats
path: root/migration (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* migration: Create migrate_downtime_limit() functionJuan Quintela2023-04-273-2/+10
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Make all functions check have the same formatJuan Quintela2023-04-271-114/+39
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Create migrate_params_init() functionJuan Quintela2023-04-273-28/+33
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* multifd: Fix the number of channels readyJuan Quintela2023-04-271-1/+2
| | | | | | | | | We don't wait in the sem when we are doing a sync_main. Make it wait there. To make things clearer, we mark the channel ready at the begining of the thread loop. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration/vmstate-dump: Dump array size too as "num"Peter Xu2023-04-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For VMS_ARRAY typed vmsd fields, also dump the number of entries in the array in -vmstate-dump. Without such information, vmstate static checker can report false negatives of incompatible vmsd on VMS_ARRAY typed fields, when the src/dst do not have the same type of array defined. It's because in the checker we only check against size of fields within a VMSD field. One example: e1000e used to have a field defined as a boolean array with 5 entries, then removed it and replaced it with UNUSED (in 31e3f318c8b535): - VMSTATE_BOOL_ARRAY(core.eitr_intr_pending, E1000EState, - E1000E_MSIX_VEC_NUM), + VMSTATE_UNUSED(E1000E_MSIX_VEC_NUM), It's a legal replacement but vmstate static checker is not happy with it, because it checks only against the "size" field between the two fields (here one is BOOL_ARRAY, the other is UNUSED): For BOOL_ARRAY: { "field": "core.eitr_intr_pending", "version_id": 0, "field_exists": false, "size": 1 }, For UNUSED: { "field": "unused", "version_id": 0, "field_exists": false, "size": 5 }, It's not the script to blame because there's just not enough information dumped to show the total size of the entry for an array. Add it. Note that this will not break old vmstate checker because the field will just be ignored. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Allow postcopy_ram_supported_by_host() to report errPeter Xu2023-04-274-35/+39
| | | | | | | | | | | | | | | | | | | | Instead of print it to STDERR, bring the error upwards so that it can be reported via QMP responses. E.g.: { "execute": "migrate-set-capabilities" , "arguments": { "capabilities": [ { "capability": "postcopy-ram", "state": true } ] } } { "error": { "class": "GenericError", "desc": "Postcopy is not supported: Host backend files need to be TMPFS or HUGETLBFS only" } } Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Move qmp_migrate_set_parameters() to options.cJuan Quintela2023-04-273-420/+429
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_tls() to options.cJuan Quintela2023-04-275-13/+13
| | | | | | | | | | | | Once there, rename it to migrate_tls() and make it return bool for consistency. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> --- Fix typos found by fabiano
* migration: Disable postcopy + multifd migrationLeonardo Bras2023-04-271-0/+5
| | | | | | | | | | | | | | | | | | | | Since the introduction of multifd, it's possible to perform a multifd migration and finish it using postcopy. A bug introduced by yank (fixed on cfc3bcf373) was previously preventing a successful use of this migration scenario, and now thing should be working on most scenarios. But since there is not enough testing/support nor any reported users for this scenario, we should disable this combination before it may cause any problems for users. Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Create migrate_max_bandwidth() functionJuan Quintela2023-04-243-69/+81
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Move migrate_postcopy() to options.cJuan Quintela2023-04-244-7/+17
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Create migrate_cpu_throttle_tailslow() functionJuan Quintela2023-04-243-2/+11
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Create migrate_cpu_throttle_increment() functionJuan Quintela2023-04-243-1/+11
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Create migrate_cpu_throttle_initial() to option.cJuan Quintela2023-04-243-1/+11
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Move migrate_announce_params() to option.cJuan Quintela2023-04-242-14/+17
| | | | | | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> --- Fix extra whitespace (fabiano)
* migration: Create migrate_max_cpu_throttle()Juan Quintela2023-04-244-3/+11
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Create migrate_checkpoint_delay()Juan Quintela2023-04-243-3/+12
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Create migrate_throttle_trigger_threshold()Juan Quintela2023-04-243-2/+11
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* migration: Move migrate_use_block_incremental() to option.cJuan Quintela2023-04-245-12/+12
| | | | | | | | To be consistent with every other parameter, rename to migrate_block_incremental(). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Use migrate_max_postcopy_bandwidth()Juan Quintela2023-04-241-1/+1
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move parameters functions to option.cJuan Quintela2023-04-246-102/+108
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_cap_set() to options.cJuan Quintela2023-04-243-20/+22
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move qmp_migrate_set_capabilities() to options.cJuan Quintela2023-04-242-26/+26
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move qmp_query_migrate_capabilities() to options.cJuan Quintela2023-04-242-22/+23
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_caps_check() to options.cJuan Quintela2023-04-243-190/+196
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Create migrate_rdma_pin_all() functionJuan Quintela2023-04-243-3/+11
| | | | | | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> --- Fixed missing space after comma (fabiano)
* migration: Move migrate_use_return() to options.cJuan Quintela2023-04-245-14/+14
| | | | | | | | Once that we are there, we rename the function to migrate_return_path() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_block() to options.cJuan Quintela2023-04-246-13/+13
| | | | | | | | Once that we are there, we rename the function to migrate_block() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_xbzrle() to options.cJuan Quintela2023-04-245-16/+16
| | | | | | | | | Once that we are there, we rename the function to migrate_xbzrle() to be consistent with all other capabilities. We change the type to return bool also for consistency. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_zero_copy_send() to options.cJuan Quintela2023-04-246-22/+16
| | | | | | | | | | | Once that we are there, we rename the function to migrate_zero_copy_send() to be consistent with all other capabilities. We can remove the CONFIG_LINUX guard. We already check that we can't setup this capability in migrate_caps_check(). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_multifd() to options.cJuan Quintela2023-04-247-25/+25
| | | | | | | | Once that we are there, we rename the function to migrate_multifd() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_events() to options.cJuan Quintela2023-04-245-12/+12
| | | | | | | | Once that we are there, we rename the function to migrate_events() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_use_compression() to options.cJuan Quintela2023-04-245-19/+19
| | | | | | | | Once that we are there, we rename the function to migrate_compress() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Move migrate_colo_enabled() to options.cJuan Quintela2023-04-244-12/+12
| | | | | | | | Once that we are there, we rename the function to migrate_colo() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: Create options.cJuan Quintela2023-04-2412-120/+165
| | | | | | | | | | | | | | | | | | | | We move there all capabilities helpers from migration.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Following David advise: - looked through the history, capabilities are newer than 2012, so we can remove that bit of the header. - This part is posterior to Anthony. Original Author is Orit. Once there, I put myself. Peter Xu also did quite a bit of work here. Anyone else wants/needs to be there? I didn't search too hard because nobody asked before to be added. What do you think?
* migration: Create migrate_cap_set()Juan Quintela2023-04-241-18/+16
| | | | | | | | And remove the convoluted use of qmp_migrate_set_capabilities() to enable disable MIGRATION_CAPABILITY_BLOCK. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
* spice: move client_migrate_info command to ui/Juan Quintela2023-04-242-47/+0
| | | | | | | | It has nothing to do with migration, except for the "migrate" in the name of the command. Move it with the rest of the ui commands. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* migration: move migration_global_dump() to migration-hmp-cmds.cJuan Quintela2023-04-242-20/+21
| | | | | | | | | | | | | It is only used there, so we can make it static. Once there, remove spice.h that it is not used. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- fix David Edmonson ui/qemu-spice.h unintended removal
* migration: Minor control flow simplificationEric Blake2023-04-241-3/+2
| | | | | | | | | | No need to declare a temporary variable. Suggested-by: Juan Quintela <quintela@redhat.com> Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better") Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Pass migrate_caps_check() the old and new capsJuan Quintela2023-04-241-49/+31
| | | | | | | | We used to pass the old capabilities array and the new capabilities as a list. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration: rename enabled_capabilities to capabilitiesJuan Quintela2023-04-244-33/+31
| | | | | | | | | It is clear from the context what that means, and such a long name with the extra long names of the capabilities make very difficilut to stay inside the 80 columns limit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* migration/postcopy: Detect file system on dest hostPeter Xu2023-04-241-4/+30
| | | | | | | | | | | | | | | | | | Postcopy requires the memory support userfaultfd to work. Right now we check it but it's a bit too late (when switching to postcopy migration). Do that early right at enabling of postcopy. Note that this is still only a best effort because ramblocks can be dynamically created. We can add check in hostmem creations and fail if postcopy enabled, but maybe that's too aggressive. Still, we have chance to fail the most obvious where we know there's an existing unsupported ramblock. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Handle block device inactivation failures betterEric Blake2023-04-241-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider what happens when performing a migration between two host machines connected to an NFS server serving multiple block devices to the guest, when the NFS server becomes unavailable. The migration attempts to inactivate all block devices on the source (a necessary step before the destination can take over); but if the NFS server is non-responsive, the attempt to inactivate can itself fail. When that happens, the destination fails to get the migrated guest (good, because the source wasn't able to flush everything properly): (qemu) qemu-kvm: load of migration failed: Input/output error at which point, our only hope for the guest is for the source to take back control. With the current code base, the host outputs a message, but then appears to resume: (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1) (src qemu)info status VM status: running but a second migration attempt now asserts: (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. Whether the guest is recoverable on the source after the first failure is debatable, but what we do not want is to have qemu itself fail due to an assertion. It looks like the problem is as follows: In migration.c:migration_completion(), the source sets 'inactivate' to true (since COLO is not enabled), then tries savevm.c:qemu_savevm_state_complete_precopy() with a request to inactivate block devices. In turn, this calls block.c:bdrv_inactivate_all(), which fails when flushing runs up against the non-responsive NFS server. With savevm failing, we are now left in a state where some, but not all, of the block devices have been inactivated; but migration_completion() then jumps to 'fail' rather than 'fail_invalidate' and skips an attempt to reclaim those those disks by calling bdrv_activate_all(). Even if we do attempt to reclaim disks, we aren't taking note of failure there, either. Thus, we have reached a state where the migration engine has forgotten all state about whether a block device is inactive, because we did not set s->block_inactive in enough places; so migration allows the source to reach vm_start() and resume execution, violating the block layer invariant that the guest CPUs should not be restarted while a device is inactive. Note that the code in migration.c:migrate_fd_cancel() will also try to reactivate all block devices if s->block_inactive was set, but because we failed to set that flag after the first failure, the source assumes it has reclaimed all devices, even though it still has remaining inactivated devices and does not try again. Normally, qmp_cont() will also try to reactivate all disks (or correctly fail if the disks are not reclaimable because NFS is not yet back up), but the auto-resumption of the source after a migration failure does not go through qmp_cont(). And because we have left the block layer in an inconsistent state with devices still inactivated, the later migration attempt is hitting the assertion failure. Since it is important to not resume the source with inactive disks, this patch marks s->block_inactive before attempting inactivation, rather than after succeeding, in order to prevent any vm_start() until it has successfully reactivated all devices. See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
* migration: Rename normal to normal_pagesJuan Quintela2023-04-243-7/+7
| | | | | | | | | | Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of full pages transferred. The name "normal" refered to the fact that they were sent without any optimization (compression, xbzrle, zero_page, ...). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Rename duplicate to zero_pagesJuan Quintela2023-04-243-7/+7
| | | | | | | | | | Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of pages composed of the same character, here comes the name "duplicated". But since years ago, it refers to the number of zero_pages. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Make postcopy_requests atomicJuan Quintela2023-04-243-3/+4
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Make dirty_sync_count atomicJuan Quintela2023-04-243-8/+10
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Make downtime_bytes atomicJuan Quintela2023-04-243-3/+3
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Make precopy_bytes atomicJuan Quintela2023-04-243-3/+3
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* migration: Make dirty_sync_missed_zero_copy atomicJuan Quintela2023-04-244-10/+3
| | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>