summary refs log tree commit diff stats
path: root/hw/vfio/migration.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* vfio/migration: Add x-migration-load-config-after-iter VFIO propertyMaciej S. Szmigiero2025-07-151-1/+9
| | | | | | | | | | | | | | | | This property allows configuring whether to start the config load only after all iterables were loaded, during non-iterables loading phase. Such interlocking is required for ARM64 due to this platform VFIO dependency on interrupt controller being loaded first. The property defaults to AUTO, which means ON for ARM, OFF for other platforms. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/0e03c60dbc91f9a9ba2516929574df605b7dfcb4.1752589295.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* migration: Rename save_live_complete_precopy_thread to ↵Juraj Marcin2025-07-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | save_complete_precopy_thread Recent patch [1] renames the save_live_complete_precopy handler to save_complete, as the machine is not live in most cases when this handler is executed. The same is true also for save_live_complete_precopy_thread, therefore this patch removes the "live" keyword from the handler itself and related types to keep the naming unified. In contrast to save_complete, this handler is only executed at the end of precopy, therefore the "precopy" keyword is retained. [1]: https://lore.kernel.org/all/20250613140801.474264-7-peterx@redhat.com/ Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Juraj Marcin <jmarcin@redhat.com> Link: https://lore.kernel.org/r/20250626085235.294690-1-jmarcin@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
* migration: Rename save_live_complete_precopy to save_completePeter Xu2025-07-111-1/+1
| | | | | | | | | | | | | | | | | | Now after merging the precopy and postcopy version of complete() hook, rename the precopy version from save_live_complete_precopy() to save_complete(). Dropping the "live" when at it, because it's in most cases not live when happening (in precopy). No functional change intended. Reviewed-by: Juraj Marcin <jmarcin@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20250613140801.474264-7-peterx@redhat.com [peterx: squash the fixup that covers a few more doc spots, per Juraj] Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
* system/runstate: add VM state change cb with return valueHaoqian He2025-05-141-1/+1
| | | | | | | | | | | | | | | | | | | | | This patch adds the new VM state change cb type `VMChangeStateHandlerWithRet`, which has return value for `VMChangeStateEntry`. Thus, we can register a new VM state change cb with return value for device. Note that `VMChangeStateHandler` and `VMChangeStateHandlerWithRet` are mutually exclusive and cannot be provided at the same time. This patch is the pre patch for 'vhost-user: return failure if backend crashes when live migration', which makes the live migration aware of the loss of connection with the vhost-user backend and aborts the live migration. Virtio device will use VMChangeStateHandlerWithRet. Signed-off-by: Haoqian He <haoqian.he@smartx.com> Message-Id: <20250416024729.3289157-2-haoqian.he@smartx.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vfio: Rename vfio-common.h to vfio-device.hCédric Le Goater2025-04-251-1/+1
| | | | | | | | | | | "hw/vfio/vfio-common.h" has been emptied of most of its declarations by the previous changes and the only declarations left are related to VFIODevice. Rename it to "hw/vfio/vfio-device.h" and make the necessary adjustments. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-36-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Move vfio_device_state_is_running/precopy() into migration.cCédric Le Goater2025-04-251-0/+16
| | | | | | | | | | These routines are migration related. Move their declaration and implementation under the migration files. Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-8-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce a new header file for internal migration servicesCédric Le Goater2025-04-251-0/+1
| | | | | | | | | | | | | | Gather all VFIO migration related declarations into "vfio-migration-internal.h" to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". Cc: Kirti Wankhede <kwankhede@nvidia.com> Cc: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-7-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Make vfio_viommu_preset() staticCédric Le Goater2025-04-251-0/+5
| | | | | | | | | | This routine is only used in file "migration.c". Move it there. Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-6-clg@redhat.com Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-6-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Make vfio_un/block_multiple_devices_migration() staticCédric Le Goater2025-04-251-0/+59
| | | | | | | | | | | Both of these routines are only used in file "migration.c". Move them there. Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-5-clg@redhat.com Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-5-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Introduce a new header file for external migration servicesCédric Le Goater2025-04-251-5/+6
| | | | | | | | | | | | | | | | | | | | | | | The migration core subsystem makes use of the VFIO migration API to collect statistics on the number of bytes transferred. These services are declared in "hw/vfio/vfio-common.h" which also contains VFIO internal declarations. Move the migration declarations into a new header file "hw/vfio/vfio-migration.h" to reduce the exposure of VFIO internals. While at it, use a 'vfio_migration_' prefix for these services. To be noted, vfio_migration_add_bytes_transferred() is a VFIO migration internal service which we will be moved in the subsequent patches. Cc: Kirti Wankhede <kwankhede@nvidia.com> Cc: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-4-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Rename vfio_reset_bytes_transferred()Cédric Le Goater2025-04-251-1/+1
| | | | | | | | | | | | Enforce a 'vfio_mig_' prefix for the VFIO migration API to better reflect the namespace these routines belong to. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-3-clg@redhat.com Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-3-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Move vfio_mig_active() into migration.cCédric Le Goater2025-04-251-0/+16
| | | | | | | | | | | vfio_mig_active() is part of the VFIO migration API. Move the definitions where VFIO migration is implemented. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-2-clg@redhat.com Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-2-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* hw/vfio: Compile some common objects oncePhilippe Mathieu-Daudé2025-03-111-1/+0
| | | | | | | | | | | | | | | | | | | | Some files don't rely on any target-specific knowledge and can be compiled once: - helpers.c - container-base.c - migration.c (removing unnecessary "exec/ram_addr.h") - migration-multifd.c - cpr.c Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20250308230917.18907-4-philmd@linaro.org> Link: https://lore.kernel.org/qemu-devel/20250311085743.21724-6-philmd@linaro.org Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Multifd device state transfer support - send sideMaciej S. Szmigiero2025-03-061-6/+16
| | | | | | | | | | | | | | | | Implement the multifd device state transfer via additional per-device thread inside save_live_complete_precopy_thread handler. Switch between doing the data transfer in the new handler and doing it in the old save_state handler depending if VFIO multifd transfer is enabled or not. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/4d727e2e0435e0022d50004e474077632830e08d.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Updated save_live_complete_precopy* documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Multifd device state transfer support - config loading supportMaciej S. Szmigiero2025-03-061-1/+8
| | | | | | | | | | | | Load device config received via multifd using the existing machinery behind vfio_load_device_config_state(). Also, make sure to process the relevant main migration channel flags. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/5dbd3f3703ec1097da2cf82a7262233452146fee.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Multifd device state transfer support - load threadMaciej S. Szmigiero2025-03-061-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a thread which loads the VFIO device state buffers that were received via multifd. Each VFIO device that has multifd device state transfer enabled has one such thread, which is created using migration core API qemu_loadvm_start_load_thread(). Since it's important to finish loading device state transferred via the main migration channel (via save_live_iterate SaveVMHandler) before starting loading the data asynchronously transferred via multifd the thread doing the actual loading of the multifd transferred data is only started from switchover_start SaveVMHandler. switchover_start handler is called when MIG_CMD_SWITCHOVER_START sub-command of QEMU_VM_COMMAND is received via the main migration channel. This sub-command is only sent after all save_live_iterate data have already been posted so it is safe to commence loading of the multifd-transferred device state upon receiving it - loading of save_live_iterate data happens synchronously in the main migration thread (much like the processing of MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is processed all the proceeding data must have already been loaded. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/9abe612d775aaf42e31646796acd2363c723a57a.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Added switchover_start documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Multifd device state transfer support - received buffers queuingMaciej S. Szmigiero2025-03-061-0/+4
| | | | | | | | | | | | | | | | | | | | The multifd received data needs to be reassembled since device state packets sent via different multifd channels can arrive out-of-order. Therefore, each VFIO device state packet carries a header indicating its position in the stream. The raw device state data is saved into a VFIOStateBuffer for later in-order loading into the device. The last such VFIO device state packet should have VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/e3bff515a8d61c582b94b409eb12a45b1a143a69.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Added load_state_buffer documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Setup and cleanup multifd transfer in these general methodsMaciej S. Szmigiero2025-03-061-2/+22
| | | | | | | | | | Wire VFIO multifd transfer specific setup and cleanup functions into general VFIO load/save setup and cleanup methods. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/b1f864a65fafd4fdab1f89230df52e46ae41f2ac.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Multifd device state transfer support - basic typesMaciej S. Szmigiero2025-03-061-0/+1
| | | | | | | | | | | | | | Add basic types and flags used by VFIO multifd device state transfer support. Since we'll be introducing a lot of multifd transfer specific code, add a new file migration-multifd.c to home it, wired into main VFIO migration code (migration.c) via migration-multifd.h header file. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/4eedd529e6617f80f3d6a66d7268a0db2bc173fa.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Move migration channel flags to vfio-common.h header fileMaciej S. Szmigiero2025-03-061-17/+0
| | | | | | | | | | This way they can also be referenced in other translation units than migration.c. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/26a940f6b22c1b685818251b7a3ddbbca601b1d6.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Add vfio_add_bytes_transferred()Maciej S. Szmigiero2025-03-061-1/+6
| | | | | | | | | | This way bytes_transferred can also be incremented in other translation units than migration.c. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/d1fbc27ac2417b49892f354ba20f6c6b3f7209f8.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Convert bytes_transferred counter to atomicMaciej S. Szmigiero2025-03-061-4/+4
| | | | | | | | | | | | | | | | So it can be safety accessed from multiple threads. This variable type needs to be changed to unsigned long since 32-bit host platforms lack the necessary addition atomics on 64-bit variables. Using 32-bit counters on 32-bit host platforms should not be a problem in practice since they can't realistically address more memory anyway. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/dc391771d2d9ad0f311994f0cb9e666da564aeaf.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Add load_device_config_state_start trace eventMaciej S. Szmigiero2025-03-061-1/+3
| | | | | | | | | | | | | | And rename existing load_device_config_state trace event to load_device_config_state_end for consistency since it is triggered at the end of loading of the VFIO device config state. This way both the start and end points of particular device config loading operation (a long, BQL-serialized operation) are known. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/1b6c5a2097e64c272eb7e53f9e4cca4b79581b38.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
* include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-1/+1
| | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* vfio/migration: Add vfio_save_block_precopy_empty_hit trace eventMaciej S. Szmigiero2024-11-051-0/+8
| | | | | | | This way it is clearly known when there's no more data to send for that device. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
* vfio/migration: Add save_{iterate, complete_precopy}_start trace eventsMaciej S. Szmigiero2024-11-051-0/+9
| | | | | | | This way both the start and end points of migrating a particular VFIO device are known. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
* vfio/migration: Report only stop-copy size in vfio_state_pending_exact()Avihai Horon2024-10-231-3/+0
| | | | | | | | | | | | | | | | | | | | | | vfio_state_pending_exact() is used to update migration core how much device data is left for the device migration. Currently, the sum of pre-copy and stop-copy sizes of the VFIO device are reported. The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl, which returns the amount of device data available to be transferred while the device is in the PRE_COPY states. The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl, which returns the total amount of device data left to be transferred in order to complete the device migration. According to the above, current implementation is wrong -- it reports extra overlapping data because pre-copy size is already contained in stop-copy size. Fix it by reporting only stop-copy size. Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
* qapi/vfio: Rename VfioMigrationState to Qapi*, and drop prefixMarkus Armbruster2024-09-101-1/+1
| | | | | | | | | | | | | | | | | | | QAPI's 'prefix' feature can make the connection between enumeration type and its constants less than obvious. It's best used with restraint. VfioMigrationState has a 'prefix' that overrides the generated enumeration constants' prefix to QAPI_VFIO_MIGRATION_STATE. We could simply drop 'prefix', but then the enumeration constants would look as if they came from kernel header linux/vfio.h. Rename the type to QapiVfioMigrationState instead, so that 'prefix' is not needed. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240904111836.3273842-20-armbru@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
* vfio/common: Allow disabling device dirty page trackingJoao Martins2024-07-231-1/+3
| | | | | | | | | | | | | | The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole tracking of VF pre-copy phase of dirty page tracking, though it means that it will only be used at the start of the switchover phase. Add an option that disables the VF dirty page tracking, and fall back into container-based dirty page tracking. This also allows to use IOMMU dirty tracking even on VFs with their own dirty tracker scheme. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
* vfio/migration: Don't block migration device dirty tracking is unsupportedJoao Martins2024-07-231-5/+5
| | | | | | | | | | | | | | | | | | | | | | | By default VFIO migration is set to auto, which will support live migration if the migration capability is set *and* also dirty page tracking is supported. For testing purposes one can force enable without dirty page tracking via enable-migration=on, but that option is generally left for testing purposes. So starting with IOMMU dirty tracking it can use to accommodate the lack of VF dirty page tracking allowing us to minimize the VF requirements for migration and thus enabling migration by default for those too. While at it change the error messages to mention IOMMU dirty tracking as well. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> [ clg: - spelling in commit log ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Enhance VFIO migration state tracingAvihai Horon2024-05-161-2/+6
| | | | | | | | | | | | | | | | Move trace_vfio_migration_set_state() to the top of the function, add recover_state to it, and add a new trace event to vfio_migration_set_device_state(). This improves tracing of device state changes as state changes are now also logged when vfio_migration_set_state() fails (covering recover state and device reset transitions) and in no-op state transitions to the same state. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Don't emit STOP_COPY VFIO migration QAPI event twiceAvihai Horon2024-05-161-0/+4
| | | | | | | | | | | | | | | | | | | | | When migrating a VFIO device that supports pre-copy, it is transitioned to STOP_COPY twice: once in vfio_vmstate_change() and second time in vfio_save_complete_precopy(). The second transition is harmless, as it's a STOP_COPY->STOP_COPY no-op transition. However, with the newly added VFIO migration QAPI event, the STOP_COPY event is undesirably emitted twice. Prevent this by returning early in vfio_migration_set_state() if new_state is the same as current device state. Note that the STOP_COPY transition in vfio_save_complete_precopy() is essential for VFIO devices that don't support pre-copy, for migrating an already stopped guest and for snapshots. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Emit VFIO migration QAPI eventAvihai Horon2024-05-161-3/+56
| | | | | | | | | | | | | Emit VFIO migration QAPI event when a VFIO device changes its migration state. This can be used by management applications to get updates on the current state of the VFIO device for their own purposes. A new per VFIO device capability, "migration-events", is added so events can be enabled only for the required devices. It is disabled by default. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio: Also trace event failures in vfio_save_complete_precopy()Cédric Le Goater2024-05-161-3/+0
| | | | | | | | | vfio_save_complete_precopy() currently returns before doing the trace event. Change that. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Add Error** argument to .vfio_save_config() handlerCédric Le Goater2024-05-161-7/+18
| | | | | | | | | | | Use vmstate_save_state_with_err() to improve error reporting in the callers and store a reported error under the migration stream. Add documentation while at it. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* vfio/migration: Add an Error** argument to vfio_migration_set_state()Cédric Le Goater2024-05-161-29/+52
| | | | | | | | | | | | | | Add an Error** argument to vfio_migration_set_state() and adjust callers, including vfio_save_setup(). The error will be propagated up to qemu_savevm_state_setup() where the save_setup() handler is executed. Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to store a reported error under the migration stream if a migration is in progress. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* migration: Extend migration_file_set_error() with Error* argumentCédric Le Goater2024-05-161-2/+2
| | | | | | | | | | | Use it to update the current error of the migration stream if available and if not, simply print out the error. Next changes will update with an error to report. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Acked-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* migration: Add Error** argument to .load_setup() handlerCédric Le Goater2024-04-231-2/+7
| | | | | | | | | | | | This will be useful to report errors at a higher level, mostly in VFIO today. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-9-clg@redhat.com [peterx: drop comment for ERRP_GUARD, per Markus] Signed-off-by: Peter Xu <peterx@redhat.com>
* migration: Add Error** argument to .save_setup() handlerCédric Le Goater2024-04-231-9/+8
| | | | | | | | | | | | | | | | | | | | | | | The purpose is to record a potential error in the migration stream if qemu_savevm_state_setup() fails. Most of the current .save_setup() handlers can be modified to use the Error argument instead of managing their own and calling locally error_report(). Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Cc: John Snow <jsnow@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-8-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
* vfio: Always report an error in vfio_save_setup()Cédric Le Goater2024-04-231-3/+12
| | | | | | | | | | | | This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, vfio_save_setup() always sets a new error. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-3-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
* Merge tag 'migration-20240311-pull-request' of ↵Peter Maydell2024-03-121-14/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.com/peterx/qemu into staging Migration pull request - Avihai's fix to allow vmstate iterators to not starve for VFIO - Maksim's fix on additional check on precopy load error - Fabiano's fix on fdatasync() hang in mapped-ram - Jonathan's fix on vring cached access over MMIO regions - Cedric's cleanup patches 1-4 out of his error report series - Yu's fix for RDMA migration (which used to be broken even for 8.2) - Anthony's small cleanup/fix on err message - Steve's patches on privatize migration.h - Xiang's patchset to enable zero page detections in multifd threads # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv # 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB # =xavb # -----END PGP SIGNATURE----- # gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal] # gpg: aka "Peter Xu <peterx@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits) migration/multifd: Add new migration test cases for legacy zero page checking. migration/multifd: Enable multifd zero page checking by default. migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page. migration/multifd: Implement zero page transmission on the multifd thread. migration/multifd: Add new migration option zero-page-detection. migration/multifd: Allow clearing of the file_bmap from multifd migration/multifd: Allow zero pages in file migration migration: purge MigrationState from public interface migration: delete unused accessors migration: privatize colo interfaces migration: migration_file_set_error migration: migration_is_device migration: migration_thread_is_self migration: export vcpu_dirty_limit_period migration: export migration_is_running migration: export migration_is_active migration: export migration_is_setup_or_active migration: remove migration.h references migration: export fewer options migration: Fix format in error message ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * migration: migration_file_set_errorSteve Sistare2024-03-111-8/+3
| | | | | | | | | | | | | | | | | | Define and export migration_file_set_error to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * migration: export fewer optionsSteve Sistare2024-03-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A small number of migration options are accessed by migration clients, but to see them clients must include all of options.h, which is mostly for migration core code. migrate_mode() in particular will be needed by multiple clients. Refactor the option declarations so clients can see the necessary few via misc.h, which already exports a portion of the client API. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * vfio/migration: Add a note about migration rate limitingAvihai Horon2024-03-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VFIO migration buffer size is currently limited to 1MB. Therefore, there is no need to check if migration rate exceeded, as in the worst case it will exceed by only 1MB. However, if the buffer size is later changed to a bigger value, vfio_save_iterate() should enforce migration rate (similar to migration RAM code). Add a note about this in vfio_save_iterate() to serve as a reminder. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240304105339.20713-4-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>
| * vfio/migration: Refactor vfio_save_state() return valueAvihai Horon2024-03-111-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, vfio_save_state() returns 1 regardless of whether there is more data to send or not. This was done to prevent a fast changing VFIO device from potentially blocking other devices from sending their data, as qemu_savevm_state_iterate() serialized devices. Now that qemu_savevm_state_iterate() no longer serializes devices, there is no need for that. Refactor vfio_save_state() to return 0 if more data is available and 1 if no more data is available. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240304105339.20713-3-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>
* | vfio: allow cpr-reboot migration if suspendedSteve Sistare2024-03-081-1/+1
|/ | | | | | | | | | | | | | Allow cpr-reboot for vfio if the guest is in the suspended runstate. The guest drivers' suspend methods flush outstanding requests and re-initialize the devices, and thus there is no device state to save and restore. The user is responsible for suspending the guest before initiating cpr, such as by issuing guest-suspend-ram to the qemu guest agent. Relax the vfio blocker so it does not apply to cpr, and add a notifier that verifies the guest is suspended. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
* migration: MigrationNotifyFuncSteve Sistare2024-02-281-2/+1
| | | | | | | | | | | Define MigrationNotifyFunc to improve type safety and simplify migration notifiers. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
* migration: MigrationEvent for notifiersSteve Sistare2024-02-281-7/+3
| | | | | | | | | | | | | | Passing MigrationState to notifiers is unsound because they could access unstable migration state internals or even modify the state. Instead, pass the minimal info needed in a new MigrationEvent struct, which could be extended in the future if needed. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
* migration: convert to NotifierWithReturnSteve Sistare2024-02-281-1/+3
| | | | | | | | | | | | | Change all migration notifiers to type NotifierWithReturn, so notifiers can return an error status in a future patch. For now, pass NULL for the notifier error parameter, and do not check the return value. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com [peterx: dropped unexpected update to roms/seabios-hppa] Signed-off-by: Peter Xu <peterx@redhat.com>
* vfio/migration: Add helper function to set state or reset deviceAvihai Horon2024-01-051-24/+17
| | | | | | | | | | | | | There are several places where failure in setting the device state leads to a device reset, which is done by setting ERROR as the recover state. Add a helper function that sets the device state and resets the device in case of failure. This will make the code cleaner and remove duplicate comments. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>