path: root/python/qemu/utils/qemu_ga_client.py
Columns: date, commit message, author, files changed, lines removed/added
2023-02-16vfio/migration: Remove VFIO migration protocol v1Avihai Horon4-707/+24
Now that v2 protocol implementation has been added, remove the deprecated v1 implementation. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-10-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Implement VFIO migration protocol v2Avihai Horon4-39/+469
Implement the basic mandatory part of VFIO migration protocol v2. This includes all functionality that is necessary to support VFIO_MIGRATION_STOP_COPY part of the v2 protocol. The two protocols, v1 and v2, will co-exist and in the following patches v1 protocol code will be removed. There are several main differences between v1 and v2 protocols: - VFIO device state is now represented as a finite state machine instead of a bitmap. - Migration interface with kernel is now done using VFIO_DEVICE_FEATURE ioctl and normal read() and write() instead of the migration region. - Pre-copy is made optional in v2 protocol. Support for pre-copy will be added later on. Detailed information about VFIO migration protocol v2 and its difference compared to v1 protocol can be found here [1]. [1] https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com>. Link: https://lore.kernel.org/r/20230216143630.25610-9-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
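The first difference listed above, a finite state machine in place of a bitmap, can be pictured with a small model. This is only an illustrative sketch: the state names follow a subset of the v2 device states, while the transition table is a simplified assumption rather than a copy of the kernel's arc table, and `DeviceState`, `ALLOWED` and `set_state` are hypothetical names.

```python
from enum import Enum, auto


class DeviceState(Enum):
    """Device states as distinct values instead of combinable bit flags."""
    STOP = auto()
    RUNNING = auto()
    STOP_COPY = auto()
    RESUMING = auto()


# Assumed, simplified transition table (illustration only).
ALLOWED = {
    DeviceState.RUNNING: {DeviceState.STOP},
    DeviceState.STOP: {
        DeviceState.RUNNING,
        DeviceState.STOP_COPY,
        DeviceState.RESUMING,
    },
    DeviceState.STOP_COPY: {DeviceState.STOP},
    DeviceState.RESUMING: {DeviceState.STOP},
}


def set_state(current, new):
    """Return the new state, or raise if the arc is not in the table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```

With a bitmap, any combination of flags can be written and must be validated after the fact; with an explicit table, an unsupported arc is rejected at the point of the request.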
2023-02-16vfio/migration: Rename functions/structs related to v1 protocolAvihai Horon4-61/+61
To avoid name collisions, rename functions and structs related to VFIO migration protocol v1. This will allow the two protocols to co-exist when v2 protocol is added, until v1 is removed. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-8-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Move migration v1 logic to vfio_migration_init()Avihai Horon2-16/+16
Move vfio_dev_get_region_info() logic from vfio_migration_probe() to vfio_migration_init(). This logic is specific to v1 protocol and moving it will make it easier to add the v2 protocol implementation later. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-7-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Block multiple devices migrationAvihai Horon3-0/+61
Currently VFIO migration doesn't implement some kind of intermediate quiescent state in which P2P DMAs are quiesced before stopping or running the device. This can cause problems in multi-device migration where the devices are doing P2P DMAs, since the devices are not stopped together at the same time. Until such support is added, block migration of multiple devices. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-6-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/common: Change vfio_devices_all_running_and_saving() logic to equivalent oneAvihai Horon1-7/+10
vfio_devices_all_running_and_saving() is used to check if migration is in pre-copy phase. This is done by checking if migration is in setup or active states and if all VFIO devices are in pre-copy state, i.e. _SAVING | _RUNNING. In VFIO migration protocol v2 pre-copy support is made optional. Hence, a matching v2 protocol pre-copy state can't be used here. As preparation for adding v2 protocol, change vfio_devices_all_running_and_saving() logic such that it doesn't use the VFIO pre-copy state. The new equivalent logic checks if migration is in active state and if all VFIO devices are in running state [1]. No functional changes intended. [1] Note that checking if migration is in setup or active states and if all VFIO devices are in running state doesn't guarantee that we are in pre-copy phase, thus we check if migration is only in active state. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-5-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
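The replacement check described in this commit can be sketched as a predicate over the migration state and the per-device run state. This is a model, not the QEMU code: `Device`, the `"active"` string and `all_running_and_migration_active` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    running: bool  # stand-in for "device is in the running state"


def all_running_and_migration_active(migration_state, devices):
    """True only if migration is active and every device is running.

    The setup state is deliberately excluded, as footnote [1] of the
    commit message explains.
    """
    if migration_state != "active":
        return False
    return all(dev.running for dev in devices)
```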
2023-02-16vfio/migration: Allow migration without VFIO IOMMU dirty tracking supportAvihai Horon2-4/+19
Currently, if IOMMU of a VFIO container doesn't support dirty page tracking, migration is blocked. This is because a DMA-able VFIO device can dirty RAM pages without updating QEMU about it, thus breaking the migration. However, this doesn't mean that migration can't be done at all. In such case, allow migration and let QEMU VFIO code mark all pages dirty. This guarantees that all pages that might have gotten dirty are reported back, and thus guarantees a valid migration even without VFIO IOMMU dirty tracking support. The motivation for this patch is the introduction of iommufd [1]. iommufd can directly implement the /dev/vfio/vfio container IOCTLs by mapping them into its internal ops, allowing the usage of these IOCTLs over iommufd. However, VFIO IOMMU dirty tracking is not supported by this VFIO compatibility API. This patch will allow migration by hosts that use the VFIO compatibility API and prevent migration regressions caused by the lack of VFIO IOMMU dirty tracking support. [1] https://lore.kernel.org/kvm/0-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-4-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
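The fallback described here, reporting every page as dirty when no tracker exists, can be sketched as follows. The function name, the list-of-booleans bitmap and the `tracker` callback are assumptions made for the example and do not reflect the QEMU data structures.

```python
def query_dirty_bitmap(num_pages, tracker=None):
    """Return one dirty flag per page.

    tracker: optional callable(page_index) -> bool. When it is None,
    dirty tracking is unavailable, so every page is conservatively
    reported as dirty.
    """
    if tracker is None:
        return [True] * num_pages
    return [bool(tracker(page)) for page in range(num_pages)]
```

The conservative answer is always correct but costs bandwidth, since each sync pass resends the whole range.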
2023-02-16vfio/migration: Fix NULL pointer dereference bugAvihai Horon1-1/+3
As part of its error flow, vfio_vmstate_change() accesses MigrationState->to_dst_file without any checks. This can cause a NULL pointer dereference if the error flow is taken and MigrationState->to_dst_file is not set. For example, this can happen if VM is started or stopped not during migration and vfio_vmstate_change() error flow is taken, as MigrationState->to_dst_file is not set at that time. Fix it by checking that MigrationState->to_dst_file is set before using it. Fixes: 02a7e71b1e5b ("vfio: Add VM state change handler to know state of VM") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Link: https://lore.kernel.org/r/20230216143630.25610-3-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
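The shape of the fix, checking that the destination file is set before using it in the error path, looks roughly like this in a Python model. The classes and `report_state_change_error` are hypothetical stand-ins; only the guard itself corresponds to what the commit message describes.

```python
class DstFile:
    """Stand-in for the migration stream; only records the last error."""

    def __init__(self):
        self.error = None


class MigrationState:
    def __init__(self, to_dst_file=None):
        # None models "no migration in progress".
        self.to_dst_file = to_dst_file


def report_state_change_error(ms, err):
    """Record err on the stream if there is one; return whether it was recorded."""
    if ms.to_dst_file is None:
        # VM started or stopped outside of migration: nothing to flag.
        return False
    ms.to_dst_file.error = err
    return True
```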
2023-02-16linux-headers: Update to v6.2-rc8Avihai Horon13-39/+230
Update to commit ceaa837f96ad ("Linux 6.2-rc8"). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/r/20230216143630.25610-2-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-15migration: Rename res_{postcopy,precopy}_onlyJuan Quintela9-66/+59
Once res_compatible is removed, they don't make sense anymore. We remove the _only suffix. And to make things clearer we rename them to must_precopy and can_postcopy. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-15migration: Remove unused res_compatibleJuan Quintela11-33/+16
Nothing assigns to it after previous commit. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-15migration: In case of postcopy, the memory ends in res_postcopy_onlyJuan Quintela1-1/+1
So remove the last assignment to res_compatible. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-15migration/block: Convert remaining DPRINTF() debug macro to trace eventsPhilippe Mathieu-Daudé2-11/+2
Finish the conversion from commit fe80c0241d ("migration: using trace_ to replace DPRINTF"). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-15migration/qemu-file: Add qemu_file_get_to_fd()Avihai Horon2-0/+35
Add new function qemu_file_get_to_fd() that allows reading data from QEMUFile and writing it straight into a given fd. This will be used later in VFIO migration code. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
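The idea of reading from a stream and writing straight into a file descriptor can be sketched as below. This is not the QEMU implementation: `copy_stream_to_fd` and its `chunk` parameter are hypothetical, and the point of the example is only the chunked read plus the inner loop that handles short writes.

```python
import io
import os


def copy_stream_to_fd(stream, fd, size, chunk=8192):
    """Copy exactly `size` bytes from a binary stream to file descriptor `fd`."""
    remaining = size
    while remaining > 0:
        buf = stream.read(min(chunk, remaining))
        if not buf:
            raise EOFError("stream ended before all bytes were copied")
        view = memoryview(buf)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        remaining -= len(buf)


if __name__ == "__main__":
    r, w = os.pipe()
    copy_stream_to_fd(io.BytesIO(b"hello world"), w, 5)
    os.close(w)
    print(os.read(r, 16))  # prints b'hello'
    os.close(r)
```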
2023-02-15ui: remove deprecated 'password' option for SPICEDaniel P. Berrangé4-31/+8
This has been replaced by the 'password-secret' option, which references a 'secret' object instance. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2023-02-15block: deprecate iSCSI 'password' in favour of 'password-secret'Daniel P. Berrangé2-0/+11
Support for referencing secret objects was added in commit b189346eb1784df95ed6fed610411dbf23d19e1f Author: Daniel P. Berrangé <berrange@redhat.com> Date: Thu Jan 21 14:19:21 2016 +0000 iscsi: add support for getting CHAP password via QCryptoSecret API The existing 'password' option is overdue for deprecation and subsequent removal. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2023-02-15block: mention 'password-secret' option for -iscsiDaniel P. Berrangé1-2/+2
The 'password-secret' option was added in commit b189346eb1784df95ed6fed610411dbf23d19e1f Author: Daniel P. Berrangé <berrange@redhat.com> Date: Thu Jan 21 14:19:21 2016 +0000 iscsi: add support for getting CHAP password via QCryptoSecret API but was not mentioned in the command line docs. Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2023-02-15io/channel-tls: fix handling of bigger read buffersAntoine Damhet1-1/+65
Since the TLS backend can read more data from the underlying QIOChannel we introduce a minimal child GSource to notify if we still have more data available to be read. Signed-off-by: Antoine Damhet <antoine.damhet@shadow.tech> Signed-off-by: Charles Frey <charles.frey@shadow.tech> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2023-02-15crypto: TLS: introduce `check_pending`Antoine Damhet2-0/+25
The new `qcrypto_tls_session_check_pending` function allows the caller to know if data has already been consumed from the backend and is already available. Signed-off-by: Antoine Damhet <antoine.damhet@shadow.tech> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2023-02-14hw/s390x/event-facility: Replace DO_UPCAST(SCLPEvent) by SCLP_EVENT()Philippe Mathieu-Daudé1-2/+1
Use the SCLP_EVENT() QOM type-checking macro to avoid DO_UPCAST(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230212225144.58660-16-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/tcg/s390x: Use -nostdlib for softmmu testsIlya Leoshkevich1-1/+1
The code currently uses -nostartfiles, but this does not prevent linking with libc. On Fedora there is no cross-libc, so the linking step fails. Fix by using the more comprehensive -nostdlib (that's also what probe_target_compiler() checks for). Fixes: 503e549e441e ("tests/tcg/s390x: Test unaligned accesses to lowcore") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-Id: <20230131182057.2261614-1-iii@linux.ibm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Don't build virtio-serial-test.c if device not presentFabiano Rosas1-1/+5
The virtconsole device might not be present in the QEMU build that is being tested. Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230213210738.9719-5-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: bios-tables-test: Skip if missing configsFabiano Rosas1-1/+3
If we build with --without-default-devices, CONFIG_HPET and CONFIG_PARALLEL are set to N, which makes the respective devices go missing from acpi tables. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230208194700.11035-13-farosas@suse.de> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qemu-iotests: Require virtio-scsi-pciFabiano Rosas1-0/+1
Check that virtio-scsi-pci is present in the QEMU build before running the tests. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230208194700.11035-12-farosas@suse.de> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Do not include hexloader-test if loader device is not presentFabiano Rosas1-2/+2
Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-11-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Check for devices in bios-tables-testFabiano Rosas1-4/+71
Do not include tests that require devices that are not available in the QEMU build. Signed-off-by: Fabiano Rosas <farosas@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230208194700.11035-10-farosas@suse.de> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: drive_del-test: Skip tests that require missing devicesFabiano Rosas1-0/+65
Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-9-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Skip unplug tests that use missing devicesFabiano Rosas1-6/+27
Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-8-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14test/qtest: Fix coding style in device-plug-test.cFabiano Rosas1-3/+5
We should not mix declarations and statements in QEMU code. Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-7-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: hd-geo-test: Check for missing devicesFabiano Rosas1-13/+25
Don't include tests that require devices not available in the QEMU binary. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230208194700.11035-6-farosas@suse.de> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Add dependence on PCIE_PORT for virtio-net-failover.cFabiano Rosas1-1/+2
This test depends on the presence of the pcie-root-port device. Add a build time dependency. Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-4-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest: Do not run lsi53c895a test if device is not presentFabiano Rosas1-0/+4
The tests are built once for all the targets, so as long as one QEMU binary is built with CONFIG_LSI_SCSI_PCI=y, this test will run. However, some binaries might not include the device, so check for it again at runtime. Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-3-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
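The build-time versus runtime distinction in this commit can be illustrated with a small guard. The sketch assumes the set of devices in the binary under test is already known; `run_if_device_present` and its arguments are hypothetical and are not part of the qtest API.

```python
def run_if_device_present(available_devices, required, test_fn):
    """Run test_fn only if the binary under test provides `required`.

    available_devices: names of devices compiled into this binary.
    Returns "ran" or "skipped" so that callers can report the outcome.
    """
    if required not in available_devices:
        return "skipped"
    test_fn()
    return "ran"
```

A test compiled once for every target then behaves correctly against each individual binary, including those built with --without-default-devices.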
2023-02-14tests/qtest: Skip PXE tests for missing devicesFabiano Rosas1-0/+4
Check if the devices we're trying to add are present in the QEMU binary. They could have been removed from the build via Kconfig or the --without-default-devices option. Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-Id: <20230208194700.11035-2-farosas@suse.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14Do not include "qemu/error-report.h" in headers that do not need itThomas Huth19-3/+16
Include it in the .c files instead that use the error reporting functions. Message-Id: <20230210111931.1115489-1-thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14include/hw: Do not include "hw/registerfields.h" in headers that don't need itThomas Huth5-3/+2
Include "hw/registerfields.h" in the .c files instead (if needed). Message-Id: <20230210112315.1116966-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14hw/misc/sga: Remove the deprecated "sga" deviceThomas Huth14-106/+12
It's been deprecated since QEMU v6.2, so it should be OK to finally remove this now. Message-Id: <20230209161540.1054669-1-thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14tests/qtest/npcm7xx_pwm-test: Be less verbose unless V=2Peter Maydell1-6/+21
The npcm7xx_pwm-test produces a lot of output at V=1, which means that on our CI tests the log files exceed the gitlab 500KB limit. Suppress the messages about exactly what is being tested unless at V=2 and above. This follows the pattern we use with qom-test. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230209135047.1753081-1-peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14build: deprecate --enable-gprof builds and remove from CIAlex Bennée4-17/+26
As gprof relies on instrumentation, you rarely get useful data compared to a real optimised build. Let's deprecate the build option and simplify the CI configuration as a result. Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1338 Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230131094224.861621-1-alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14meson: Disable libdw for static builds by defaultIlya Leoshkevich1-1/+2
Static QEMU build fails on Debian Bullseye: /usr/bin/ld: /usr/lib/x86_64-linux-gnu/libdw.a(debuginfod-client.o): in function `__libdwfl_debuginfod_init': (.text.startup+0x17): undefined reference to `dlopen' The reason is that pkg-config does not suggest -ldl for libdw, and adding --extra-ldflags="-ldl" resolves the issue. However, static linking with libdw is an unclear topic: * Linux perf does it. * Debian's libdw-dev description says: Only link to the static version for special cases and when you don't need anything from the ebl backends. * As the error message above indicates, -ldl is also needed for debuginfod support. The functionality provided by libdw is needed for analyzing performance of JITed code, which is mostly useful to developers and researchers. Therefore, in order to avoid unpleasant surprises for people who don't need this, simply disable libdw for static builds by default. It can still be enabled explicitly if needed. Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-Id: <20230210005208.438142-2-iii@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14meson: Add missing libdw knobsIlya Leoshkevich3-4/+12
Add the missing meson infrastructure bits for the new libdw dependency. Model them after the existing capstone knobs. Fixes: 7c10cb38ccb8 ("accel/tcg: Add debuginfo support") Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230210005208.438142-1-iii@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-14configure: Bump minimum Clang version to 10.0Thomas Huth1-19/+6
Anthony Perard recently reported some problems with Clang v6.0 from Ubuntu Bionic (with regards to the -Wmissing-braces configure test). Since we're not officially supporting that version of Ubuntu anymore, we had better bump our minimum version check in the configure script instead of spending our time fixing problems of unsupported compilers. According to repology.org, our supported distros ship these versions of Clang (looking at the highest version only): Fedora 36: 14.0.5; CentOS 8 (RHEL-8): 12.0.1; Debian 11: 13.0.1; OpenSUSE Leap 15.4: 13.0.1; Ubuntu LTS 20.04: 12.0.0; FreeBSD Ports: 15.0.7; NetBSD pkgsrc: 15.0.7; Homebrew: 15.0.7; MSYS2 mingw: 15.0.7; Haiku ports: 12.0.1. While it seems like we could update to v12.0.0 from that point of view, the default version on Ubuntu 20.04 is still v10.0, and we use that for our CI tests via the tests/docker/dockerfiles/ubuntu2004.docker file. Thus let's make v10.0 our minimum version now (which corresponds to Apple Clang version v12.0). The -Wmissing-braces check can then be removed, too, since both our minimum GCC and our minimum Clang version now handle this correctly. Message-Id: <20230131180239.1582302-1-thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-02-13ram: Document migration ram flagsJuan Quintela1-6/+10
0x80 is RAM_SAVE_FLAG_HOOK; it is in qemu-file now. The biggest usable flag is 0x200; note that. We can reuse RAM_SAVE_FLAG_FULL. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-13migration/multifd: Move load_cleanup inside incoming_state_destroyLeonardo Bras3-1/+11
Currently running migration_incoming_state_destroy() without first running multifd_load_cleanup() will cause a yank error: qemu-system-x86_64: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed. (core dumped) The above error happens in the target host, when multifd is being used for precopy, and then postcopy is triggered and the migration finishes. This will crash the VM in the target host. To avoid that, move multifd_load_cleanup() inside migration_incoming_state_destroy(), so that the load cleanup becomes part of the incoming state destroying process. Running multifd_load_cleanup() twice can become an issue, though, but the only scenario in which it could be run twice is in process_incoming_migration_bh(). So removing this extra call is necessary. On the other hand, this multifd_load_cleanup() call happens way before the migration_incoming_state_destroy(), and having this happen before dirty_bitmap_mig_before_vm_start() and vm_start() may be needed. So introduce a new function multifd_load_shutdown() that will mainly stop all multifd threads and close their QIOChannels. Then use this function instead of multifd_load_cleanup() to make sure nothing else is received before dirty_bitmap_mig_before_vm_start(). Fixes: b5eea99ec2 ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-13migration/multifd: Join all multifd threads in order to avoid leaksLeonardo Bras1-1/+2
The current approach will only join threads that are still running. For the threads not joined, resources or private memory are always kept in the process space and never reclaimed before process end, and this risks serious memory leaks. This should usually not represent a big problem, since multifd migration is usually just run at most a few times, and after it succeeds there is not much to be done before exiting the process. Yet still, it should not hurt performance to join all of them. Fixes: b5eea99ec2 ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
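The pattern of joining every thread, whether or not it is still flagged as running, is sketched below. Python threads do not leak in the same way as unjoined native threads, so this only models the control flow; `Channel` and `cleanup` are hypothetical names.

```python
import threading


class Channel:
    """A worker whose `running` flag drops to False when its work is done."""

    def __init__(self, work):
        self.running = True
        self.thread = threading.Thread(target=self._run, args=(work,))

    def _run(self, work):
        work()
        self.running = False


def cleanup(channels):
    """Join every channel thread unconditionally and return how many were joined."""
    joined = 0
    for ch in channels:
        # No `if ch.running:` guard here: a finished thread still needs a join.
        ch.thread.join()
        joined += 1
    return joined
```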
2023-02-13migration/multifd: Remove unnecessary assignment on multifd_load_cleanup()Leonardo Bras1-1/+0
Before assigning "p->quit = true" for every multifd channel, multifd_load_cleanup() will call multifd_recv_terminate_threads() which already does the same assignment, while protected by a mutex. So there is no point doing the same assignment again. Fixes: b5eea99ec2 ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-13migration/multifd: Change multifd_load_cleanup() signature and usageLeonardo Bras3-15/+7
Since its introduction in commit f986c3d256 ("migration: Create multifd migration threads"), multifd_load_cleanup() has never returned a value other than 0, nor set any error on errp. Even so, in process_incoming_migration_bh() an if clause uses its return value to decide on setting autostart = false, which will never happen. In order to simplify the codebase, change the multifd_load_cleanup() signature to 'void multifd_load_cleanup(void)', and for every usage remove the error handling or decision made based on return value != 0. Fixes: b5eea99ec2 ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-11migration: Postpone postcopy preempt channel to be after mainPeter Xu5-21/+82
Postcopy with preempt-mode enabled needs two channels to communicate. The order of channel establishment is not guaranteed. It can happen that the dest QEMU got the preempt channel connection request before the main channel is established, then the migration may make no progress even during precopy due to the wrong order. To fix it, create the preempt channel only if we know the main channel is established. For a general postcopy migration, we delay it until postcopy_start(), that's where we already went through some part of precopy on the main channel. To make sure dest QEMU has already established the channel, we wait until we got the first PONG received. That's something we do at the start of precopy when postcopy enabled so it's guaranteed to happen sooner or later. For a postcopy recovery, we delay it to qemu_savevm_state_resume_prepare() where we'll have round trips of data on bitmap synchronizations, which means the main channel must have been established. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-11migration: Add a semaphore to count PONGsPeter Xu2-0/+9
This is mostly useless, but useful for us to know whether the main channel is correctly established without changing the migration protocol. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
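Counting PONGs with a semaphore, and blocking until the first one arrives, can be modelled like this. The sketch uses Python's `threading.Semaphore`, not the QEMU semaphore API, and `PongCounter` with its methods is a hypothetical name.

```python
import threading


class PongCounter:
    """Counts received PONG messages; waiters consume one count each."""

    def __init__(self):
        self._sem = threading.Semaphore(0)

    def on_pong(self):
        # Called by the thread that reads the return path.
        self._sem.release()

    def wait_first_pong(self, timeout=None):
        """Block until a PONG has been seen; return False on timeout."""
        return self._sem.acquire(timeout=timeout)
```

A sender following the preempt-channel commit listed just above would call `wait_first_pong()` before opening the second channel, which is how it learns that the main channel is up without any protocol change.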
2023-02-11migration: Cleanup postcopy_preempt_setup()Peter Xu3-14/+4
Since we just dropped the only case where postcopy_preempt_setup() can return an error, it doesn't need a retval anymore because it never fails. Move the preempt check to the caller, preparing it to be used elsewhere to do nothing more than kick the async connection. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-11migration: Rework multi-channel checks on URIPeter Xu4-42/+28
The whole idea of multi-channel checks was not properly done, IMHO. Currently we check multi-channel in a lot of places, but actually that's not needed because we only need to check it right after we get the URI and that should be it. If the URI check succeeded, we should never need to check it again because we must have it. If the check fails, we should fail immediately on either the qmp_migrate or qmp_migrate_incoming, instead of failing it later after the connection is established. Neither should we fail any set capabilities like what we used to do here: 5ad15e8614 ("migration: allow enabling mutilfd for specific protocol only", 2021-10-19) Because logically the URI will only be set later after the capability is set, so it doesn't make a lot of sense to check the URI type when setting the capability, because we're checking the cap with an old URI passed in, and that may not even be the URI we're going to use later. This patch mostly reverts all such earlier checks, dropping the variable migrate_allow_multi_channels and helpers. Instead, add a common helper to check the URI for multi-channels for both qmp_migrate and qmp_migrate_incoming, and that should do all the proper checks. The failure will only trigger with the "migrate" or "migrate_incoming" command, or when the user specified "-incoming xxx" where "xxx" is not "defer". Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
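A single check at the point where the URI becomes known might look like the sketch below. The list of socket-style prefixes is an assumption made for illustration, and `channels_and_uri_compatible` is a hypothetical name, not the helper added by the patch.

```python
# Assumed set of transports that can carry several channels.
MULTI_CHANNEL_PREFIXES = ("tcp:", "unix:", "vsock:")


def channels_and_uri_compatible(uri, needs_multiple_channels):
    """Return False if multiple channels are required but the URI cannot carry them."""
    if needs_multiple_channels and not uri.startswith(MULTI_CHANNEL_PREFIXES):
        return False
    return True
```

Calling this once from the migrate and migrate-incoming entry points means the capability setters never have to look at a URI that may be stale.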