summary refs log tree commit diff stats
path: root/tests/qtest/migration-test.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* tests/qtest: massively speed up migration-testDaniel P. Berrangé2023-07-101-18/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The migration test cases that actually exercise live migration want to ensure there is a minimum of two iterations of pre-copy, in order to exercise the dirty tracking code. Historically we've queried the migration status, looking for the 'dirty-sync-count' value to increment to track iterations. This was not entirely reliable because often all the data would get transferred quickly enough that the migration would finish before we wanted it to. So we massively dropped the bandwidth and max downtime to guarantee non-convergance. This had the unfortunate side effect that every migration took at least 30 seconds to run (100 MB of dirty pages / 3 MB/sec). This optimization takes a different approach to ensuring that a mimimum of two iterations. Rather than waiting for dirty-sync-count to increment, directly look for an indication that the source VM has dirtied RAM that has already been transferred. On the source VM a magic marker is written just after the 3 MB offset. The destination VM is now montiored to detect when the magic marker is transferred. This gives a guarantee that the first 3 MB of memory have been transferred. Now the source VM memory is monitored at exactly the 3MB offset until we observe a flip in its value. This gives us a guaranteed that the guest workload has dirtied a byte that has already been transferred. Since we're looking at a place that is only 3 MB from the start of memory, with the 3 MB/sec bandwidth, this test should complete in 1 second, instead of 30 seconds. Once we've proved there is some dirty memory, migration can be set back to full speed for the remainder of the 1st iteration, and the entire of the second iteration at which point migration should be complete. On a test machine this further reduces the migration test time from 8 minutes to 1 minute 40. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-11-berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: Add migration switchover ack capability testAvihai Horon2023-06-301-0/+31
| | | | | | | | | | | | | Add migration switchover ack capability test. The test runs without devices that support this capability, but is still useful to make sure it didn't break anything. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
* qtest/migration: Document live=true casesPeter Xu2023-06-021-6/+31
| | | | | | | | | | | | | | | Document every single live=true use cases on why it should be done in the live manner. Also document on the parameter so new precopy cases should always use live=off unless with explicit reasonings. Cc: Thomas Huth <thuth@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20230601172935.175726-1-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: make more migration pre-copy scenarios run non-liveDaniel P. Berrangé2023-06-021-15/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are 27 pre-copy live migration scenarios being tested. In all of these we force non-convergence and run for one iteration, then let it converge and wait for completion during the second (or following) iterations. At 3 mbps bandwidth limit the first iteration takes a very long time (~30 seconds). While it is important to test the migration passes and convergence logic, it is overkill to do this for all 27 pre-copy scenarios. The TLS migration scenarios in particular are merely exercising different code paths during connection establishment. To optimize time taken, switch most of the test scenarios to run non-live (ie guest CPUs paused) with no bandwidth limits. This gives a massive speed up for most of the test scenarios. For test coverage the following scenarios are unchanged * Precopy with UNIX sockets * Precopy with UNIX sockets and dirty ring tracking * Precopy with XBZRLE * Precopy with UNIX compress * Precopy with UNIX compress (nowait) * Precopy with multifd On a test machine this reduces execution time from 13 minutes to 8 minutes. Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-10-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: distinguish src/dst migration VM stop/resume eventsDaniel P. Berrangé2023-06-021-13/+13
| | | | | | | | | | | The 'got_stop' and 'got_resume' global variables apply to the src and dst migration VM respectively. Change their names to make this explicit to developers. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-9-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: capture RESUME events during migrationDaniel P. Berrangé2023-06-021-0/+5
| | | | | | | | | | When running migration tests we monitor for a STOP event so we can skip redundant waits. This will be needed for the RESUME event too shortly. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-8-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: replace wait_command() with qtest_qmp_assert_successDaniel P. Berrangé2023-06-021-99/+71
| | | | | | | | | | | Most usage of wait_command() is followed by qobject_unref(), which is just a verbose re-implementation of qtest_qmp_assert_success(). Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-7-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: switch to using event callbacks for STOP eventDaniel P. Berrangé2023-06-021-0/+4
| | | | | | | | | | | | | Change the migration test to use the new qtest event callback to watch for the stop event. This ensures that we only watch for the STOP event on the source QEMU. The previous code would set the single 'got_stop' flag when either source or dest QEMU got the STOP event. Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-6-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: get rid of some 'qtest_qmp' usage in migration testDaniel P. Berrangé2023-06-021-34/+18
| | | | | | | | | | | Some of the usage is just a verbose way of re-inventing the qtest_qmp_assert_success(_ref) methods. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-5-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: get rid of 'qmp_command' helper in migration testDaniel P. Berrangé2023-06-021-14/+15
| | | | | | | | | | | | | This function duplicates logic of qtest_qmp_assert_success_ref. The qtest_qmp_assert_success_ref method has better diagnostics on failure because it prints the entire QMP response, instead of just asserting on existance of the 'error' key. Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230601161347.1803440-4-berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: replace qmp_discard_response with qtest_qmp_assert_successDaniel P. Berrangé2023-05-161-4/+1
| | | | | | | | | | | | | | | The qmp_discard_response method simply ignores the result of the QMP command, merely unref'ing the object. This is a bad idea for tests as it leaves no trace if the QMP command unexpectedly failed. The qtest_qmp_assert_success method will validate that the QMP command returned without error, and if errors occur, it will print a message on the console aiding debugging. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230421171411.566300-2-berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* qtest/migration-test.c: Add postcopy tests with compress enabledLukas Straub2023-05-081-30/+55
| | | | | | | | | | | Add postcopy tests with compress enabled to ensure nothing breaks with the refactoring in the next commits. preempt+compress is blocked, so no test needed for that case. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* qtest/migration-test.c: Add tests with compress enabledLukas Straub2023-05-081-0/+109
| | | | | | | | | | | | There has never been tests for migration with compress enabled. Add suitable tests, testing with compress-wait-thread = false too. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* tests/qtest: Fix tests when no KVM or TCG are presentFabiano Rosas2023-05-021-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | It is possible to have a build with both TCG and KVM disabled due to Xen requiring the i386 and x86_64 binaries to be present in an aarch64 host. If we build with --disable-tcg on the aarch64 host, we will end-up with a QEMU binary (x86) that does not support TCG nor KVM. Skip tests that crash or hang in the above scenario. Do not include any test cases if TCG and KVM are missing. Make sure that calls to qtest_has_accel are placed after g_test_init in similar fashion to commit ae4b01b349 ("tests: Ensure TAP version is printed before other messages") to avoid TAP parsing errors. Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230426180013.14814-9-farosas@suse.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tests/migration: Only run auto_converge in slow modeJuan Quintela2023-04-201-2/+21
| | | | | | Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230412142001.16501-3-quintela@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* qtests: avoid printing comments before g_test_init()Daniel P. Berrangé2023-03-221-3/+7
| | | | | | | | | | | | | | | | | | | | | | | The TAP protocol version line must be the first thing printed on stdout. The migration test failed that requirement in certain scenarios: # Skipping test: Userfault not available (builtdtime) TAP version 13 # random seed: R02Sc120c807f11053eb90bfea845ba1e368 1..32 # Start of x86_64 tests # Start of migration tests .... The TAP version is printed by g_test_init(), so we need to make sure that any methods which print are run after that. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230317170553.592707-1-berrange@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
* tests/migration: Tweek auto converge limits checkDr. David Alan Gilbert2023-03-131-8/+11
| | | | | | | | | | | | | | | | | | | | Thomas found an autoconverge test failure where the migration completed before the autoconverge had kicked in. To try and avoid this again: a) Reduce the usleep in test_migrate_auto_converge so that it should exit quicker when autoconverge kicks in b) Make the loop exit immediately rather than have the sleep when it does start autoconverge, otherwise the autoconverge might succeed during the sleep. c) Reduce inc_pct so auto converge happens more slowly d) Reduce the max-bandwidth in migrate_ensure_non_converge to make the ensure more ensure. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20230306152612.52291-1-dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest/migration-test: Disable migration/multifd/tcp/plain/cancelPeter Maydell2023-03-041-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | migration-test has been flaky for a long time, both in CI and otherwise: https://gitlab.com/qemu-project/qemu/-/jobs/3806090216 (a FreeBSD job) 32/648 ERROR:../tests/qtest/migration-helpers.c:205:wait_for_migration_status: assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) ERROR on a local macos x86 box: ▶ 34/621 ERROR:../../tests/qtest/migration-helpers.c:151:migrate_query_not_failed: assertion failed: (!g_str_equal(status, "failed")) ERROR 34/621 qemu:qtest+qtest-i386 / qtest-i386/migration-test ERROR 168.12s killed by signal 6 SIGABRT ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― stderr: qemu-system-i386: Failed to peek at channel query-migrate shows failed migration: Unable to write to socket: Broken pipe ** ERROR:../../tests/qtest/migration-helpers.c:151:migrate_query_not_failed: assertion failed: (!g_str_equal(status, "failed")) (test program exited with status code -6) ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― ▶ 37/621 ERROR:../../tests/qtest/migration-helpers.c:151:migrate_query_not_failed: assertion failed: (!g_str_equal(status, "failed")) ERROR 37/621 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR 174.37s killed by signal 6 SIGABRT ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― stderr: query-migrate shows failed migration: Unable to write to socket: Broken pipe ** ERROR:../../tests/qtest/migration-helpers.c:151:migrate_query_not_failed: assertion failed: (!g_str_equal(status, "failed")) (test program exited with status code -6) In the cases where I've looked at the underlying log, this seems to be in the migration/multifd/tcp/plain/cancel subtest. Disable that specific subtest by default until somebody can track down the underlying cause. Enthusiasts can opt back in by setting QEMU_TEST_FLAKY_TESTS=1 in their environment. We might need to disable more parts of this test if this isn't sufficient to fix the flakiness. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-id: 20230302172211.4146376-1-peter.maydell@linaro.org
* util/userfaultfd: Add uffd_open()Peter Xu2023-02-061-2/+2
| | | | | | | | | Add a helper to create the uffd handle. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* Call qemu_socketpair() instead of socketpair() when possibleGuoyi Tu2023-01-161-1/+1
| | | | | | | | | | | | | As qemu_socketpair() was introduced in commit 3c63b4e9 ("oslib-posix: Introduce qemu_socketpair()"), it's time to replace the other existing socketpair() calls with qemu_socketpair() if possible Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <cd28916a-f1f3-b54e-6ade-8a3647c3a9a5@chinatelecom.cn> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* tests/qtest/migration-test: Fix unlink error and memory leaksThomas Huth2022-12-031-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running the migration test compiled with Clang from Fedora 37 and sanitizers enabled, there is an error complaining about unlink(): ../tests/qtest/migration-test.c:1072:12: runtime error: null pointer passed as argument 1, which is declared to never be null /usr/include/unistd.h:858:48: note: nonnull attribute specified here SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../tests/qtest/migration-test.c:1072:12 in (test program exited with status code 1) TAP parsing error: Too few tests run (expected 33, got 20) The data->clientcert and data->clientkey pointers can indeed be unset in some tests, so we have to check them before calling unlink() with those. While we're at it, I also noticed that the code is only freeing some but not all of the allocated strings in this function, and indeed, valgrind is also complaining about memory leaks here. So let's call g_free() on all allocated strings to avoid leaking memory here. Message-Id: <20221125083054.117504-1-thuth@redhat.com> Tested-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Enable TLS PSK tests for win32Bin Meng2022-11-061-14/+0
| | | | | | | | | | Since commit f1018ea0a30f ("tests: avoid DOS line endings in PSK file"), the bug of the helper test_tls_psk_init_common() that caused TLS PSK tests to fail on Windows was fixed. Let's enable these tests on win32. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20221101035021.729669-1-bin.meng@windriver.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: Fix two format stringsStefan Weil2022-11-061-2/+2
| | | | | | | Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20221105115525.623059-1-sw@weilnetz.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Make sure QEMU process "to" exited after ↵Xuzhou Cheng2022-10-281-0/+4
| | | | | | | | | | | | | migration is canceled Make sure QEMU process "to" exited before launching another target for migration in the test_multifd_tcp_cancel case. Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20221028045736.679903-8-bin.meng@windriver.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: Use EXIT_FAILURE instead of magic numberBin Meng2022-10-281-2/+2
| | | | | | | | | | | | When migration fails, QEMU exits with a status code EXIT_FAILURE. Change qtests to use the well-defined macro instead of magic number. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20221028045736.679903-6-bin.meng@windriver.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Fix [-Werror=format-overflow=] build warningBin Meng2022-10-221-2/+2
| | | | | | | | | | | | | | | When tmpfs is NULL, a build warning is seen with GCC 9.3.0. It's strange that GCC 11.2.0 on Ubuntu 22.04 does not catch this, neither did the QEMU CI. While we are here, improve the error message as well. Reported-by: Shengjiang Wu <shengjiang.wu@windriver.com> Fixes: e5553c1b8d28 ("tests/qtest: migration-test: Avoid using hardcoded /tmp") Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221017132023.2228641-1-bmeng.cn@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* tests/qtest: migration-test: Avoid using hardcoded /tmpBin Meng2022-10-121-4/+6
| | | | | | | | | | This case was written to use hardcoded /tmp directory for temporary files. Update to use g_dir_make_tmp() for a portable implementation. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20221006151927.2079583-5-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Skip running some TLS cases for win32Bin Meng2022-09-271-0/+14
| | | | | | | | | | | | | | | Some migration test cases use TLS to communicate, but they fail on Windows with the following error messages: qemu-system-x86_64: TLS handshake failed: Insufficient credentials for that request. qemu-system-x86_64: TLS handshake failed: Error in the pull function. query-migrate shows failed migration: TLS handshake failed: Error in the pull function. Disable them temporarily. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20220925113032.1949844-51-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Disable IO redirection for win32Bin Meng2022-09-271-0/+9
| | | | | | | | | | | On Windows the QEMU executable is created via CreateProcess() and IO redirection does not work, so don't bother adding IO redirection to the command line. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220925113032.1949844-40-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Skip running test_migrate_fd_proto on win32Bin Meng2022-08-251-0/+4
| | | | | | | | | | | The test case 'test_migrate_fd_proto' calls socketpair() which does not exist on win32. Exclude it. The helper function wait_command_fd() is not needed anymore, hence exclude it too. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220824094029.1634519-22-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: migration-test: Handle link() for win32Bin Meng2022-08-251-0/+8
| | | | | | | | | | | Windows does not provide a link() API like POSIX. Instead it provides a similar API CreateHardLink() that does the same thing, but with different argument order and return value. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220824094029.1634519-14-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: Use g_mkdir_with_parents()Bin Meng2022-08-251-3/+3
| | | | | | | | | | Use the same g_mkdir_with_parents() call to create a directory on all platforms. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220824094029.1634519-13-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest: Use g_mkdtemp()Bin Meng2022-08-251-2/+2
| | | | | | | | | | Windows does not provide a mkdtemp() API, but glib does. Replace mkdtemp() call with the glib version. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220824094029.1634519-3-bmeng.cn@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests/qtest/migration-test: Remove duplicated test_postcopy from the test planThomas Huth2022-08-241-1/+0
| | | | | | | | | | | | | | | test_postcopy() is currently run twice - which is just a waste of resources and time. The commit d1a27b169b2d that introduced the duplicate talked about renaming the "postcopy/unix" test, but apparently it forgot to remove the old entry. Let's do that now. Fixes: d1a27b169b ("tests: Add postcopy tls migration test") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220819053802.296584-5-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220822165608.2980552-6-alex.bennee@linaro.org>
* tests/qtest/migration-test: Only wait for serial output where migration succeedsThomas Huth2022-08-241-1/+3
| | | | | | | | | | | | | | | | | Waiting for the serial output can take a couple of seconds - and since we're doing a lot of migration tests, this time easily sums up to multiple minutes. But if a test is supposed to fail, it does not make much sense to wait for the source to be in the right state first, so we can skip the waiting here. This way we can speed up all tests where the migration is supposed to fail. In the gitlab-CI gprov-gcov test, each of the migration-tests now run two minutes faster! Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220819053802.296584-2-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220822165608.2980552-3-alex.bennee@linaro.org>
* tests/qtest/migration-test: Run the dirty ring tests only with the x86 targetThomas Huth2022-08-011-3/+4
| | | | | | | | | | | | | | kvm_dirty_ring_supported() only checks whether the dirty ring support is available on the x86 host, but it ignores whether the target QEMU architecture is x86 or not. Thus the test_vcpu_dirty_limit() test currently fails with the assert((strcmp(arch, "x86_64") == 0)) statement in dirtylimit_start_vm() if the users run e.g. "make check-qtest-aarch64" on their x86 host. Fix it by only executing the tests when we're running with a x86_64 target QEMU binary with KVM. Message-Id: <20220801114644.208197-1-thuth@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* migration-test: Allow test to run without uffdPeter Xu2022-08-011-23/+25
| | | | | | | | | | | | | We used to stop running all tests if uffd is not detected. However logically that's only needed for postcopy not the rest of tests. Keep running the rest when still possible. Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220728133516.92061-3-peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* migration-test: Use migrate_ensure_converge() for auto-convergePeter Xu2022-08-011-17/+2
| | | | | | | | | | | | | | | | | | | Thomas reported that auto-converge test will timeout on MacOS CI gatings. Use the migrate_ensure_converge() helper too in the auto-converge as when Daniel reworked the other test cases. Since both max_bandwidth / downtime_limit will not be used for converge calculations, make it simple by removing the remaining check, then we can completely remove both variables altogether, since migrate_ensure_converge is used the remaining check won't make much sense anyway. Reported-by: Thomas Huth <thuth@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220728133516.92061-2-peterx@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: Add postcopy preempt testsPeter Xu2022-07-201-2/+57
| | | | | | | | | | | | | | | Four tests are added for preempt mode: - Postcopy plain - Postcopy recovery - Postcopy tls - Postcopy tls+recovery Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185530.27801-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge
* tests: Add postcopy tls recovery migration testPeter Xu2022-07-201-9/+30
| | | | | | | | | | | It's easy to build this upon the postcopy tls test. Rename the old postcopy recovery test to postcopy/recovery/plain. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185527.27747-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge
* tests: Add postcopy tls migration testPeter Xu2022-07-201-10/+51
| | | | | | | | | | | | | | We just added TLS tests for precopy but not postcopy. Add the corresponding test for vanilla postcopy. Rename the vanilla postcopy to "postcopy/plain" because all postcopy tests will only use unix sockets as channel. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185525.27692-1-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge
* tests: Move MigrateCommon upperPeter Xu2022-07-201-72/+72
| | | | | | | | | So that it can be used in postcopy tests too soon. Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185522.27638-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* tests: Add dirty page rate limit testHyman Huang(黄勇)2022-07-201-0/+256
| | | | | | | | | | | | | Add dirty page rate limit test if kernel support dirty ring, The following qmp commands are covered by this test case: "calc-dirty-rate", "query-dirty-rate", "set-vcpu-dirty-limit", "cancel-vcpu-dirty-limit" and "query-vcpu-dirty-limit". Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Acked-by: Peter Xu <peterx@redhat.com> Message-Id: <eed5b847a6ef0a9c02a36383dbdd7db367dd1e7e.1656177590.git.huangy81@chinatelecom.cn> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* tests: use consistent bandwidth/downtime limits in migration testsDaniel P. Berrangé2022-07-051-34/+20
| | | | | | | | | | | | | | The different migration test cases are using a variety of settings to ensure convergance/non-convergance. Introduce two helpers to extra the common functionality and ensure consistency. * Non-convergance: 1ms downtime, 30mbs bandwidth * Convergance: 30s downtime, 1gbs bandwidth Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220628105434.295905-5-berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: increase migration test converge downtime to 30 secondsDaniel P. Berrangé2022-07-051-1/+1
| | | | | | | | | | | | | | While 1 second might be enough to converge migration on a fast host, this is not guaranteed, especially if using TLS in the tests without hardware accelerated crypto available. Increasing the downtime to 30 seconds should guarantee it can converge in any sane scenario. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220628105434.295905-4-berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: wait for migration completion before looking for STOP eventDaniel P. Berrangé2022-07-051-1/+4
| | | | | | | | | | | | | | | When moving into the convergance phase, the precopy tests will first look for a STOP event and once found will look for migration completion status. If the test VM is not converging, the test suite will be waiting for the STOP event forever. If we wait for the migration completion status first, then we will trigger the previously added timeout and prevent the test hanging forever. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220628105434.295905-3-berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tests: ensure migration status isn't reported as failedDaniel P. Berrangé2022-05-161-3/+3
| | | | | | | | | | | | | | | Various methods in the migration test call 'query_migrate' to fetch the current status and then access a particular field. Almost all of these cases expect the migration to be in a non-failed state. In the case of 'wait_for_migration_pass' in particular, if the status is 'failed' then it will get into an infinite loop. By validating that the status is not 'failed' the test suite will assert rather than hang when getting into an unexpected state. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220426160048.812266-10-berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* tests: add multifd migration tests of TLS with x509 credentialsDaniel P. Berrangé2022-05-161-0/+127
| | | | | | | | | | | | | This validates that we correctly handle multifd migration success and failure scenarios when using TLS with x509 certificates. There are quite a few different scenarios that matter in relation to hostname validation, but we skip a couple as we can assume that the non-multifd coverage applies to some extent. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220426160048.812266-9-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* tests: add multifd migration tests of TLS with PSK credentialsDaniel P. Berrangé2022-05-161-4/+56
| | | | | | | | | | This validates that we correctly handle multifd migration success and failure scenarios when using TLS with pre shared keys. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220426160048.812266-8-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* tests: convert multifd migration tests to use common helperDaniel P. Berrangé2022-05-161-37/+40
| | | | | | | | | | | | Most of the multifd migration test logic is common with the rest of the precopy tests, so it can use the helper without difficulty. The only exception of the multifd cancellation test which tries to run multiple migrations in a row. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220426160048.812266-7-berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>