Commit message | Author | Age | Files | Lines
* Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into staging | Peter Maydell | 2018-09-28 | 4 | -41/+75
|\
    Block and testing patches

    - Paolo's AIO fixes.
    - VMDK streamOptimized corner case fix
    - VM testing improvement on -cpu

    # gpg: Signature made Wed 26 Sep 2018 03:54:08 BST
    # gpg:                using RSA key CA35624C6A9171C6
    # gpg: Good signature from "Fam Zheng <famz@redhat.com>"
    # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6

    * remotes/famz/tags/staging-pull-request:
      vmdk: align end of file to a sector boundary
      tests/vm: Use -cpu max rather than -cpu host
      aio-posix: do skip system call if ctx->notifier polling succeeds
      aio-posix: compute timeout before polling
      aio-posix: fix concurrent access to poll_disable_cnt

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * vmdk: align end of file to a sector boundary | yuchenlin | 2018-09-26 | 1 | -0/+21
    There is a rare case in which the size of the last compressed cluster is larger than the cluster size, which leaves the file not aligned to a sector boundary.

    There are three reasons to align it. First, if a vmdk is not aligned to a sector boundary, other tools may misbehave. For example, vbox shows

        VMDK: Compressed image is corrupted 'syno-vm-disk1.vmdk' (VERR_ZIP_CORRUPTED)

    when we try to import an ova with an unaligned vmdk. Second, every other cluster_sector is aligned to a sector, so the last one should be, too. Third, it eases reading with sector-based I/O.

    Signed-off-by: yuchenlin <yuchenlin@synology.com>
    Message-Id: <20180913082952.3675-1-yuchenlin@synology.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
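The alignment described in this commit amounts to rounding the end-of-file offset up to the next sector boundary. The following is a minimal sketch of that arithmetic only, not the code from the patch; the 512-byte SECTOR_SIZE and both function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sector size for this sketch. */
#define SECTOR_SIZE 512u

/* Round a byte offset up to the next multiple of SECTOR_SIZE. */
static uint64_t align_up_to_sector(uint64_t offset)
{
    /* Guard against wrap-around in the addition below. */
    assert(offset <= UINT64_MAX - (SECTOR_SIZE - 1));
    return (offset + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Number of padding bytes needed so the file ends on a sector boundary. */
static uint64_t sector_padding(uint64_t file_length)
{
    return align_up_to_sector(file_length) - file_length;
}
```

For a file that ends at byte 1000, sector_padding() yields 24, so the padded file ends at byte 1024.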
| * tests/vm: Use -cpu max rather than -cpu host | Peter Maydell | 2018-09-26 | 1 | -2/+1
    -cpu max works with any accelerator, so we don't need to use it only conditionally if not using KVM. Just use it all the time.

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    Message-Id: <20180820155554.23476-1-peter.maydell@linaro.org>
    Signed-off-by: Fam Zheng <famz@redhat.com>
| * aio-posix: do skip system call if ctx->notifier polling succeeds | Paolo Bonzini | 2018-09-26 | 1 | -3/+4
    Commit 70232b5253 ("aio-posix: Don't count ctx->notifier as progress when polling", 2018-08-15), by not reporting progress, causes aio_poll to execute the system call when polling succeeds because of ctx->notifier. This introduces latency before the call to aio_bh_poll() and negates the advantages of polling, unfortunately.

    The fix builds on the previous patch, separating the effect of polling on the timeout from the progress reported to aio_poll(). ctx->notifier does zero the timeout, causing the caller to skip the system call, but it does not report progress, so that the bug fix of commit 70232b5253 still stands.

    Fixes: 70232b5253a3c4e03ed1ac47ef9246a8ac66c6fa
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-Id: <20180912171040.1732-4-pbonzini@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
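The separation between "what happens to the timeout" and "what is reported as progress" can be sketched as below. This is an illustrative model under stated assumptions, not the aio-posix code: struct poll_result, finish_polling() and the convention that a timeout of -1 means "block indefinitely" are all hypothetical names and choices for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of one polling pass. */
struct poll_result {
    bool handler_progress;  /* a real handler did some work */
    bool notifier_fired;    /* only the context's notifier was seen */
};

/*
 * Adjust *timeout_ns after polling and return the progress to report.
 * Any polling success zeroes the timeout (the caller can skip the blocking
 * system call), but only handler work counts as progress. When polling
 * found nothing, a finite timeout is reduced by the time spent polling.
 */
static bool finish_polling(struct poll_result r, int64_t elapsed_ns,
                           int64_t *timeout_ns)
{
    assert(elapsed_ns >= 0);
    if (r.handler_progress || r.notifier_fired) {
        *timeout_ns = 0;
    } else if (*timeout_ns > 0) {
        *timeout_ns -= elapsed_ns < *timeout_ns ? elapsed_ns : *timeout_ns;
    }
    return r.handler_progress;
}
```

With this split, a notifier-only wakeup returns false (no progress) while still leaving the timeout at zero.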
| * aio-posix: compute timeout before polling | Paolo Bonzini | 2018-09-26 | 2 | -27/+36
    This is a preparation for the next patch, and also a very small optimization. Compute the timeout only once, before invoking try_poll_mode, and adjust it in run_poll_handlers. The adjustment is the polling time when polling fails, or zero (non-blocking) if polling succeeds.

    Fixes: 70232b5253a3c4e03ed1ac47ef9246a8ac66c6fa
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-Id: <20180912171040.1732-3-pbonzini@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
| * aio-posix: fix concurrent access to poll_disable_cnt | Paolo Bonzini | 2018-09-26 | 1 | -11/+15
    It is valid for an aio_set_fd_handler to happen concurrently with aio_poll. In that case, poll_disable_cnt can change under the heels of aio_poll, and the assertion on poll_disable_cnt can fail in run_poll_handlers.

    Therefore, this patch simply checks the counter on every polling iteration. There are no particular needs for ordering, since the polling loop is terminated anyway by aio_notify at the end of aio_set_fd_handler.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-Id: <20180912171040.1732-2-pbonzini@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
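Checking the counter on every iteration, instead of asserting on it once, might look roughly like the sketch below. It uses C11 atomics with relaxed ordering, matching the commit's remark that no particular ordering is needed. struct poll_ctx, run_poll_loop() and the demo callback are hypothetical names for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical context: only the field relevant to this sketch. */
struct poll_ctx {
    atomic_int poll_disable_cnt;  /* > 0 means polling must stop */
};

/*
 * Run up to max_iters polling passes and return how many were executed.
 * The counter is re-read before each pass, so a concurrent change ends
 * the loop instead of tripping an assertion.
 */
static int run_poll_loop(struct poll_ctx *ctx, int max_iters,
                         bool (*poll_once)(struct poll_ctx *))
{
    int iters = 0;

    assert(max_iters >= 0);
    while (iters < max_iters &&
           atomic_load_explicit(&ctx->poll_disable_cnt,
                                memory_order_relaxed) == 0) {
        iters++;
        if (poll_once(ctx)) {
            break;  /* progress was made */
        }
    }
    return iters;
}

/* Demo callback: simulates a handler change that disables polling. */
static bool disable_after_first_pass(struct poll_ctx *ctx)
{
    atomic_fetch_add_explicit(&ctx->poll_disable_cnt, 1,
                              memory_order_relaxed);
    return false;
}
```

When the demo callback bumps the counter during the first pass, the loop stops after exactly one iteration.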
* | Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging | Peter Maydell | 2018-09-28 | 6 | -1453/+1589
|\ \
    - some fixes for setrlimit() and write()
    - fixes ELF loader when host page size is greater than target page size
    - add SO_LINGER to getsockopt()/setsockopt()
    - move TargetFdTrans from syscall.c

    v2: add "#include <linux/netlink.h>" in linux-user/fd-trans.c

    # gpg: Signature made Tue 25 Sep 2018 21:51:13 BST
    # gpg:                using RSA key F30C38BD3F2FBE3C
    # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
    # gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
    # gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
    # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C

    * remotes/vivier2/tags/linux-user-for-3.1-pull-request:
      linux-user: do setrlimit selectively
      linux-user: write(fd, NULL, 0) parity with linux's treatment of same
      linux-user: elf: mmap all the target-pages of hostpage for data segment
      linux-user: add SO_LINGER to {g,s}etsockopt
      linux-user: move TargetFdTrans functions to their own file

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | linux-user: do setrlimit selectively | Max Filippov | 2018-09-25 | 1 | -1/+15
    setrlimit guest calls that affect memory resources (RLIMIT_{AS,DATA,STACK}) may interfere with QEMU internal memory management. They may result in QEMU lockup because mprotect call in page_unprotect would fail with ENOMEM error code, causing infinite loop of SIGSEGV. E.g. it happens when running libstdc++ testsuite for xtensa target on x86_64 host.

    Don't call host setrlimit for memory-related resources.

    Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
    Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
    Message-Id: <20180917181314.22551-1-jcmvbkbc@gmail.com>
    [lv: rebase on master]
    Signed-off-by: Laurent Vivier <laurent@vivier.eu>
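A minimal sketch of the "selective" forwarding this commit describes: memory-related limits are not passed to the host, everything else is. The function names are hypothetical, and returning 0 for the skipped resources is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* True for the resources the commit lists as memory-related. */
static bool is_memory_rlimit(int resource)
{
    switch (resource) {
    case RLIMIT_AS:
    case RLIMIT_DATA:
    case RLIMIT_STACK:
        return true;
    default:
        return false;
    }
}

/*
 * Forward a guest setrlimit request to the host only when it cannot
 * interfere with the emulator's own memory management.
 */
static int emulated_setrlimit(int resource, const struct rlimit *rlim)
{
    assert(rlim != NULL);
    if (is_memory_rlimit(resource)) {
        return 0;  /* assumed: report success, leave host limits alone */
    }
    return setrlimit(resource, rlim);
}
```

A guest request for RLIMIT_STACK therefore never reaches the host, while a request such as RLIMIT_NOFILE is passed through unchanged.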
| * | linux-user: write(fd, NULL, 0) parity with linux's treatment of same | Tony Garnock-Jones | 2018-09-25 | 1 | -0/+3
    Bring linux-user write(2) handling into line with linux for the case of a 0-byte write with a NULL buffer. Based on a patch originally written by Zhuowei Zhang.

    Addresses https://bugs.launchpad.net/qemu/+bug/1716292.

    From Zhuowei Zhang's patch (https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg08073.html):

        Linux returns success for the special case of calling write with a zero-length NULL buffer: compiling and running

            int main() {
                ssize_t ret = write(STDOUT_FILENO, NULL, 0);
                fprintf(stderr, "write returned %ld\n", ret);
                return 0;
            }

        gives "write returned 0" when run directly, but "write returned -1" in QEMU.

        This commit checks for this situation and returns success if found.

    Subsequent discussion raised the following questions (and my answers):

    - Q. Should TARGET_NR_read pass through to safe_read in this situation too?
      A. I'm wary of changing unrelated code to the specific problem I'm addressing. TARGET_NR_read is already consistent with Linux for this case.

    - Q. Do pread64/pwrite64 need to be changed similarly?
      A. Experiment suggests not: both linux and linux-user yield -1 for NULL 0-length reads/writes.

    Signed-off-by: Tony Garnock-Jones <tonygarnockjones@gmail.com>
    Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
    Reviewed-by: Laurent Vivier <laurent@vivier.eu>
    Message-Id: <20180908182205.GB409@mornington.dcs.gla.ac.uk>
    Signed-off-by: Laurent Vivier <laurent@vivier.eu>
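The special case can be illustrated with a small wrapper that treats an untranslatable (NULL) buffer as a fault unless the length is zero, in which case the call is passed through to the host. This is a sketch under assumptions, not the syscall.c change itself; emulated_write() is a hypothetical name, and the NULL host pointer stands in for a guest buffer that could not be mapped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical emulation-side write: host_buf == NULL models a guest
 * buffer address that could not be translated.
 */
static ssize_t emulated_write(int fd, const void *host_buf, size_t count)
{
    assert(fd >= 0);
    if (host_buf == NULL) {
        if (count == 0) {
            /* Zero-length write with a NULL buffer: let the host decide. */
            return write(fd, NULL, 0);
        }
        errno = EFAULT;
        return -1;
    }
    return write(fd, host_buf, count);
}
```

On a Linux host, a zero-length NULL write to a pipe or terminal is expected to return 0, which matches the behaviour the commit message reports for native runs.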
| * | linux-user: elf: mmap all the target-pages of hostpage for data segment | Shivaprasad G Bhat | 2018-09-25 | 1 | -3/+7
    If the hostpage size is greater than the TARGET_PAGESIZE, the target-pages of size TARGET_PAGESIZE are marked valid only up to the length requested during the elfload. glibc attempts to consume unused space in the last page of the data segment (__libc_memalign() in elf/dl-minimal.c). If PT_LOAD p_align is greater than or equal to the hostpage size, GLRO(dl_pagesize) is actually the host pagesize as set in the auxiliary vectors. So, there is no explicit mmap request for the remaining target-pages on the last hostpage. glibc assumes that space is available, and subsequent attempts to use those addresses lead to a crash because target_mmap has not marked those target-pages valid.

    The issue is seen when trying to chroot to 16.04-x86_64 ubuntu on a PPC64 host, where the fork fails to access the thread_id as it is allocated on a page not marked valid. Recent glibc doesn't have checks for thread-id in fork, but the issue can manifest somewhere else nonetheless.

    The fix here is to map all the target-pages of the hostpage during the elfload if p_align is greater than or equal to the hostpage size, so that glibc can consume the rest of the data segment's last page.

    Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
    Reviewed-by: Laurent Vivier <laurent@vivier.eu>
    Message-Id: <153553435604.51992.5640085189104207249.stgit@lep8c.aus.stglabs.ibm.com>
    Signed-off-by: Laurent Vivier <laurent@vivier.eu>
| * | linux-user: add SO_LINGER to {g,s}etsockopt | Carlo Marcelo Arenas Belón | 2018-09-25 | 2 | -1/+56
    Original implementation for setsockopt by Chen Gang[1]; all bugs mine, including removing assignment for optname which hopefully makes the logic easier to follow and moving some variables to make the code more self-contained.

    [1] http://patchwork.ozlabs.org/patch/565659/

    Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
    Co-Authored-By: Chen Gang <gang.chen.5i5j@gmail.com>
    Reviewed-by: Laurent Vivier <laurent@vivier.eu>
    Message-Id: <20180824085601.6259-1-carenas@gmail.com>
    Signed-off-by: Laurent Vivier <laurent@vivier.eu>
| * | linux-user: move TargetFdTrans functions to their own file | Laurent Vivier | 2018-09-25 | 4 | -1448/+1508
    This will make it easier to move syscall functions out of syscall.c.

    Signed-off-by: Laurent Vivier <laurent@vivier.eu>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20180823222215.13781-1-laurent@vivier.eu>
* | | Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2018-09-25' into staging | Peter Maydell | 2018-09-25 | 19 | -16/+56
|\ \ \
| |_|/
|/| |
    - Deprecate the usage of a network backend via "name" instead of "id"
    - Deprecate the "enforce-config-section" machine parameter
    - Re-enable the wdt_ib700, endianness and vmxnet3 qtests
    - Some trivial fixes and doc update patches that crossed my way

    # gpg: Signature made Tue 25 Sep 2018 16:58:42 BST
    # gpg:                using RSA key 2ED9D774FE702DB5
    # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>"
    # gpg:                 aka "Thomas Huth <thuth@redhat.com>"
    # gpg:                 aka "Thomas Huth <huth@tuxfamily.org>"
    # gpg:                 aka "Thomas Huth <th.huth@posteo.de>"
    # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5

    * remotes/huth-gitlab/tags/pull-request-2018-09-25:
      Revert "check: Move VMXNET3 test to common"
      Revert "check: Move endianess test to common"
      Revert "check: Move wdt_ib700 test to common"
      tests/migration: Speed up the test on ppc64
      hw/qdev-core: Fix description of instance_init
      qdev: fix a typo in comment
      docs: Fix some typos (most found by codespell)
      trivial: Make bios files and source files non-executable
      memfd: fix possible usage of the uninitialized file descriptor
      hw/core/machine: Officially deprecate the enforce-config-section parameter
      net/slirp: Deprecate the [hub_id name] parameter tuple
      net: Deprecate the "name" parameter of -net
      Makefile: Add missing dependency for qemu-deprecated.texi

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | Revert "check: Move VMXNET3 test to common" | Thomas Huth | 2018-09-25 | 1 | -2/+2
    This reverts commit 7a066770f53c198014add869696427f81d67e9c2.

    The patch did not work as expected: The vmxnet3 test is currently not run at all anymore.

    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | Revert "check: Move endianess test to common" | Thomas Huth | 2018-09-25 | 1 | -1/+13
    This reverts commit 669cc7100065c690cb7b4f3da5cfc471d1ed4740.

    The patch did not work as expected: The endianness test is currently not run at all anymore.

    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | Revert "check: Move wdt_ib700 test to common" | Thomas Huth | 2018-09-25 | 1 | -2/+2
    This reverts commit ee1f6c812b3240420dff07a3860060b7d4abfe09.

    The patch did not work as expected: The wdt_ib700 test is currently not run at all anymore.

    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | tests/migration: Speed up the test on ppc64 | Thomas Huth | 2018-09-25 | 1 | -3/+3
    The SLOF boot process is always quite slow ... but we can speed it up a little bit by specifying "-nodefaults" and by using the "nvramrc" variable instead of "boot-command" (since "nvramrc" is evaluated earlier in the SLOF boot process than "boot-command").

    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Laurent Vivier <lvivier@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | hw/qdev-core: Fix description of instance_init | Thomas Huth | 2018-09-25 | 1 | -2/+3
    The part of the documentation of DeviceClass that talks about instance_init is partly wrong: instance_init() functions must not abort or exit, since the function is also called during introspection of the device already. So if a device calls exit() during its instance_init() function, QEMU terminates unexpectedly if somebody tries to just have a look at the interfaces from the device with "device_add xyz,help" or with the "device-list-properties" QOM command. This should never happen.

    Reviewed-by: Andreas Färber <afaerber@suse.de>
    Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
    Reviewed-by: Markus Armbruster <armbru@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | qdev: fix a typo in comment | Li Qiang | 2018-09-25 | 1 | -1/+1
    Found by reading code.

    Signed-off-by: Li Qiang <liq3ea@gmail.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | docs: Fix some typos (most found by codespell) | Stefan Weil | 2018-09-25 | 3 | -4/+4
    Signed-off-by: Stefan Weil <sw@weilnetz.de>
    Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | trivial: Make bios files and source files non-executable | Thomas Huth | 2018-09-25 | 6 | -0/+0
    These files cannot be executed on the host, so they should not be marked as executable.

    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | memfd: fix possible usage of the uninitialized file descriptor | Dima Stepanov | 2018-09-25 | 1 | -0/+1
    The qemu_memfd_alloc_check() routine allocates the fd variable on the stack. This variable is initialized inside the qemu_memfd_alloc() function. There are several cases in which *fd is left uninitialized, which can lead to an unexpected close() in the qemu_memfd_free() call.

    Set the file descriptor to -1 before calling the qemu_memfd_alloc routine.

    Signed-off-by: Dima Stepanov <dimastep@yandex-team.ru>
    Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    Reviewed-by: Thomas Huth <thuth@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
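The pattern behind this fix can be sketched as follows: the caller sets the descriptor to a known invalid value before an allocator that may return early without writing it, so the cleanup path never closes a garbage value. alloc_region(), free_region() and alloc_check() are hypothetical stand-ins for this example, not the memfd functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical allocator: on its failure paths it returns before
 * storing anything in *fd.
 */
static bool alloc_region(bool simulate_failure, int *fd)
{
    int new_fd;

    if (simulate_failure) {
        return false;  /* *fd left untouched */
    }
    new_fd = dup(STDERR_FILENO);
    if (new_fd < 0) {
        return false;
    }
    *fd = new_fd;
    return true;
}

/* Close the descriptor if one was allocated; report whether it did. */
static bool free_region(int fd)
{
    if (fd == -1) {
        return false;
    }
    close(fd);
    return true;
}

/* Caller: initialize fd to -1 so cleanup is safe on every path. */
static bool alloc_check(bool simulate_failure)
{
    int fd = -1;
    bool ok = alloc_region(simulate_failure, &fd);
    bool closed = free_region(fd);

    assert(ok == closed);  /* only a successful allocation gets closed */
    return ok;
}
```

Without the `fd = -1` initialization, the failure path would hand an indeterminate value to close().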
| * | hw/core/machine: Officially deprecate the enforce-config-section parameter | Thomas Huth | 2018-09-25 | 2 | -0/+8
    Commit 16f7244842b5135543ef068a1adafd94c6965953 added this parameter to the documentation, including a note that it is deprecated. But it has never been added to the "Deprecated features" appendix, which is our official way to deprecate legacy parameters. So let's do this now.

    Reviewed-by: Peter Xu <peterx@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | net/slirp: Deprecate the [hub_id name] parameter tuple | Thomas Huth | 2018-09-25 | 2 | -0/+9
    The "name" in the [hub_id name] parameter tuple is the same as a "netdev_id" (which should be unique), so specifying the hub_id here is just redundant (it was likely just necessary in the past when the network subsystem was still using "vlans" only and when it did not use unique "id"s yet).

    Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
    Reviewed-by: Markus Armbruster <armbru@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | net: Deprecate the "name" parameter of -net | Thomas Huth | 2018-09-25 | 2 | -0/+9
    In early times, network backends were specified by a "vlan" and "name" tuple. With the introduction of netdevs, the "name" was replaced by an "id" (which is supposed to be unique), but the "name" parameter stayed as an alias which could be used instead of "id". Unfortunately, we miss the duplication check for "name":

        $ qemu-system-x86_64 -net user,name=n1 -net user,name=n1

    ... starts without an error, while "id" correctly complains:

        $ qemu-system-x86_64 -net user,id=n1 -net user,id=n1
        qemu-system-x86_64: -net user,id=n1: Duplicate ID 'n1' for net

    Instead of trying to fix the code for the legacy "name" parameter, let's rather get rid of this old interface and deprecate the "name" parameter now - this will also be less confusing for the users in the long run.

    Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
    Reviewed-by: Markus Armbruster <armbru@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
| * | Makefile: Add missing dependency for qemu-deprecated.texi | Thomas Huth | 2018-09-25 | 1 | -1/+1
    Make sure that the docs get correctly regenerated when the file qemu-deprecated.texi has been changed.

    Fixes: 44c67847e32c91a6071fb0440c357b9489f08bc6
    Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
    Reviewed-by: Markus Armbruster <armbru@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>
    (cherry picked from commit f99ce85279178385f204a52236f855c879c29cdc)
* | | Merge remote-tracking branch 'remotes/xanclic/tags/pull-block-2018-09-25' into staging | Peter Maydell | 2018-09-25 | 29 | -317/+856
|\ \ \
    Block layer patches:
    - Drain fixes
    - node-name parameters for block-commit
    - Refactor block jobs to use transactional callbacks for exiting

    # gpg: Signature made Tue 25 Sep 2018 16:12:44 BST
    # gpg:                using RSA key F407DB0061D5CF40
    # gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
    # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40

    * remotes/xanclic/tags/pull-block-2018-09-25: (42 commits)
      test-bdrv-drain: Test draining job source child and parent
      block: Use a single global AioWait
      test-bdrv-drain: Fix outdated comments
      test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort
      job: Avoid deadlocks in job_completed_txn_abort()
      test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level()
      block: Remove aio_poll() in bdrv_drain_poll variants
      blockjob: Lie better in child_job_drained_poll()
      block-backend: Decrease in_flight only after callback
      block-backend: Fix potential double blk_delete()
      block-backend: Add .drained_poll callback
      block: Add missing locking in bdrv_co_drain_bh_cb()
      test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback
      job: Use AIO_WAIT_WHILE() in job_finish_sync()
      test-blockjob: Acquire AioContext around job_cancel_sync()
      test-bdrv-drain: Drain with block jobs in an I/O thread
      aio-wait: Increase num_waiters even in home thread
      blockjob: Wake up BDS when job becomes idle
      job: Fix missing locking due to mismerge
      job: Fix nested aio_poll() hanging in job_txn_apply
      ...

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * \ \ Merge remote-tracking branch 'kevin/tags/for-upstream' into block | Max Reitz | 2018-09-25 | 24 | -114/+527
| |\ \ \
    Block layer patches:

    - Fix some jobs/drain/aio_poll related hangs
    - commit: Add top-node/base-node options
    - linux-aio: Fix locking for qemu_laio_process_completions()
    - Fix use after free error in bdrv_open_inherit

    # gpg: Signature made Tue Sep 25 15:54:01 2018 CEST
    # gpg:                using RSA key 7F09B272C88F2FD6
    # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
    # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6

    * kevin/tags/for-upstream: (26 commits)
      test-bdrv-drain: Test draining job source child and parent
      block: Use a single global AioWait
      test-bdrv-drain: Fix outdated comments
      test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort
      job: Avoid deadlocks in job_completed_txn_abort()
      test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level()
      block: Remove aio_poll() in bdrv_drain_poll variants
      blockjob: Lie better in child_job_drained_poll()
      block-backend: Decrease in_flight only after callback
      block-backend: Fix potential double blk_delete()
      block-backend: Add .drained_poll callback
      block: Add missing locking in bdrv_co_drain_bh_cb()
      test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback
      job: Use AIO_WAIT_WHILE() in job_finish_sync()
      test-blockjob: Acquire AioContext around job_cancel_sync()
      test-bdrv-drain: Drain with block jobs in an I/O thread
      aio-wait: Increase num_waiters even in home thread
      blockjob: Wake up BDS when job becomes idle
      job: Fix missing locking due to mismerge
      job: Fix nested aio_poll() hanging in job_txn_apply
      ...

    Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | test-bdrv-drain: Test draining job source child and parent | Kevin Wolf | 2018-09-25 | 1 | -8/+69
    For the block job drain test, don't only test draining the source and the target node, but create a backing chain for the source (source_backing <- source <- source_overlay) and test draining each of the nodes in it.

    When using iothreads, the source node (and therefore the job) is in a different AioContext than the drain, which happens from the main thread. This way, the main thread waits in AIO_WAIT_WHILE() for the iothread to make progress, and aio_wait_kick() is required to notify it. The test validates that calling bdrv_wakeup() for a child or a parent node will actually notify AIO_WAIT_WHILE() instead of letting it hang.

    Increase the sleep time a bit (to 1 ms) because the test case is racy and with the shorter sleep, it didn't reproduce the bug it is supposed to test for me under 'rr record -n'.

    This was because bdrv_drain_invoke_entry() (in the main thread) was only called after the job had already reached the pause point, so we got a bdrv_dec_in_flight() from the main thread and the additional aio_wait_kick() when the job becomes idle (that we really wanted to test here) wasn't even necessary any more to make progress.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | block: Use a single global AioWait | Kevin Wolf | 2018-09-25 | 10 | -65/+26
    When draining a block node, we recurse to its parent and for subtree drains also to its children. A single AIO_WAIT_WHILE() is then used to wait for bdrv_drain_poll() to become true, which depends on all of the nodes we recursed to. However, if the respective child or parent becomes quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait of the original node.

    Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE().

    This may mean that the draining thread gets a few more unnecessary wakeups because an unrelated operation got completed, but we already wake it up when something _could_ have changed rather than only if it has certainly changed.

    Apart from that, drain is a slow path anyway. In theory it would be possible to use wakeups more selectively and still correctly, but the gains are likely not worth the additional complexity. In fact, this patch is a nice simplification for some places in the code.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | test-bdrv-drain: Fix outdated comments | Kevin Wolf | 2018-09-25 | 1 | -5/+5
    Commit 89bd030533e changed the test case from using job_sleep_ns() to using qemu_co_sleep_ns() instead. Also, block_job_sleep_ns() became job_sleep_ns() in commit 5d43e86e11f.

    In both cases, some comments in the test case were not updated. Do that now.

    Reported-by: Max Reitz <mreitz@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
| | * | | test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort | Kevin Wolf | 2018-09-25 | 1 | -12/+104
    This adds tests for calling AIO_WAIT_WHILE() in the .commit and .abort callbacks. Both reasons why .abort could be called for a single job are tested: Either .run or .prepare could return an error.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | job: Avoid deadlocks in job_completed_txn_abort() | Kevin Wolf | 2018-09-25 | 1 | -5/+11
    Amongst others, job_finalize_single() calls the .prepare/.commit/.abort callbacks of the individual job driver. Recently, their use was adapted for all block jobs so that they involve code calling AIO_WAIT_WHILE() now. Such code must be called under the AioContext lock for the respective job, but without holding any other AioContext lock.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level() | Kevin Wolf | 2018-09-25 | 1 | -0/+13
    This is a regression test for a deadlock that could occur in callbacks called from the aio_poll() in bdrv_drain_poll_top_level(). The AioContext lock wasn't released and therefore would be taken a second time in the callback. This would cause a possible AIO_WAIT_WHILE() in the callback to hang.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
| | * | | block: Remove aio_poll() in bdrv_drain_poll variants | Kevin Wolf | 2018-09-25 | 1 | -8/+0
    bdrv_drain_poll_top_level() was buggy because it didn't release the AioContext lock of the node to be drained before calling aio_poll(). This way, callbacks called by aio_poll() would possibly take the lock a second time and run into a deadlock with a nested AIO_WAIT_WHILE() call.

    However, it turns out that the aio_poll() call isn't actually needed any more. It was introduced in commit 91af091f923, which is effectively reverted by this patch. The cases it was supposed to fix are now covered by bdrv_drain_poll(), which waits for block jobs to reach a quiescent state.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | blockjob: Lie better in child_job_drained_poll() | Kevin Wolf | 2018-09-25 | 3 | -2/+14
    Block jobs claim in .drained_poll() that they are in a quiescent state as soon as job->deferred_to_main_loop is true. This is obviously wrong, they still have a completion BH to run. We only get away with this because commit 91af091f923 added an unconditional aio_poll(false) to the drain functions, but this is bypassing the regular drain mechanisms.

    However, just removing this and telling that the job is still active doesn't work either: The completion callbacks themselves call drain functions (directly, or indirectly with bdrv_reopen), so they would deadlock then.

    As a better lie, tell that the job is active as long as the BH is pending, but falsely call it quiescent from the point in the BH when the completion callback is called. At this point, nested drain calls won't deadlock because they ignore the job, and outer drains will wait for the job to really reach a quiescent state because the callback is already running.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | block-backend: Decrease in_flight only after callback | Kevin Wolf | 2018-09-25 | 1 | -1/+1
    Request callbacks can do pretty much anything, including operations that will yield from the coroutine (such as draining the backend). In that case, a decreased in_flight would be visible to other code and could lead to a drain completing while the callback hasn't actually completed yet.

    Note that reordering these operations forbids calling drain directly inside an AIO callback. As Paolo explains, indirectly calling it is okay:

    - Calling it through a coroutine is okay, because then bdrv_drained_begin() goes through bdrv_co_yield_to_drain() and you have in_flight=2 when bdrv_co_yield_to_drain() yields, then soon in_flight=1 when the aio_co_wake() in the AIO callback completes, then in_flight=0 after the bottom half starts.

    - Calling it through a bottom half would be okay too, as long as the AIO callback remembers to do inc_in_flight/dec_in_flight just like bdrv_co_yield_to_drain() and bdrv_co_drain_bh_cb() do

    A few more important cases that come to mind:

    - A coroutine that yields because of I/O is okay, with a sequence similar to bdrv_co_yield_to_drain().

    - A coroutine that yields with no I/O pending will correctly decrease in_flight to zero before yielding.

    - Calling more AIO from the callback won't overflow the counter just because of mutual recursion, because AIO functions always yield at least once before invoking the callback.

    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Reviewed-by: Max Reitz <mreitz@redhat.com>
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | block-backend: Fix potential double blk_delete()  Kevin Wolf  2018-09-25  1  -1/+8
| | | | | blk_unref() first decreases the refcount of the BlockBackend and calls blk_delete() if the refcount reaches zero. Requests can still be in flight at this point, they are only drained during blk_delete():
| | | | |
| | | | | At this point, arbitrary callbacks can run. If any callback takes a temporary BlockBackend reference, it will first increase the refcount to 1 and then decrease it to 0 again, triggering another blk_delete(). This will cause a use-after-free crash in the outer blk_delete().
| | | | |
| | | | | Fix it by draining the BlockBackend before decreasing the refcount to 0. Assert in blk_ref() that it never takes the first refcount (which would mean that the BlockBackend is already being deleted).
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
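A minimal sketch of the "drain before dropping the last reference" idea, using a toy refcounted object rather than the real BlockBackend. All `toy_*` names and the `ToyBlk` struct are hypothetical; the sketch only counts how often the delete path runs instead of freeing memory.

```c
#include <assert.h>

/* Hypothetical toy object; not QEMU's BlockBackend. */
typedef struct ToyBlk {
    int refcnt;
    int pending;       /* requests whose callbacks have not run yet */
    int delete_calls;  /* how many times toy_delete() ran */
} ToyBlk;

static void toy_unref(ToyBlk *blk);

static void toy_ref(ToyBlk *blk)
{
    /* Never take the first reference of an object that is being deleted. */
    assert(blk->refcnt > 0);
    blk->refcnt++;
}

/* A completion callback that briefly holds its own reference. */
static void toy_callback(ToyBlk *blk)
{
    toy_ref(blk);
    toy_unref(blk);
}

/* Runs the callbacks of all pending requests. */
static void toy_drain(ToyBlk *blk)
{
    while (blk->pending > 0) {
        blk->pending--;
        toy_callback(blk);
    }
}

static void toy_delete(ToyBlk *blk)
{
    assert(blk->refcnt == 0);
    assert(blk->pending == 0);  /* the caller drained already */
    blk->delete_calls++;
}

static void toy_unref(ToyBlk *blk)
{
    assert(blk->refcnt > 0);
    if (blk->refcnt == 1) {
        /* Drain while the last reference is still held, so callbacks
         * that take a temporary reference never see a count of zero. */
        toy_drain(blk);
        assert(blk->refcnt == 1);
    }
    if (--blk->refcnt == 0) {
        toy_delete(blk);
    }
}
```

Dropping the only reference of an object with two pending requests runs both callbacks with the count at 1 and reaches the delete path exactly once.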
| | * | | block-backend: Add .drained_poll callback  Kevin Wolf  2018-09-25  1  -0/+9
| | | | | A bdrv_drain operation must ensure that all parents are quiesced, this includes BlockBackends. Otherwise, callbacks called by requests that are completed on the BDS layer, but not quite yet on the BlockBackend layer could still create new requests.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | block: Add missing locking in bdrv_co_drain_bh_cb()  Kevin Wolf  2018-09-25  3  -0/+25
| | | | | bdrv_do_drained_begin/end() assume that they are called with the AioContext lock of bs held. If we call drain functions from a coroutine with the AioContext lock held, we yield and schedule a BH to move out of coroutine context. This means that the lock for the home context of the coroutine is released and must be re-acquired in the bottom half.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback  Kevin Wolf  2018-09-25  1  -0/+10
| | | | | This is a regression test for a deadlock that occurred in block job completion callbacks (via job_defer_to_main_loop) because the AioContext lock was taken twice: once in job_finish_sync() and then again in job_defer_to_main_loop_bh(). This would cause AIO_WAIT_WHILE() to hang.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | * | | job: Use AIO_WAIT_WHILE() in job_finish_sync()  Kevin Wolf  2018-09-25  1  -8/+6
| | | | | job_finish_sync() needs to release the AioContext lock of the job before calling aio_poll(). Otherwise, callbacks called by aio_poll() would possibly take the lock a second time and run into a deadlock with a nested AIO_WAIT_WHILE() call.
| | | | |
| | | | | Also, job_drain() without aio_poll() isn't necessarily enough to make progress on a job, it could depend on bottom halves to be executed.
| | | | |
| | | | | Combine both open-coded while loops into a single AIO_WAIT_WHILE() call that solves both of these problems.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | test-blockjob: Acquire AioContext around job_cancel_sync()  Kevin Wolf  2018-09-25  2  -0/+12
| | | | | All callers in QEMU proper hold the AioContext lock when calling job_finish_sync(). test-blockjob should do the same when it calls the function indirectly through job_cancel_sync().
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | * | | test-bdrv-drain: Drain with block jobs in an I/O thread  Kevin Wolf  2018-09-25  1  -6/+86
| | | | | This extends the existing drain test with a block job to include variants where the block job runs in a different AioContext.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | * | | aio-wait: Increase num_waiters even in home thread  Kevin Wolf  2018-09-25  1  -3/+3
| | | | | Even if AIO_WAIT_WHILE() is called in the home context of the AioContext, we still want to allow the condition to change depending on other threads as long as they kick the AioWait. Specifically, block jobs can be running in an I/O thread and should then be able to kick a drain in the main loop context.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | * | | blockjob: Wake up BDS when job becomes idle  Kevin Wolf  2018-09-25  4  -0/+41
| | | | | In the context of draining a BDS, the .drained_poll callback of block jobs is called. If this returns true (i.e. there is still some activity pending), the drain operation may call aio_poll() with blocking=true to wait for completion.
| | | | |
| | | | | As soon as the pending activity is completed and the job finally arrives in a quiescent state (i.e. its coroutine either yields with busy=false or terminates), the block job must notify the aio_poll() loop to wake up, otherwise we get a deadlock if both are running in different threads.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Fam Zheng <famz@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
| | * | | job: Fix missing locking due to mismerge  Kevin Wolf  2018-09-25  1  -0/+4
| | | | | job_completed() had a problem with double locking that was recently fixed independently by two different commits:
| | | | |
| | | | | "job: Fix nested aio_poll() hanging in job_txn_apply"
| | | | | "jobs: add exit shim"
| | | | |
| | | | | One fix removed the first aio_context_acquire(), the other fix removed the other one. Now we have a bug again and the code is run without any locking.
| | | | |
| | | | | Add it back in one of the places.
| | | | |
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | | | | Reviewed-by: Max Reitz <mreitz@redhat.com>
| | | | | Reviewed-by: John Snow <jsnow@redhat.com>
| | * | | job: Fix nested aio_poll() hanging in job_txn_apply  Fam Zheng  2018-09-25  1  -13/+5
| | | | | All callers have acquired ctx already. Doing that again results in aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in the callback cannot make progress because ctx is recursively locked, for example, when drive-backup finishes.
| | | | |
| | | | | There are two callers of job_finalize():
| | | | |
| | | | |     fam@lemon:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize'
| | | | |     blockdev.c:    job_finalize(&job->job, errp);
| | | | |     blockdev.c-    aio_context_release(aio_context);
| | | | |     --
| | | | |     job-qmp.c:    job_finalize(job, errp);
| | | | |     job-qmp.c-    aio_context_release(aio_context);
| | | | |     --
| | | | |     tests/test-blockjob.c:    job_finalize(&job->job, &error_abort);
| | | | |     tests/test-blockjob.c-    assert(job->job.status == JOB_STATUS_CONCLUDED);
| | | | |
| | | | | Ignoring the test, it's easy to see both callers to job_finalize (and job_do_finalize) have acquired the context.
| | | | |
| | | | | Cc: qemu-stable@nongnu.org
| | | | | Reported-by: Gu Nini <ngu@redhat.com>
| | | | | Reviewed-by: Eric Blake <eblake@redhat.com>
| | | | | Signed-off-by: Fam Zheng <famz@redhat.com>
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | * | | util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cb  Sergio Lopez  2018-09-25  1  -1/+1
| | | | | AIO Coroutines shouldn't be managed by an AioContext different than the one assigned when they are created. aio_co_enter avoids entering a coroutine from a different AioContext, calling aio_co_schedule instead.
| | | | |
| | | | | Scheduled coroutines are then entered by co_schedule_bh_cb using qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the current AioContext obtained with qemu_get_current_aio_context. Eventually, co->ctx will be set to the AioContext passed as an argument to qemu_aio_coroutine_enter.
| | | | |
| | | | | This means that, if an IO Thread's AioContext is being processed by the Main Thread (due to aio_poll being called with a BDS AioContext, as it happens in AIO_WAIT_WHILE among other places), the AioContext from some coroutines may be wrongly replaced with the one from the Main Thread.
| | | | |
| | | | | This is the root cause behind some crashes, mainly triggered by the drain code at block/io.c. The most common are these abort and failed assertion:
| | | | |
| | | | |     util/async.c:aio_co_schedule
| | | | |     456     if (scheduled) {
| | | | |     457         fprintf(stderr,
| | | | |     458                 "%s: Co-routine was already scheduled in '%s'\n",
| | | | |     459                 __func__, scheduled);
| | | | |     460         abort();
| | | | |     461     }
| | | | |
| | | | |     util/qemu-coroutine-lock.c:
| | | | |     286     assert(mutex->holder == self);
| | | | |
| | | | | But it's also known to cause random errors at different locations, and even SIGSEGV with broken coroutine backtraces.
| | | | |
| | | | | By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can pass the correct AioContext as an argument, making sure co->ctx is not wrongly altered.
| | | | |
| | | | | Signed-off-by: Sergio Lopez <slp@redhat.com>
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
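The context mix-up described in this commit message can be sketched with a toy model. None of this is QEMU code: `ToyCtx`, `ToyCo` and the `toy_*` functions are hypothetical, and the global `toy_current_ctx` merely stands in for whatever qemu_get_current_aio_context would return in the thread doing the polling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; not QEMU's AioContext or Coroutine. */
typedef struct ToyCtx {
    const char *name;
} ToyCtx;

typedef struct ToyCo {
    ToyCtx *ctx;   /* context the coroutine is considered to belong to */
    int entered;   /* how many times the coroutine was entered */
} ToyCo;

/* Context of the thread that is currently polling (assumption: a plain
 * global is enough for this single-threaded illustration). */
static ToyCtx *toy_current_ctx = NULL;

/* Entering a coroutine records the context it was entered from. */
static void toy_aio_coroutine_enter(ToyCtx *ctx, ToyCo *co)
{
    co->ctx = ctx;
    co->entered++;
}

/* Convenience wrapper: always uses the polling thread's context. */
static void toy_coroutine_enter(ToyCo *co)
{
    toy_aio_coroutine_enter(toy_current_ctx, co);
}

/* Old behaviour of the bottom half: the owning context is ignored. */
static void toy_schedule_bh_cb_old(ToyCtx *owner, ToyCo *co)
{
    (void)owner;
    toy_coroutine_enter(co);
}

/* Behaviour after the fix: pass the context that owns the bottom half. */
static void toy_schedule_bh_cb_fixed(ToyCtx *owner, ToyCo *co)
{
    toy_aio_coroutine_enter(owner, co);
}
```

When the main thread polls an I/O thread's context, the old variant rewrites the coroutine's context to the main one, while the fixed variant leaves it with its owner.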
| | * | | qemu-iotests: Test snapshot=on with nonexistent TMPDIR  Alberto Garcia  2018-09-25  3  -0/+9
| | | | | We just fixed a bug that was causing a use-after-free when QEMU was unable to create a temporary snapshot. This is a test case for this scenario.
| | | | |
| | | | | Signed-off-by: Alberto Garcia <berto@igalia.com>
| | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>