summary refs log tree commit diff stats
path: root/blockdev.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* block: mark bdrv_inactivate() as GRAPH_RDLOCK and move drain to callersFiona Ebner2025-07-141-28/+45
| | | | | | | | | | | | | | The function bdrv_inactivate() calls bdrv_drain_all_begin(), which needs to be called with the graph unlocked, so either bdrv_inactivate() should be marked as GRAPH_UNLOCKED or the drain needs to be moved to the callers. The caller in qmp_blockdev_set_active() requires that the locked section covers bdrv_find_node() too, so the latter alternative is chosen. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-36-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: drop wrapper for bdrv_set_backing_hd_drained()Fiona Ebner2025-07-141-5/+8
| | | | | | | | | | | | Nearly all callers (outside of the tests) are already using the _drained() variant of the function. It doesn't seem worth keeping. Simply adapt the remaining callers of bdrv_set_backing_hd() and rename bdrv_set_backing_hd_drained() to bdrv_set_backing_hd(). Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-31-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: avoid locking and draining multiple times in external_snapshot_abort()Fiona Ebner2025-07-141-8/+14
| | | | | | | | | | | By using the appropriate variants bdrv_set_backing_hd_drained() and bdrv_try_change_aio_context_locked(), there only needs to be a single drained and write-locked section in external_snapshot_abort(). Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-30-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: add bdrv_graph_wrlock_drained() convenience wrapperFiona Ebner2025-07-141-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Many write-locked sections are also drained sections. A new bdrv_graph_wrunlock_drained() wrapper around bdrv_graph_wrunlock() is introduced, which will begin a drained section first. A global variable is used so bdrv_graph_wrunlock() knows if it also needs to end such a drained section. Both the aio_poll call in bdrv_graph_wrlock() and the aio_bh_poll() in bdrv_graph_wrunlock() can re-enter a write-locked section. While for the latter, ending the drain could be moved to before the call, the former requires that the variable is a counter and not just a boolean. Since the wrapper calls bdrv_drain_all_begin(), which must be called with the graph unlocked, mark the wrapper as GRAPH_UNLOCKED too. The switch to the new helpers was generated with the following commands and then manually checked: find . -name '*.c' -exec sed -i -z 's/bdrv_drain_all_begin();\n\s*bdrv_graph_wrlock();/bdrv_graph_wrlock_drained();/g' {} ';' find . -name '*.c' -exec sed -i -z 's/bdrv_graph_wrunlock();\n\s*bdrv_drain_all_end();/bdrv_graph_wrunlock();/g' {} ';' Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-25-f.ebner@proxmox.com> [kwolf: Removed redundant GRAPH_UNLOCKED] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: drain while unlocked in external_snapshot_action()Fiona Ebner2025-06-041-1/+16
| | | | | | | | | This is in preparation to mark bdrv_drained_begin() as GRAPH_UNLOCKED. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-19-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: drain while unlocked in internal_snapshot_action()Fiona Ebner2025-06-041-2/+17
| | | | | | | | | This is in preparation to mark bdrv_drained_begin() as GRAPH_UNLOCKED. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-18-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: move drain outside of quorum_add_child()Fiona Ebner2025-06-041-0/+2
| | | | | | | | | | | | | | | | | | This is part of resolving the deadlock mentioned in commit "block: move draining out of bdrv_change_aio_context() and mark GRAPH_RDLOCK". The quorum_add_child() callback runs under the graph lock, so it is not allowed to drain. It is only called as the .bdrv_add_child() callback, which is only called in the bdrv_add_child() function, which also runs under the graph lock. The bdrv_add_child() function is called by qmp_x_blockdev_change(), where a drained section is introduced. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-15-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: move drain outside of bdrv_try_change_aio_context()Fiona Ebner2025-06-041-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is part of resolving the deadlock mentioned in commit "block: move draining out of bdrv_change_aio_context() and mark GRAPH_RDLOCK". Convert the function to a _locked() version that has to be called with the graph lock held and add a convenience wrapper that has to be called with the graph unlocked, which drains and takes the lock itself. Since bdrv_try_change_aio_context() is global state code, the wrapper is too. Callers are adapted to use the appropriate variant, depending on whether the caller already holds the lock. In the test_set_aio_context() unit test, prior drains can be removed, because draining already happens inside the new wrapper. Note that bdrv_attach_child_common_abort(), bdrv_attach_child_common() and bdrv_root_unref_child() hold the graph lock and are not actually allowed to drain either. This will be addressed in the following commits. Functions like qmp_blockdev_mirror() query the nodes to act on before draining and locking. In theory, draining could invalidate those nodes. This kind of issue is not addressed by these commits. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20250530151125.955508-10-f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/snapshot: move drain outside of read-locked bdrv_snapshot_delete()Fiona Ebner2025-06-041-8/+17
| | | | | | | | | | | | | | | | This is in preparation to mark bdrv_drained_begin() as GRAPH_UNLOCKED. More granular draining is not trivially possible, because bdrv_snapshot_delete() can recursively call itself. The return value of bdrv_all_delete_snapshot() changes from -1 to -errno propagated from failed sub-calls. This is fine for the existing callers of bdrv_all_delete_snapshot(). Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20250530151125.955508-4-f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* mirror: Drop redundant zero_target parameterEric Blake2025-05-141-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two callers to a mirror job (drive-mirror and blockdev-mirror) set zero_target precisely when sync mode == FULL, with the one exception that drive-mirror skips zeroing the target if it was newly created and reads as zero. But given the previous patch, that exception is equally captured by target_is_zero. Meanwhile, there is another slight wrinkle, fortunately caught by iotest 185: if the caller uses "sync":"top" but the source has no backing file, the code in blockdev.c was changing sync to be FULL, but only after it had set zero_target=false. In mirror.c, prior to recent patches, this didn't matter: the only places that inspected sync were setting is_none_mode (both TOP and FULL had set that to false), and mirror_start() setting base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL. But now that we are passing sync around, the slammed sync mode would result in a new pre-zeroing pass even when the user had passed "sync":"top" in an effort to skip pre-zeroing. Fortunately, the assignment of base when bs has no backing chain still works out to NULL if we don't slam things. So with the forced change of sync ripped out of blockdev.c, the sync mode is passed through the full callstack unmolested, and we can now reliably reconstruct the same settings as what used to be passed in by zero_target=false, without the redundant parameter. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20250509204341.3553601-24-eblake@redhat.com> Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [eblake: Fix regression in iotest 185] Signed-off-by: Eric Blake <eblake@redhat.com>
* mirror: Allow QMP override to declare target already zeroEric Blake2025-05-141-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | QEMU has an optimization for a just-created drive-mirror destination that is not possible for blockdev-mirror (which can't create the destination) - any time we know the destination starts life as all zeroes, we can skip a pre-zeroing pass on the destination. Recent patches have added an improved heuristic for detecting if a file contains all zeroes, and we plan to use that heuristic in upcoming patches. But since a heuristic cannot quickly detect all scenarios, and there may be cases where the caller is aware of information that QEMU cannot learn quickly, it makes sense to have a way to tell QEMU to assume facts about the destination that can make the mirror operation faster. Given our existing example of "qemu-img convert --target-is-zero", it is time to expose this override in QMP for blockdev-mirror as well. This patch results in some slight redundancy between the older s->zero_target (set any time mode==FULL and the destination image was not just created - ie. clear if drive-mirror is asking to skip the pre-zero pass) and the newly-introduced s->target_is_zero (in addition to the QMP override, it is set when drive-mirror creates the destination image); this will be cleaned up in the next patch. There is also a subtlety that we must consider. When drive-mirror is passing target_is_zero on behalf of a just-created image, we know the image is sparse (skipping the pre-zeroing keeps it that way), so it doesn't matter whether the destination also has "discard":"unmap" and "detect-zeroes":"unmap". But now that we are letting the user set the knob for target-is-zero, if the user passes a pre-existing file that is fully allocated, it is fine to leave the file fully allocated under "detect-zeroes":"on", but if the file is open with "detect-zeroes":"unmap", we should really be trying harder to punch holes in the destination for every region of zeroes copied from the source. The easiest way to do this is to still run the pre-zeroing pass (turning the entire destination file sparse before populating just the allocated portions of the source), even though that currently results in double I/O to the portions of the file that are allocated. A later patch will add further optimizations to reduce redundant zeroing I/O during the mirror operation. Since "target-is-zero":true is designed for optimizations, it is okay to silently ignore the parameter rather than erroring if the user ever sets the parameter in a scenario where the mirror job can't exploit it (for example, when doing "sync":"top" instead of "sync":"full", we can't pre-zero, so setting the parameter won't make a speed difference). Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20250509204341.3553601-23-eblake@redhat.com> Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* blockdev-backup: Add error handling option for copy-before-write jobsRaman Dzehtsiar2025-05-121-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the blockdev-backup QMP command to allow users to specify how to behave when IO errors occur during copy-before-write operations. Previously, the behavior was fixed and could not be controlled by the user. The new 'on-cbw-error' option can be set to one of two values: - 'break-guest-write': Forwards the IO error to the guest and triggers the on-source-error policy. This preserves snapshot integrity at the expense of guest IO operations. - 'break-snapshot': Allows the guest OS to continue running normally, but invalidates the snapshot and aborts related jobs. This prioritizes guest operation over backup consistency. This enhancement provides more flexibility for backup operations in different environments where requirements for guest availability versus backup consistency may vary. The default behavior remains unchanged to maintain backward compatibility. Signed-off-by: Raman Dzehtsiar <Raman.Dzehtsiar@gmail.com> Message-ID: <20250414090025.828660-1-Raman.Dzehtsiar@gmail.com> Acked-by: Markus Armbruster <armbru@redhat.com> [vsementsov: fix long lines] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into stagingStefan Hajnoczi2025-02-101-0/+48
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block layer patches - Managing inactive nodes (enables QSD migration with shared storage) - Fix swapped values for BLOCK_IO_ERROR 'device' and 'qom-path' - vpc: Read images exported from Azure correctly - scripts/qemu-gdb: Support coroutine dumps in coredumps - Minor cleanups # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmek34IRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9bDpxAAnTvwmdazAXG0g9GzqvrEB/+6rStjAsqE # 9MTWV4WxyN41d0RXxN8CYKb8CXSiTRyw6r3CSGNYEI2eShe9e934PriSkZm41HyX # n9Yh5YxqGZqitzvPtx62Ii/1KG+PcjQbfHuK1p4+rlKa0yQ2eGlio1JIIrZrCkBZ # ikZcQUrhIyD0XV8hTQ2+Ysa+ZN6itjnlTQIG3gS3m8f8WR7kyUXD8YFMQFJFyjVx # NrAIpLnc/ln9+5PZR9tje8U7XEn2KCgI5pgGaQnrd0h0G1H4ig8ogzYYnKTLhjU/ # AmQpS8np8Tyg6S1UZTiekEq0VuAhThEQc5b3sGbmHWH/R2ABMStyf18oCBAkPzZ7 # s6h+3XzTKKY2Q5Q3ZG/ANkUJjTNBhdj1fcaARvbSWsqsuk5CWX/I3jzvgihFtCSs # eGu+b/bLeW6P7hu4qPHBcgLHuB1Fc7Rd2t4BoIGM1wcO2CeC9DzUKOiIMZOEJIh0 # GGqCkEWDHgckDTakD4/vSqm0UDKt6FSlQC9ga/ILBY3IB5HpHoArY58selymy28i # X7MgAvbjdsmNuUuXDZZOiObcFt3j8jlmwPJpPyzXPQIiPX1RXeBPRhVAEeZCKn6Z # tfHr72SJdMeVOGXVTvOrJ2iW+4g03rPdmkDFCUhpOwo62RODq7ahvCIXsNf3nEFR # rSB3T1M/8EM= # =iQLP # -----END PGP SIGNATURE----- # gpg: Signature made Thu 06 Feb 2025 11:12:50 EST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (25 commits) block: remove unused BLOCK_OP_TYPE_DATAPLANE iotests: Add (NBD-based) tests for inactive nodes iotests: Add qsd-migrate case iotests: Add filter_qtest() nbd/server: Support inactive nodes block/export: Add option to allow export of inactive nodes block: Drain nodes before inactivating them block/export: Don't ignore image activation error in blk_exp_add() block: Support inactive nodes in blk_insert_bs() block: Add blockdev-set-active QMP command block: Add option to create inactive nodes block: Fix crash on block_resize on inactive node block: Don't attach inactive child to active node migration/block-active: Remove global active flag block: Inactivate external snapshot overlays when necessary block: Allow inactivating already inactive nodes block: Add 'active' field to BlockDeviceInfo block-backend: Fix argument order when calling 'qapi_event_send_block_io_error()' scripts/qemu-gdb: Support coroutine dumps in coredumps scripts/qemu-gdb: Simplify fs_base fetching for coroutines ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
| * block: Add blockdev-set-active QMP commandKevin Wolf2025-02-061-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The system emulator tries to automatically activate and inactivate block nodes at the right point during migration. However, there are still cases where it's necessary that the user can do this manually. Images are only activated on the destination VM of a migration when the VM is actually resumed. If the VM was paused, this doesn't happen automatically. The user may want to perform some operation on a block device (e.g. taking a snapshot or starting a block job) without also resuming the VM yet. This is an example where a manual command is necessary. Another example is VM migration when the image files are opened by an external qemu-storage-daemon instance on each side. In this case, the process that needs to hand over the images isn't even part of the migration and can't know when the migration completes. Management tools need a way to explicitly inactivate images on the source and activate them on the destination. This adds a new blockdev-set-active QMP command that lets the user change the status of individual nodes (this is necessary in qemu-storage-daemon because it could be serving multiple VMs and only one of them migrates at a time). For convenience, operating on all devices (like QEMU does automatically during migration) is offered as an option, too, and can be used in the context of single VM. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20250204211407.381505-9-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * block: Inactivate external snapshot overlays when necessaryKevin Wolf2025-02-061-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Putting an active block node on top of an inactive one is strictly speaking an invalid configuration and the next patch will turn it into a hard error. However, taking a snapshot while disk images are inactive after completing migration has an important use case: After migrating to a file, taking an external snapshot is what is needed to take a full VM snapshot. In order for this to keep working after the later patches, change creating a snapshot such that it automatically inactivates an overlay that is added on top of an already inactive node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20250204211407.381505-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* | qapi: Move include/qapi/qmp/ to include/qobject/Daniel P. Berrangé2025-02-101-4/+4
|/ | | | | | | | | | | | | | | | | | | | | | The general expectation is that header files should follow the same file/path naming scheme as the corresponding source file. There are various historical exceptions to this practice in QEMU, with one of the most notable being the include/qapi/qmp/ directory. Most of the headers there correspond to source files in qobject/. This patch corrects most of that inconsistency by creating include/qobject/ and moving the headers for qobject/ there. This also fixes MAINTAINERS for include/qapi/qmp/dispatch.h: scripts/get_maintainer.pl now reports "QAPI" instead of "No maintainers found". Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> #s390x Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20241118151235.2665921-2-armbru@redhat.com> [Rebased]
* include: Rename sysemu/ -> system/Philippe Mathieu-Daudé2024-12-201-6/+6
| | | | | | | | | | | | | Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* backup: add minimum cluster size to performance optionsFiona Ebner2024-09-301-0/+3
| | | | | | | | | | | | | | | | | | | | In the context of backup fleecing, discarding the source will not work when the fleecing image has a larger granularity than the one used for block-copy operations (can happen if the backup target has smaller cluster size), because cbw_co_pdiscard_snapshot() will align down the discard requests and thus effectively ignore then. To make @discard-source work in such a scenario, allow specifying the minimum cluster size used for block-copy operations and thus in particular also the granularity for discard requests to the source. Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema) Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240711120915.310243-3-f.ebner@proxmox.com> [vsementsov: switch version to 9.2 in QAPI doc] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* qapi: blockdev-backup: add discard-source parameterVladimir Sementsov-Ogievskiy2024-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a parameter that enables discard-after-copy. That is mostly useful in "push backup with fleecing" scheme, when source is snapshot-access format driver node, based on copy-before-write filter snapshot-access API: [guest] [snapshot-access] ~~ blockdev-backup ~~> [backup target] | | | root | file v v [copy-before-write] | | | file | target v v [active disk] [temp.img] In this case discard-after-copy does two things: - discard data in temp.img to save disk space - avoid further copy-before-write operation in discarded area Note that we have to declare WRITE permission on source in copy-before-write filter, for discard to work. Still we can't take it unconditionally, as it will break normal backup from RO source. So, we have to add a parameter and pass it thorough bdrv_open flags. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20240313152822.626493-5-vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block: Improve error message when external snapshot can't flushMarkus Armbruster2024-05-271-2/+4
| | | | | | | | | | | | | | | | | external_snapshot_action() reports bdrv_flush() failure to its caller as An IO error has occurred The errno code returned by bdrv_flush() is lost. Improve this to Write to node '<device or node name>' failed: <description of errno> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240513141703.549874-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* qapi: Inline and remove QERR_DEVICE_HAS_NO_MEDIUM definitionPhilippe Mathieu-Daudé2024-04-241-1/+1
| | | | | | | | | | | | | | | | Address the comment added in commit 4629ed1e98 ("qerror: Finally unused, clean up"), from 2015: /* * These macros will go away, please don't use * in new code, and do not add new ones! */ Mechanical transformation using sed, and manual cleanup. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240312141343.3168265-4-armbru@redhat.com>
* blockdev: Fix blockdev-snapshot-sync error reporting for no mediumMarkus Armbruster2024-03-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When external_snapshot_abort() rejects a BlockDriverState without a medium, it creates an error like this: error_setg(errp, "Device '%s' has no medium", device); Trouble is @device can be null. My system formats null as "(null)", but other systems might crash. Reproducer: 1. Create a block device without a medium -> {"execute": "blockdev-add", "arguments": {"driver": "host_cdrom", "node-name": "blk0", "filename": "/dev/sr0"}} <- {"return": {}} 3. Attempt to snapshot it -> {"execute":"blockdev-snapshot-sync", "arguments": { "node-name": "blk0", "snapshot-file":"/tmp/foo.qcow2","format":"qcow2"}} <- {"error": {"class": "GenericError", "desc": "Device '(null)' has no medium"}} Broken when commit 0901f67ecdb made @device optional. Use bdrv_get_device_or_node_name() instead. Now it fails as it should: <- {"error": {"class": "GenericError", "desc": "Device 'blk0' has no medium"}} Fixes: 0901f67ecdb7 ("qmp: Allow to take external snapshots on bs graphs node.") Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240306142831.2514431-1-armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: Fix block_resize error reporting for op blockersMarkus Armbruster2024-03-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When block_resize() runs into an op blocker, it creates an error like this: error_setg(errp, "Device '%s' is in use", device); Trouble is @device can be null. My system formats null as "(null)", but other systems might crash. Reproducer: 1. Create two block devices -> {"execute": "blockdev-add", "arguments": {"driver": "file", "node-name": "blk0", "filename": "64k.img"}} <- {"return": {}} -> {"execute": "blockdev-add", "arguments": {"driver": "file", "node-name": "blk1", "filename": "m.img"}} <- {"return": {}} 2. Put a blocker on one them -> {"execute": "blockdev-mirror", "arguments": {"job-id": "job0", "device": "blk0", "target": "blk1", "sync": "full"}} {"return": {}} -> {"execute": "job-pause", "arguments": {"id": "job0"}} {"return": {}} -> {"execute": "job-complete", "arguments": {"id": "job0"}} {"return": {}} Note: job events elided for brevity. 3. Attempt to resize -> {"execute": "block_resize", "arguments": {"node-name": "blk1", "size":32768}} <- {"error": {"class": "GenericError", "desc": "Device '(null)' is in use"}} Broken when commit 3b1dbd11a60 made @device optional. Fixed in commit ed3d2ec98a3 (block: Add errp to b{lk,drv}_truncate()), except for this one instance. Fix it by using the error message provided by the op blocker instead, so it fails like this: <- {"error": {"class": "GenericError", "desc": "Node 'blk1' is busy: block device is in use by block job: mirror"}} Fixes: 3b1dbd11a60d (qmp: Allow block_resize to manipulate bs graph nodes.) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* stream: Allow users to request only format driver names in backing file formatPeter Krempa2024-01-261-0/+7
| | | | | | | | | | | | | | | | | | | | | | | Introduce a new flag 'backing-mask-protocol' for the block-stream QMP command which instructs the internals to use 'raw' instead of the protocol driver in case when a image is used without a dummy 'raw' wrapper. The flag is designed such that it can be always asserted by management tools even when there isn't any update to backing files. The flag will be used by libvirt so that the backing images still reference the proper format even when libvirt will stop using the dummy raw driver (raw driver with no other config). Libvirt needs this so that the images stay compatible with older libvirt versions which didn't expect that a protocol driver name can appear in the backing file format field. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <bbee9a0a59748a8893289bf8249f568f0d587e62.1701796348.git.pkrempa@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* commit: Allow users to request only format driver names in backing file formatPeter Krempa2024-01-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | | Introduce a new flag 'backing-mask-protocol' for the block-commit QMP command which instructs the internals to use 'raw' instead of the protocol driver in case when a image is used without a dummy 'raw' wrapper. The flag is designed such that it can be always asserted by management tools even when there isn't any update to backing files. The flag will be used by libvirt so that the backing images still reference the proper format even when libvirt will stop using the dummy raw driver (raw driver with no other config). Libvirt needs this so that the images stay compatible with older libvirt versions which didn't expect that a protocol driver name can appear in the backing file format field. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <2cb46e37093ce793ea1604abc8bbb90f4c8e434b.1701796348.git.pkrempa@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: remove bdrv_co_lock()Stefan Hajnoczi2023-12-211-5/+0
| | | | | | | | | | The bdrv_co_lock() and bdrv_co_unlock() functions are already no-ops. Remove them. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20231205182011.1976568-8-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: remove AioContext lockingStefan Hajnoczi2023-12-211-254/+53
| | | | | | | | | | | | | | | | This is the big patch that removes aio_context_acquire()/aio_context_release() from the block layer and affected block layer users. There isn't a clean way to split this patch and the reviewers are likely the same group of people, so I decided to do it in one patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-ID: <20231205182011.1976568-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* graph-lock: remove AioContext lockingStefan Hajnoczi2023-12-211-4/+4
| | | | | | | | | | | | | | | | Stop acquiring/releasing the AioContext lock in bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any effect. The distinction between bdrv_graph_wrunlock() and bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed into one function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231205182011.1976568-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Fix AioContext locking in qmp_block_resize()Kevin Wolf2023-12-121-1/+2
| | | | | | | | | | | | | | | The AioContext must be unlocked before calling blk_co_unref(), because it takes the AioContext lock internally in blk_unref_bh(), which is scheduled in the main thread. If we don't unlock, the AioContext is locked twice and nested event loops such as in bdrv_graph_wrlock() will deadlock. Cc: <qemu-stable@nongnu.org> Fixes: https://issues.redhat.com/browse/RHEL-15965 Fixes: 0c7d204f50c382c6baac8c94bd57af4a022b3888 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20231208124352.30295-1-kwolf@redhat.com>
* block: Fix deadlocks in bdrv_graph_wrunlock()Kevin Wolf2023-11-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that have a nested event loop. Nested event loops can depend on other iothreads making progress, so in order to allow them to make progress it must not hold the AioContext lock of another thread while calling aio_poll(). This introduces a @bs parameter to bdrv_graph_wrunlock() whose AioContext is temporarily dropped (which matches bdrv_graph_wrlock()), and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState doesn't necessarily exist any more when unlocking. This also requires a change to bdrv_schedule_unref(), which was relying on the incorrectly taken lock. It needs to take the lock itself now. While this is a separate bug, it can't be fixed a separate patch because otherwise the intermediate state would either deadlock or try to release a lock that we don't even hold. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [kwolf: Fixed up bdrv_schedule_unref()] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_replace_node() GRAPH_WRLOCKKevin Wolf2023-11-071-0/+5
| | | | | | | | | | | | Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_replace_node(). Its callers may already want to hold the graph lock and so wouldn't be able to call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-17-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_chain_contains() and callers GRAPH_RDLOCKKevin Wolf2023-11-071-21/+27
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_chain_contains() need to hold a reader lock for the graph because it calls bdrv_filter_or_cow_bs(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-11-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_skip_filters() and callers GRAPH_RDLOCKKevin Wolf2023-11-071-3/+4
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_skip_filters() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-9-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_skip_implicit_filters() and callers GRAPH_RDLOCKKevin Wolf2023-11-071-6/+8
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_skip_implicit_filters() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-8-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_filter_or_cow_bs() and callers GRAPH_RDLOCKKevin Wolf2023-11-071-1/+1
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_filter_or_cow_bs() need to hold a reader lock for the graph because it calls bdrv_filter_or_cow_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-7-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_has_zero_init() and callers GRAPH_RDLOCKKevin Wolf2023-11-071-0/+2
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_has_zero_init() need to hold a reader lock for the graph because it calls bdrv_filter_bs(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* hw/xen: automatically assign device index to block devicesDavid Woodhouse2023-11-071-3/+12
| | | | | | | | | | | | | | | | | There's no need to force the user to assign a vdev. We can automatically assign one, starting at xvda and searching until we find the first disk name that's unused. This means we can now allow '-drive if=xen,file=xxx' to work without an explicit separate -driver argument, just like if=virtio. Rip out the legacy handling from the xenpv machine, which was scribbling over any disks configured by the toolstack, and didn't work with anything but raw images. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org>
* blockjob: introduce block-job-change QMP commandFiona Ebner2023-10-311-0/+14
| | | | | | | | | | | | | | | which will allow changing job-type-specific options after job creation. In the JobVerbTable, the same allow bits as for set-speed are used, because set-speed can be considered an existing change command. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20231031135431.393137-2-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: mirror: avoid potential deadlock when using iothreadFiona Ebner2023-10-311-2/+12
| | | | | | | | | | | | | | | The bdrv_getlength() function is a generated co-wrapper and uses AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE() expects the lock to be acquired exactly once. Fix a case where it may be acquired twice. This can happen when the source node is explicitly specified as the @replaces parameter or if the source node is a filter node. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20231019131936.414246-4-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_op_is_blocked() and callers GRAPH_RDLOCKKevin Wolf2023-10-121-0/+15
| | | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_op_is_blocked() need to hold a reader lock for the graph because it calls bdrv_get_device_or_node_name(), which accesses the parents list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-18-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_refresh_filename() and callers GRAPH_RDLOCKKevin Wolf2023-10-121-0/+13
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_refresh_filename() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-11-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_get_xdbg_block_graph() and callers GRAPH_RDLOCKKevin Wolf2023-10-121-0/+2
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_get_xdbg_block_graph() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-10-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_snapshot_fallback() and callers GRAPH_RDLOCKKevin Wolf2023-10-121-0/+9
| | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_snapshot_fallback() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-8-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCKKevin Wolf2023-10-121-0/+5
| | | | | | | | | | | | This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_first_blk() and bdrv_is_root_node() need to hold a reader lock for the graph. These functions are the only functions in block-backend.c that access the parent list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-5-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_add/del_child() and caller GRAPH_WRLOCKKevin Wolf2023-09-201-6/+11
| | | | | | | | | | | | | The functions read the parents list in the generic block layer, so we need to hold the graph lock already there. The BlockDriver implementations actually modify the graph, so it has to be a writer lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-22-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Mark bdrv_get_cumulative_perm() and callers GRAPH_RDLOCKKevin Wolf2023-09-201-0/+6
| | | | | | | | | | | | | | | The function reads the parents list, so it needs to hold the graph lock. This happens to result in BlockDriver.bdrv_set_perm() to be called with the graph lock held. For consistency, make it the same for all of the BlockDriver callbacks for updating permissions and annotate the function pointers with GRAPH_RDLOCK_PTR. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-15-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* cutils: Adjust signature of parse_uint[_full]Eric Blake2023-06-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | It's already confusing that we have two very similar functions for wrapping the parse of a 64-bit unsigned value, differing mainly on whether they permit leading '-'. Adjust the signature of parse_uint() and parse_uint_full() to be like all of qemu_strto*(): put the result parameter last, use the same types (uint64_t and unsigned long long have the same width, but are not always the same type), and mark endptr const (this latter change only affects the rare caller of parse_uint). Adjust all callers in the tree. While at it, note that since cutils.c already includes: QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long)); we are guaranteed that the result of parse_uint* cannot exceed UINT64_MAX (or the build would have failed), so we can drop pre-existing dead comparisons in opts-visitor.c that were never false. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-8-eblake@redhat.com> [eblake: Drop dead code spotted by Markus] Signed-off-by: Eric Blake <eblake@redhat.com>
* block: Take main AioContext lock when calling bdrv_open()Kevin Wolf2023-05-301-6/+23
| | | | | | | | | | | | The function documentation already says that all callers must hold the main AioContext lock, but not all of them do. This can cause assertion failures when functions called by bdrv_open() try to drop the lock. Fix a few more callers to take the lock before calling bdrv_open(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230525124713.401149-4-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: qmp_transaction: drop extra generic layerVladimir Sementsov-Ogievskiy2023-05-191-180/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | Let's simplify things: First, actions generally don't need access to common BlkActionState structure. The only exclusion are backup actions that need block_job_txn. Next, for transaction actions of Transaction API is more native to allocated state structure in the action itself. So, do the following transformation: 1. Let all actions be represented by a function with corresponding structure as arguments. 2. Instead of array-map marshaller, let's make a function, that calls corresponding action directly. 3. BlkActionOps and BlkActionState structures become unused Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230510150624.310640-7-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* blockdev: use state.bitmap in block-dirty-bitmap-add actionVladimir Sementsov-Ogievskiy2023-05-191-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Other bitmap related actions use the .bitmap pointer in .abort action, let's do same here: 1. It helps further refactoring, as bitmap-add is the only bitmap action that uses state.action in .abort 2. It must be safe: transaction actions rely on the fact that on .abort() the state is the same as at the end of .prepare(), so that in .abort() we could precisely rollback the changes done by .prepare(). The only way to remove the bitmap during transaction should be block-dirty-bitmap-remove action, but it postpones actual removal to .commit(), so we are OK on any rollback path. (Note also that bitmap-remove is the only bitmap action that has .commit() phase, except for simple g_free the state on .clean()) 3. Again, other bitmap actions behave this way: keep the bitmap pointer during the transaction. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230510150624.310640-6-vsementsov@yandex-team.ru> [kwolf: Also remove the now unused BlockDirtyBitmapState.prepared] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>