summary refs log tree commit diff stats
Commit message (Collapse)AuthorAgeFilesLines
* Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell2019-06-0637-164/+261
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | virtio, pci, pc: cleanups, features stricter rules for acpi tables: we now fail on any difference that isn't whitelisted. vhost-scsi migration. some cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed 05 Jun 2019 20:55:04 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: bios-tables-test: ignore identical binaries tests: acpi: add simple arm/virt testcase tests: add expected ACPI tables for arm/virt board bios-tables-test: list all tables that differ vhost-scsi: Allow user to enable migration vhost-scsi: Add VMState descriptor vhost-scsi: The vhost backend should be stopped when the VM is not running bios-tables-test: add diff allowed list vhost: fix memory leak in vhost_user_scsi_realize vhost: fix incorrect print type vhost: remove the dead code docs: smbios: remove family=x from type2 entry description pci: Fold pci_get_bus_devfn() into its sole caller pci: Make is_bridge a bool pcie: Simplify pci_adjust_config_limit() acpi: pci: use build_append_foo() API to construct MCFG hw/acpi: Consolidate build_mcfg to pci.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * bios-tables-test: ignore identical binariesMichael S. Tsirkin2019-06-051-2/+14
| | | | | | | | | | | | | | when binary of the tables is identical, there is no need to run iasl to check that they are functionally equivalent. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * tests: acpi: add simple arm/virt testcaseIgor Mammedov2019-06-033-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | adds simple arm/virt test case that starts guest with bios-tables-test.aarch64.iso.qcow2 boot image which initializes UefiTestSupport* structure in RAM once guest is booted. * see commit: tests: acpi: add acpi_find_rsdp_address_uefi() helper Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1559560929-260254-3-git-send-email-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * tests: add expected ACPI tables for arm/virt boardIgor Mammedov2019-06-036-0/+0
| | | | | | | | | | | | | | | | Signed-off-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <1559560929-260254-2-git-send-email-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * bios-tables-test: list all tables that differMichael S. Tsirkin2019-06-031-2/+4
| | | | | | | | | | | | | | | | Fail after comparing all tables: this way user gets the full list of tables that need to be updated or whitelisted. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-scsi: Allow user to enable migrationLiran Alon2019-06-022-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to perform a valid migration of a vhost-scsi device, the following requirements must be met: (1) The virtio-scsi device state needs to be saved & loaded. (2) The vhost backend must be stopped before virtio-scsi device state is saved: (2.1) Sync vhost backend state to virtio-scsi device state. (2.2) No further I/O requests are made by vhost backend to target SCSI device. (2.3) No further guest memory access takes place after VM is stopped. (3) Requests in-flight to target SCSI device are completed before migration handover. (4) Target SCSI device state needs to be saved & loaded into the destination host target SCSI device. Previous commit ("vhost-scsi: Add VMState descriptor") add support to save & load the device state using VMState. This meets requirement (1). When VM is stopped by migration thread (On Pre-Copy complete), the following code path is executed: migration_completion() -> vm_stop_force_state() -> vm_stop() -> do_vm_stop(). do_vm_stop() calls first pause_all_vcpus() which pause all guest vCPUs and then call vm_state_notify(). In case of vhost-scsi device, this will lead to the following code path to be executed: vm_state_notify() -> virtio_vmstate_change() -> virtio_set_status() -> vhost_scsi_set_status() -> vhost_scsi_stop(). vhost_scsi_stop() then calls vhost_scsi_clear_endpoint() and vhost_scsi_common_stop(). vhost_scsi_clear_endpoint() sends VHOST_SCSI_CLEAR_ENDPOINT ioctl to vhost backend which will reach kernel's vhost_scsi_clear_endpoint() which process all pending I/O requests and wait for them to complete (vhost_scsi_flush()). This meets requirement (3). vhost_scsi_common_stop() will stop the vhost backend. As part of this stop, dirty-bitmap is synced and vhost backend state is synced with virtio-scsi device state. As at this point guest vCPUs are already paused, this meets requirement (2). At this point we are left with requirement (4) which is target SCSI device specific and therefore cannot be done by QEMU. Which is the main reason why vhost-scsi adds a migration blocker. However, as this can be handled either by an external orchestrator or by using shared-storage (i.e. iSCSI), there is no reason to limit the orchestrator from being able to explictly specify it wish to enable migration even when VM have a vhost-scsi device. Considering all the above, this commit allows orchestrator to explictly specify that it is responsbile for taking care of requirement (4) and therefore vhost-scsi should not add a migration blocker. Reviewed-by: Nir Weiner <nir.weiner@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190416125912.44001-4-liran.alon@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * vhost-scsi: Add VMState descriptorNir Weiner2019-06-021-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As preparation of enabling migration of vhost-scsi device, define it’s VMState. Note, we keep the convention of verifying in the pre_save() method that the vhost backend must be stopped before attempting to save the device state. Similar to how it is done for vhost-vsock. Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Nir Weiner <nir.weiner@oracle.com> Message-Id: <20190416125912.44001-3-liran.alon@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * vhost-scsi: The vhost backend should be stopped when the VM is not runningNir Weiner2019-06-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vhost-scsi doesn’t takes into account whether the VM is running or not in order to decide if it should start/stop vhost backend. This would lead to vhost backend still being active when VM's RunState suddenly change to stopped. An example of when this issue is encountered is when Live-Migration Pre-Copy phase completes. As in this case, VM state will be changed to stopped (while vhost backend is still active), which will result in virtio_vmstate_change() -> virtio_set_status() -> vhost_scsi_set_status() executed but vhost_scsi_set_status() will just return without stopping vhost backend. To handle this, change code to consider that vhost processing should be stopped when VM is not running. Similar to how it is done in vhost-vsock device at vhost_vsock_set_status(). Fixes: 5e9be92d7752 ("vhost-scsi: new device supporting the tcm_vhost Linux kernel module”) Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Nir Weiner <nir.weiner@oracle.com> Message-Id: <20190416125912.44001-2-liran.alon@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * bios-tables-test: add diff allowed listMichael S. Tsirkin2019-05-292-1/+19
| | | | | | | | | | | | | | | | | | | | Expected table change is then handled like this: 1. add table to diff allowed list 2. change generating code (can be combined with 1) 3. maintainer runs a script to update expected + blows away allowed diff list Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: fix memory leak in vhost_user_scsi_realizeJie Wang2019-05-291-0/+3
| | | | | | | | | | | | | | | | | | | | fix memory leak in vhost_user_scsi_realize Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1556608500-12183-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * vhost: fix incorrect print typeJie Wang2019-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | fix incorrect print type in vhost_virtqueue_stop Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1556605773-42019-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
| * vhost: remove the dead codeJie Wang2019-05-291-1/+0
| | | | | | | | | | | | | | | | | | | | remove the dead code Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1556604614-32081-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * docs: smbios: remove family=x from type2 entry descriptionIgor Mammedov2019-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | 'family' option is not part of type 2 table and if user tries to use it as such QEMU will error out with an unknow option error. Drop it from docs lest it confuse users. Fixes: b155eb1d04 ("smbios: document cmdline options for smbios type 2-4, 17 structures") Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1558448611-315074-1-git-send-email-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * pci: Fold pci_get_bus_devfn() into its sole callerDavid Gibson2019-05-291-32/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The only remaining caller of pci_get_bus_devfn() is pci_nic_init_nofail(), itself an old compatibility function. Fold the two together to avoid re-using the stale interface. While we're there replace the explicit fprintf()s with error_report(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513061939.3464-6-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>
| * pci: Make is_bridge a boolDavid Gibson2019-05-299-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | The is_bridge field in PCIDevice acts as a bool, but is declared as an int. Declare it as a bool for clarity, and change everything that writes it to use true/false instead of 0/1 to match. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190513061939.3464-5-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * pcie: Simplify pci_adjust_config_limit()David Gibson2019-05-295-54/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since c2077e2c "pci: Adjust PCI config limit based on bus topology", pci_adjust_config_limit() has been used in the config space read and write paths to only permit access to extended config space on buses which permit it. Specifically it prevents access on devices below a vanilla-PCI bus via some combination of bridges, even if both the host bridge and the device itself are PCI-E. It accomplishes this with a somewhat complex call up the chain of bridges to see if any of them prohibit extended config space access. This is overly complex, since we can always know if the bus will support such access at the point it is constructed. This patch simplifies the test by using a flag in the PCIBus instance indicating whether extended configuration space is accessible. It is false for vanilla PCI buses. For PCI-E buses, it is true for root buses and equal to the parent bus's's capability otherwise. For the special case of sPAPR's paravirtualized PCI root bus, which acts mostly like vanilla PCI, but does allow extended config space access, we override the default value of the flag from the host bridge code. This should cause no behavioural change. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190513061939.3464-4-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * acpi: pci: use build_append_foo() API to construct MCFGWei Yang2019-05-292-30/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | build_append_foo() API doesn't need explicit endianness conversions which eliminates a source of errors and it makes build_mcfg() look like declarative definition of MCFG table in ACPI spec, which makes it easy to review. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> v3: * add some comment on the Configuration Space base address allocation structure v2: * miss the reserved[8] of MCFG in last version, add it back * drop SOBs and make sure bios-tables-test all OK Message-Id: <20190521062836.6541-3-richardw.yang@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * hw/acpi: Consolidate build_mcfg to pci.cWei Yang2019-05-298-34/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we have two identical build_mcfg functions. Consolidate them in acpi/pci.c. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> v4: * ACPI_PCI depends on both ACPI and PCI * rebase on latest master, adjust arm Kconfig v3: * adjust changelog based on Igor's suggestion Message-Id: <20190521062836.6541-2-richardw.yang@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell2019-06-061-4/+9
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Fix pr-manager-helper (Markus) # gpg: Signature made Wed 05 Jun 2019 15:15:34 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: vl: Document why objects are delayed vl: Fix -drive / -blockdev persistent reservation management Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | vl: Document why objects are delayedMarkus Armbruster2019-06-051-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Objects should not be "delayed" without a reason, as the previous commit demonstrates. The remaining ones have reasons. State them. and demand future ones come with such a statement. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604151251.9903-3-armbru@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | vl: Fix -drive / -blockdev persistent reservation managementMarkus Armbruster2019-06-051-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qemu-system-FOO's main() acts on command line arguments in its own idiosyncratic order. There's not much method to its madness. Whenever we find a case where one kind of command line argument needs to refer to something created for another kind later, we rejigger the order. Recent commit cda4aa9a5a "vl: Create block backends before setting machine properties" was such a rejigger. Block backends are now created before "delayed" objects. This broke persistent reservation management. Reproducer: $ qemu-system-x86_64 -object pr-manager-helper,id=pr-helper0,path=/tmp/pr-helper0.sock-drive -drive file=/dev/mapper/crypt,file.pr-manager=pr-helper0,format=raw,if=none,id=drive-scsi0-0-0-2 qemu-system-x86_64: -drive file=/dev/mapper/crypt,file.pr-manager=pr-helper0,format=raw,if=none,id=drive-scsi0-0-0-2: No persistent reservation manager with id 'pr-helper0' The delayed pr-manager-helper object is created too late for use by -drive or -blockdev. Normal objects are still created in time. pr-manager-helper has always been a delayed object (commit 7c9e527659 "scsi, file-posix: add support for persistent reservation management"). Turns out there's no real reason for that. Make it a normal object. Fixes: cda4aa9a5a08777cf13e164c0543bd4888b8adce Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604151251.9903-2-armbru@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | Merge remote-tracking branch ↵Peter Maydell2019-06-066-23/+246
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/juanquintela/tags/migration-pull-request' into staging Migration Pull request # gpg: Signature made Wed 05 Jun 2019 12:52:06 BST # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration-pull-request: migratioin/ram: leave RAMBlock->bmap blank on allocating migration-test: Add a test for fd protocol migration: Fix fd protocol for incoming defer migration/ram.c: multifd_send_state->count is not really used migration/ram.c: MultiFDSendParams.sem_sync is not really used Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | migratioin/ram: leave RAMBlock->bmap blank on allocatingWei Yang2019-06-051-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During migration, we would sync bitmap from ram_list.dirty_memory to RAMBlock.bmap in cpu_physical_memory_sync_dirty_bitmap(). Since we set RAMBlock.bmap and ram_list.dirty_memory both to all 1, this means at the first round this sync is meaningless and is a duplicated work. Leaving RAMBlock->bmap blank on allocating would have a side effect on migration_dirty_pages, since it is calculated from the result of cpu_physical_memory_sync_dirty_bitmap(). To keep it right, we need to set migration_dirty_pages to 0 in ram_state_init(). Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
| * | | migration-test: Add a test for fd protocolYury Kotov2019-06-053-5/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
| * | | migration: Fix fd protocol for incoming deferYury Kotov2019-06-052-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, incoming migration through fd supports only command-line case: E.g. fork(); fd = open(); exec("qemu ... -incoming fd:%d", fd); It's possible to use add-fd commands to pass fd for migration, but it's invalid case. add-fd works with fdset but not with particular fds. To work with getfd in incoming defer it's enough to use monitor_fd_param instead of strtol. monitor_fd_param supports both cases: * fd:123 * fd:fd_name (added by getfd). And also the use of monitor_fd_param improves error messages. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
| * | | migration/ram.c: multifd_send_state->count is not really usedWei Yang2019-06-051-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
| * | | migration/ram.c: MultiFDSendParams.sem_sync is not really usedWei Yang2019-06-051-4/+0
|/ / / | | | | | | | | | | | | | | | | | | | | | Besides init and destroy, MultiFDSendParams.sem_sync is not really used. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* | | Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell2019-06-0472-272/+1144
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block layer patches: - block: AioContext management, part 2 - Avoid recursive block_status call (i.e. lseek() calls) if possible - linux-aio: Drop unused BlockAIOCB submission method - nvme: add Get/Set Feature Timestamp support - Fix crash on commit job start with active I/O on base node - Fix crash in bdrv_drained_end - Fix integer overflow in qcow2 discard # gpg: Signature made Tue 04 Jun 2019 16:20:02 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (29 commits) iotests: Fix duplicated diff output on failure iotests: test big qcow2 shrink block/io: bdrv_pdiscard: support int64_t bytes parameter block/qcow2-refcount: add trace-point to qcow2_process_discards block: Remove bdrv_set_aio_context() test-bdrv-drain: Use bdrv_try_set_aio_context() iotests: Attach new devices to node in non-default iothread virtio-scsi-test: Test attaching new overlay with iothreads block: Remove wrong bdrv_set_aio_context() calls blockdev: Use bdrv_try_set_aio_context() for monitor commands block: Move node without parents to main AioContext test-block-iothread: BlockBackend AioContext across root node change test-block-iothread: Test adding parent to iothread node block: Adjust AioContexts when attaching nodes scsi-disk: Use qdev_prop_drive_iothread block: Add qdev_prop_drive_iothread property type block: Add BlockBackend.ctx block: Add Error to blk_set_aio_context() nbd-server: Call blk_set_allow_aio_context_change() test-block-iothread: Check filter node in test_propagate_mirror ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | iotests: Fix duplicated diff output on failureKevin Wolf2019-06-041-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 70ff5b07 wanted to move the diff between actual and reference output to the end after printing the test result line. It really only copied it, though, so the diff is now displayed twice. Remove the old one. Fixes: 70ff5b07fcdd378180ad2d5cc0b0d5e67e7ef325 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | iotests: test big qcow2 shrinkVladimir Sementsov-Ogievskiy2019-06-043-0/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This test checks bug in qcow2_process_discards, fixed by previous commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block/io: bdrv_pdiscard: support int64_t bytes parameterVladimir Sementsov-Ogievskiy2019-06-042-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes at least one overflow in qcow2_process_discards, which passes 64bit region length to bdrv_pdiscard where bytes (or sectors in the past) parameter is int since its introduction in 0b919fae. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block/qcow2-refcount: add trace-point to qcow2_process_discardsVladimir Sementsov-Ogievskiy2019-06-042-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's at least trace ignored failure. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Remove bdrv_set_aio_context()Kevin Wolf2019-06-043-27/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All callers of bdrv_set_aio_context() are eliminated now, they have moved to bdrv_try_set_aio_context() and related safe functions. Remove bdrv_set_aio_context(). With this, we can now know that the .set_aio_ctx callback must be present in bdrv_set_aio_context_ignore() because bdrv_can_set_aio_context() would have returned false previously, so instead of checking the condition, we can assert it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Use bdrv_try_set_aio_context()Kevin Wolf2019-06-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | No reason to use the unchecked version in tests, even more so when these are the last callers of bdrv_set_aio_context() outside of block.c. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | iotests: Attach new devices to node in non-default iothreadKevin Wolf2019-06-043-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tests that devices refuse to be attached to a node that has already been moved to a different iothread if they can't be or aren't configured to work in the same iothread. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | virtio-scsi-test: Test attaching new overlay with iothreadsKevin Wolf2019-06-043-0/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tests that blockdev-add can correctly add a qcow2 overlay to an image used by a virtio-scsi disk in an iothread. The interesting point here is whether the newly added node gets correctly moved into the iothread AioContext. If it isn't, we get an assertion failure in virtio-scsi while processing the next request: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Remove wrong bdrv_set_aio_context() callsKevin Wolf2019-06-043-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mirror and commit block jobs use bdrv_set_aio_context() to move their filter node into the right AioContext before hooking it up in the graph. Similarly, bdrv_open_backing_file() explicitly moves the backing file node into the right AioContext first. This isn't necessary any more, they get automatically moved into the right context now when attaching them. However, in the case of bdrv_open_backing_file() with a node reference, it's actually not only unnecessary, but even wrong: The unchecked bdrv_set_aio_context() changes the AioContext of the child node even if other parents require it to retain the old context. So this is not only a simplification, but a bug fix, too. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1684342 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | blockdev: Use bdrv_try_set_aio_context() for monitor commandsKevin Wolf2019-06-041-16/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Monitor commands can handle errors, so they can easily be converted to using the safer bdrv_try_set_aio_context() function. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Move node without parents to main AioContextKevin Wolf2019-06-043-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A node should only be in a non-default AioContext if a user is attached to it that requires this. When the last parent of a node is gone, it can move back to the main AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-block-iothread: BlockBackend AioContext across root node changeKevin Wolf2019-06-041-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test that BlockBackends preserve their assigned AioContext even when the root node goes away. Inserting a new root node will move it to the right AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-block-iothread: Test adding parent to iothread nodeKevin Wolf2019-06-041-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Opening a new parent node for a node that has already been moved into a different AioContext must cause the new parent to move into the same context. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Adjust AioContexts when attaching nodesKevin Wolf2019-06-045-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far, we only made sure that updating the AioContext of a node affected the whole subtree. However, if a node is newly attached to a new parent, we also need to make sure that both the subtree of the node and the parent are in the same AioContext. This tries to move the new child node to the parent AioContext and returns an error if this isn't possible. BlockBackends now actually apply their AioContext to their root node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | scsi-disk: Use qdev_prop_drive_iothreadKevin Wolf2019-06-044-15/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes use of qdev_prop_drive_iothread for scsi-disk so that the disk can be attached to a node that is already in the target AioContext. We need to check that the HBA actually supports iothreads, otherwise scsi-disk must make sure that the node is already in the main AioContext. This changes the error message for conflicting iothread settings. Previously, virtio-scsi produced the error message, now it comes from blk_set_aio_context(). Update a test case accordingly. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Add qdev_prop_drive_iothread property typeKevin Wolf2019-06-043-7/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some qdev block devices have support for iothreads and take care of the AioContext they are running in, but most devices don't know about any of this. For the latter category, the qdev drive property must make sure that their BlockBackend is in the main AioContext. Unfortunately, while the current code just does the same thing for devices that do support iothreads, this is not correct and it would show as soon as we actually try to keep a consistent AioContext assignment across all nodes and users of a block graph subtree: If a node is already in a non-default AioContext because of one of its users, attaching a new device should still be possible if that device can work in the same AioContext. Switching the node back to the main context first and only then into the device AioContext causes failure (because the existing user wouldn't allow the switch to the main context). So devices that support iothreads need a different kind of drive property that leaves the node in its current AioContext, but by using this type, the device promises to check later that it can work with this context. This patch adds the qdev infrastructure that allows devices to signal that they handle iothreads and qdev should leave the AioContext alone. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Add BlockBackend.ctxKevin Wolf2019-06-0433-64/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Add Error to blk_set_aio_context()Kevin Wolf2019-06-047-34/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an Error parameter to blk_set_aio_context() and use bdrv_child_try_set_aio_context() internally to check whether all involved nodes can actually support the AioContext switch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | nbd-server: Call blk_set_allow_aio_context_change()Kevin Wolf2019-06-043-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NBD server uses an AioContext notifier, so it can tolerate that its BlockBackend is switched to a different AioContext. Before we start actually calling bdrv_try_set_aio_context(), which checks for consistency, outside of test cases, we need to make sure that the NBD server actually allows this. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
| * | | test-block-iothread: Check filter node in test_propagate_mirrorKevin Wolf2019-06-041-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Just make the test cover the AioContext of the filter node as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
| * | | nvme: add Get/Set Feature Timestamp supportKenneth Heitke2019-06-044-2/+110
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com> Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block/linux-aio: Drop unused BlockAIOCB submission methodJulia Suvorova2019-06-042-65/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Callback-based laio_submit() and laio_cancel() were left after rewriting Linux AIO backend to coroutines in hope that they would be used in other code that could bypass coroutines. They can be safely removed because they have not been used since that time. Signed-off-by: Julia Suvorova <jusual@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>