| Commit message (Collapse) | Author | Files | Lines |
|
When talking about the job state machine, we refer to the states like
READY, ABORTING, CONCLUDED, and so forth. Except in two places, where
we use JOB_STATUS_CONCLUDED. Replace by CONCLUDED for consistency.
We should arguably use the JobStatus enum values instead. Left for
another day.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-13-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Several doc comments mention block-job-cancel where the more generic
job-cancel would also work. Adjust them to mention both.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-12-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
We deprecated several block-job-FOO commands in commit
b836bf2ab68 (qapi/block-core: deprecate some block-job- APIs). Update
the doc comments to refer to their replacements instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-11-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
The doc comment misspells JSON null as NULL. Fix that.
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-10-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-9-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-8-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Improve awkward phrasing in migrate-incoming While there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-7-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
There is a (Since: 2.11) in a query-hotpluggable-cpus example.
Versioning information ought to be in the command description, not
examples. The command description is basically empty (there is a TODO
about it).
What exactly didn't work before 2.11 is not quite clear from the
documentation. The example was added in commit 4dc3b151882 (s390x:
implement query-hotpluggable-cpus), which suggests the command failed
for the s390x target until then. This was almost eight years ago, and
I doubt anyone still cares about this detail. Simply delete
the problematic (Since: 2.11).
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-6-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Easier on the eyes and for grep.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-5-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
By convention, we put (since X.Y) at the end of the description. Move
the ones that somehow ended up in the middle of the description to the
end.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-3-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Fixes: a937b6aa739f (qapi: Reformat doc comments to conform to current conventions)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-2-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
This change improves performance by moving the hot path of the trace_vhost_commit()(or any other trace function) logic to the header file.
Previously, even when the trace event was disabled, the function call chain:-
trace_vhost_commit()(Or any other trace function) → _nocheck__trace_vhost_commit() → _simple_trace_vhost_commit()
incurred a significant function prologue overhead before checking the trace state.
Disassembly of _simple_trace_vhost_commit() (from the .c file) showed that 11 out of the first 14 instructions were prologue-related, including:
0x10 stp x29, x30, [sp, #-64]! Prologue: allocates 64-byte frame and saves old FP (x29) & LR (x30)
0x14 adrp x3, trace_events_enabled_count Prologue: computes page-base of the trace-enable counter
0x18 adrp x2, __stack_chk_guard Important (maybe prolog don't know?)(stack-protector): starts up the stack-canary load
0x1c mov x29, sp Prologue: sets new frame pointer
0x20 ldr x3, [x3] Prologue: loads the actual trace-enabled count
0x24 stp x19, x20, [sp, #16] Prologue: spills callee-saved regs used by this function (x19, x20)
0x28 and w20, w0, #0xff Tracepoint setup: extracts the low-8 bits of arg0 as the “event boolean”
0x2c ldr x2, [x2] Prologue (cont’d): completes loading of the stack-canary value
0x30 and w19, w1, #0xff Tracepoint setup: extracts low-8 bits of arg1
0x34 ldr w0, [x3] Important: loads the current trace-enabled flag from memory
0x38 ldr x1, [x2] Prologue (cont’d): reads the canary
0x3c str x1, [sp, #56] Prologue (cont’d): writes the canary into the new frame
0x40 mov x1, #0 Prologue (cont’d): zeroes out x1 for the upcoming branch test
0x44 cbnz w0, 0x88 Important: if tracing is disabled (w0==0) skip the heavy path entirely
The trace-enabled check happens after the prologue. This is wasteful when tracing is disabled, which is often the case in production.
To optimize this:
_nocheck__trace_vhost_commit() is now fully inlined in the .h file with
the hot path.It checks trace_event_get_state() before calling into _simple_trace_vhost_commit(), which remains in .c.
This avoids calling into the .c function altogether when the tracepoint is disabled, thereby skipping unnecessary prologue instructions.
This results in better performance by removing redundant instructions in the tracing fast path.
Signed-off-by: Tanish Desai <tanishdesai37@gmail.com>
Message-id: 20250528192528.3968-1-tanishdesai37@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Commit 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
attempted to refactor RTC IRQ wiring which was previously done in
pc_basic_device_init() but forgot about the isapc machine. Fix this by
wiring in the code section dedicated exclusively to the isapc machine.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2961
Fixes: 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
cc: qemu-stable
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Message-Id: <20250526203820.1853-1-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Current memory operations like pinning may take a lot of time at the
destination. Currently they are done after the source of the migration is
stopped, and before the workload is resumed at the destination. This is a
period where neigher traffic can flow, nor the VM workload can continue
(downtime).
We can do better as we know the memory layout of the guest RAM at the
destination from the moment that all devices are initializaed. So
moving that operation allows QEMU to communicate the kernel the maps
while the workload is still running in the source, so Linux can start
mapping them.
As a small drawback, there is a time in the initialization where QEMU
cannot respond to QMP etc. By some testing, this time is about
0.2seconds. This may be further reduced (or increased) depending on the
vdpa driver and the platform hardware, and it is dominated by the cost
of memory pinning.
This matches the time that we move out of the called downtime window.
The downtime is measured as the elapsed trace time between the last
vhost_vdpa_suspend on the source and the last vhost_vdpa_set_vring_enable_one
on the destination. In other words, from "guest CPUs freeze" to the
instant the final Rx/Tx queue-pair is able to start moving data.
Using ConnectX-6 Dx (MLX5) NICs in vhost-vDPA mode with 8 queue-pairs,
the series reduces guest-visible downtime during back-to-back live
migrations by more than half:
- 39G VM: 4.72s -> 2.09s (-2.63s, ~56% improvement)
- 128G VM: 14.72s -> 5.83s (-8.89s, ~60% improvement)
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-8-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
As we are moving to keep the mapping through all the vdpa device life
instead of resetting it at VirtIO reset, we need to move all its
dependencies to the initialization too. In particular devices with
x-svq=on need a valid iova_tree from the beginning.
Simplify the code also consolidating the two creation points: the first
data vq in case of SVQ active and CVQ start in case only CVQ uses it.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-7-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Since commit f6fe3e333f ("vdpa: move memory listener to
vhost_vdpa_shared") this piece of code repeatedly assign
shared->listener members. This was not a problem as it was not used
until device start.
However next patches move the listener registration to this
vhost_vdpa_init function. When the listener is registered it is added
to an embedded linked list, so setting its members again will cause
memory corruption to the linked list node.
Do the right thing and only set it in the first vdpa device.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-6-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Check if the listener has been registered or not, so it needs to be
registered again at start.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-5-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The backend does not reset them until the vdpa file descriptor is closed
so there is no harm in doing it only once.
This allows the destination of a live migration to premap memory in
batches, using VHOST_BACKEND_F_IOTLB_BATCH.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
It will be used directly by vhost_vdpa_init.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-3-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
To map the guest memory while it is migrating we need to create the
iova_tree, as long as the destination uses x-svq=on. Checking to not
override it.
The function vhost_vdpa_net_client_stop clear it if the device is
stopped. If the guest starts the device again, the iova tree is
recreated by vhost_vdpa_net_data_start_first or vhost_vdpa_net_cvq_start
if needed, so old behavior is kept.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-2-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The vring call fd is set even when the guest does not use MSI-X (e.g., in the
case of virtio PMD), leading to unnecessary CPU overhead for processing
interrupts.
The commit 96a3d98d2c("vhost: don't set vring call if no vector") optimized the
case where MSI-X is enabled but the queue vector is unset. However, there's an
additional case where the guest uses INTx and the INTx_DISABLED bit in the PCI
config is set, meaning that no interrupt notifier will actually be used.
In such cases, the vring call fd should also be cleared to avoid redundant
interrupt handling.
Fixes: 96a3d98d2c("vhost: don't set vring call if no vector")
Reported-by: Zhiyuan Yuan <yuanzhiyuan@chinatelecom.cn>
Signed-off-by: Jidong Xia <xiajd@chinatelecom.cn>
Signed-off-by: Huaitong Han <hanht2@chinatelecom.cn>
Message-Id: <20250522100548.212740-1-hanht2@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redha |