Diffstat (limited to 'docs')
-rw-r--r-- docs/about/emulation.rst 2
-rw-r--r-- docs/devel/crypto.rst 10
-rw-r--r-- docs/devel/index-internals.rst 1
-rw-r--r-- docs/devel/luks-detached-header.rst 182
-rw-r--r-- docs/interop/firmware.json 47
-rw-r--r-- docs/interop/qemu-ga.rst 20
-rw-r--r-- docs/specs/acpi_hw_reduced_hotplug.rst 3
-rw-r--r-- docs/specs/index.rst 2
-rw-r--r-- docs/specs/rapl-msr.rst 155
-rw-r--r-- docs/specs/spdm.rst 134
-rw-r--r-- docs/system/index.rst 1
-rw-r--r-- docs/system/sriov.rst 36
-rw-r--r-- docs/tools/index.rst 1
-rw-r--r-- docs/tools/qemu-vmsr-helper.rst 89
14 files changed, 676 insertions, 7 deletions
diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
index b5ff9c5f69..3bfe8cc14a 100644
--- a/docs/about/emulation.rst
+++ b/docs/about/emulation.rst
@@ -42,7 +42,7 @@ depending on the guest architecture.
     - :ref:`Yes<QEMU-PC-System-emulator>`
     - Yes
     - The ubiquitous desktop PC CPU architecture, 32 and 64 bit.
-  * - Loongarch
+  * - LoongArch
     - Yes
     - Yes
     - A MIPS-like 64bit RISC architecture developed in China
diff --git a/docs/devel/crypto.rst b/docs/devel/crypto.rst
new file mode 100644
index 0000000000..39b1c910e7
--- /dev/null
+++ b/docs/devel/crypto.rst
@@ -0,0 +1,10 @@
+.. _crypto-ref:
+
+====================
+Cryptography in QEMU
+====================
+
+.. toctree::
+   :maxdepth: 2
+
+   luks-detached-header
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 5636e9cf1d..4ac7725d72 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -20,3 +20,4 @@ Details about QEMU's various subsystems including how to add features to them.
    vfio-iommufd
    writing-monitor-commands
    virtio-backends
+   crypto
diff --git a/docs/devel/luks-detached-header.rst b/docs/devel/luks-detached-header.rst
new file mode 100644
index 0000000000..94ec285c27
--- /dev/null
+++ b/docs/devel/luks-detached-header.rst
@@ -0,0 +1,182 @@
+================================
+LUKS volume with detached header
+================================
+
+Introduction
+============
+
+This document gives an overview of the design of LUKS volume with detached
+header and how to use it.
+
+Background
+==========
+
+The LUKS format has the ability to store the header in a separate volume from
+the payload. We could extend the LUKS driver in QEMU to support this use
+case.
+
+Normally a LUKS volume has a layout:
+
+::
+
+         +-----------------------------------------------+
+         |         |                |                    |
+ disk    | header  |  key material  |  disk payload data |
+         |         |                |                    |
+         +-----------------------------------------------+
+
+With a detached LUKS header, you need two disks, giving:
+
+::
+
+         +--------------------------+
+ disk1   |   header  | key material |
+         +--------------------------+
+         +---------------------+
+ disk2   |  disk payload data  |
+         +---------------------+
+
+There are a variety of benefits to doing this:
+
+ * Secrecy - disk2 cannot be identified as containing a LUKS
+             volume since there's no header
+ * Control - if access to disk1 is restricted, then even
+             if someone has access to disk2 they can't unlock
+             it. Might be useful if you have disks on NFS but
+             want to restrict which host can launch a VM
+             instance from it, by dynamically providing access
+             to the header to a designated host
+ * Flexibility - your application data volume may be a given
+                 size and it is inconvenient to resize it to
+                 add encryption. You can store the LUKS header
+                 separately and use the existing storage
+                 volume for payload
+ * Recovery - corruption of a bit in the header may make the
+              entire payload inaccessible. It might be
+              convenient to take backups of the header. If
+              your primary disk header becomes corrupt, you
+              can still unlock the data by pointing to the
+              backup detached header
+
+Architecture
+============
+
+Take qcow2 encryption as an example. The architecture of the
+LUKS volume with detached header is shown in the diagram below.
+
+There are two children of the root node: a file and a header.
+Data from the disk payload is stored in the file node. The
+LUKS header and key material are located in the header node,
+as previously mentioned.
+
+::
+
+                       +-----------------------------+
+  Root node            |          foo[luks]          |
+                       +-----------------------------+
+                          |                       |
+                     file |                header |
+                          |                       |
+               +---------------------+    +------------------+
+  Child node   |payload-format[qcow2]|    |header-format[raw]|
+               +---------------------+    +------------------+
+                          |                       |
+                     file |                 file  |
+                          |                       |
+               +----------------------+  +---------------------+
+  Child node   |payload-protocol[file]|  |header-protocol[file]|
+               +----------------------+  +---------------------+
+                          |                       |
+                          |                       |
+                          |                       |
+                     Host storage            Host storage
+
+Usage
+=====
+
+Create a LUKS disk with a detached header using qemu-img
+--------------------------------------------------------
+
+Shell command line::
+
+  # qemu-img create --object secret,id=sec0,data=abc123 -f luks \
+    -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0 \
+    -o detached-header=true test-header.img
+  # qemu-img create -f qcow2 test-payload.qcow2 200G
+  # qemu-img info 'json:{"driver":"luks","file":{"filename": \
+    "test-payload.qcow2"},"header":{"filename":"test-header.img"}}'
+
+Set up a VM's LUKS volume with a detached header
+------------------------------------------------
+
+QEMU command line::
+
+  # qemu-system-x86_64 ... \
+    -object '{"qom-type":"secret","id":"libvirt-3-format-secret", \
+    "data":"abc123"}' \
+    -blockdev '{"driver":"file","filename":"/path/to/test-header.img", \
+    "node-name":"libvirt-1-storage"}' \
+    -blockdev '{"node-name":"libvirt-1-format","read-only":false, \
+    "driver":"raw","file":"libvirt-1-storage"}' \
+    -blockdev '{"driver":"file","filename":"/path/to/test-payload.qcow2", \
+    "node-name":"libvirt-2-storage"}' \
+    -blockdev '{"node-name":"libvirt-2-format","read-only":false, \
+    "driver":"qcow2","file":"libvirt-2-storage"}' \
+    -blockdev '{"node-name":"libvirt-3-format","driver":"luks", \
+    "file":"libvirt-2-format","header":"libvirt-1-format","key-secret": \
+    "libvirt-3-format-secret"}' \
+    -device '{"driver":"virtio-blk-pci","bus":XXX,"addr":YYY,"drive": \
+    "libvirt-3-format","id":"virtio-disk1"}'
+
+Add LUKS volume to a VM with a detached header
+----------------------------------------------
+
+1. object-add the secret for decrypting the cipher stored in
+   LUKS header above::
+
+    # virsh qemu-monitor-command vm '{"execute":"object-add", \
+      "arguments":{"qom-type":"secret", "id": \
+      "libvirt-4-format-secret", "data":"abc123"}}'
+
+2. blockdev-add the protocol node for the LUKS header::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-1-storage", "driver":"file", \
+      "filename": "/path/to/test-header.img" }}'
+
+3. blockdev-add the raw-driver format node for the LUKS header::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-1-format", "driver":"raw", \
+      "file":"libvirt-1-storage"}}'
+
+4. blockdev-add the protocol node for the disk payload image::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-2-storage", "driver":"file", \
+      "filename":"/path/to/test-payload.qcow2"}}'
+
+5. blockdev-add the qcow2-driver format node for the disk payload data::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-2-format", "driver":"qcow2", \
+      "file":"libvirt-2-storage"}}'
+
+6. blockdev-add the luks-driver format node to link the qcow2 disk
+   with the LUKS header by specifying the field "header"::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-3-format", "driver":"luks", \
+      "file":"libvirt-2-format", "header":"libvirt-1-format", \
+      "key-secret":"libvirt-4-format-secret"}}'
+
+7. Finally, hot-plug the virtio-blk device::
+
+    # virsh qemu-monitor-command vm '{"execute":"device_add", \
+      "arguments": {"driver":"virtio-blk-pci", \
+      "drive": "libvirt-3-format", "id":"virtio-disk2"}}'
+
+TODO
+====
+
+1. Support the shared detached LUKS header within the VM.
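Reviewer note on docs/devel/luks-detached-header.rst: the hot-add sequence above can be sketched as data, which makes the node graph easier to check. This is a minimal sketch, not part of the patch. The function name and the node names (``hdr-storage``, ``hdr-format``, ``payload-storage``, ``payload-format``, ``luks-format``) are hypothetical; the QMP commands (``object-add``, ``blockdev-add``) and the ``header`` and ``key-secret`` fields are the ones used in the document.

```python
import json


def detached_header_qmp(header_path, payload_path, secret_id, secret_data):
    """Build the QMP commands that add a LUKS volume with a detached header.

    Node names are illustrative only; any unique names would do.
    """
    def blockdev_add(arguments):
        return {"execute": "blockdev-add", "arguments": arguments}

    return [
        # 1. secret used to unlock the LUKS header
        {"execute": "object-add",
         "arguments": {"qom-type": "secret", "id": secret_id,
                       "data": secret_data}},
        # 2./3. protocol and raw format nodes for the header
        blockdev_add({"node-name": "hdr-storage", "driver": "file",
                      "filename": header_path}),
        blockdev_add({"node-name": "hdr-format", "driver": "raw",
                      "file": "hdr-storage"}),
        # 4./5. protocol and qcow2 format nodes for the payload
        blockdev_add({"node-name": "payload-storage", "driver": "file",
                      "filename": payload_path}),
        blockdev_add({"node-name": "payload-format", "driver": "qcow2",
                      "file": "payload-storage"}),
        # 6. luks node linking payload and header; the secret id must match
        #    the one registered in step 1
        blockdev_add({"node-name": "luks-format", "driver": "luks",
                      "file": "payload-format", "header": "hdr-format",
                      "key-secret": secret_id}),
    ]


if __name__ == "__main__":
    for cmd in detached_header_qmp("/path/to/test-header.img",
                                   "/path/to/test-payload.qcow2",
                                   "sec0", "abc123"):
        print(json.dumps(cmd))
```

Each printed line could be passed to ``virsh qemu-monitor-command``. Generating the secret id once and reusing it avoids a mismatch between the ``object-add`` step and the ``key-secret`` field.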
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 54a1fc6c10..57f55f6c54 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -14,8 +14,10 @@
 # = Firmware
 ##
 
-{ 'include' : 'machine.json' }
-{ 'include' : 'block-core.json' }
+{ 'pragma': {
+    'member-name-exceptions': [
+        'FirmwareArchitecture' # x86_64
+    ] } }
 
 ##
 # @FirmwareOSInterface:
@@ -61,6 +63,27 @@
   'data' : [ 'flash', 'kernel', 'memory' ] }
 
 ##
+# @FirmwareArchitecture:
+#
+# Enumeration of architectures for which QEMU uses additional
+# firmware files.
+#
+# @aarch64: 64-bit Arm.
+#
+# @arm: 32-bit Arm.
+#
+# @i386: 32-bit x86.
+#
+# @loongarch64: 64-bit LoongArch. (since: 7.1)
+#
+# @x86_64: 64-bit x86.
+#
+# Since: 3.0
+##
+{ 'enum' : 'FirmwareArchitecture',
+  'data' : [ 'aarch64', 'arm', 'i386', 'loongarch64', 'x86_64' ] }
+
+##
 # @FirmwareTarget:
 #
 # Defines the machine types that firmware may execute on.
@@ -81,7 +104,7 @@
 # Since: 3.0
 ##
 { 'struct' : 'FirmwareTarget',
-  'data'   : { 'architecture' : 'SysEmuTarget',
+  'data'   : { 'architecture' : 'FirmwareArchitecture',
                'machines'     : [ 'str' ] } }
 
 ##
@@ -201,6 +224,20 @@
              'verbose-dynamic', 'verbose-static' ] }
 
 ##
+# @FirmwareFormat:
+#
+# Formats that are supported for firmware images.
+#
+# @raw: Raw disk image format.
+#
+# @qcow2: The QCOW2 image format.
+#
+# Since: 3.0
+##
+{ 'enum': 'FirmwareFormat',
+  'data': [ 'raw', 'qcow2' ] }
+
+##
 # @FirmwareFlashFile:
 #
 # Defines common properties that are necessary for loading a firmware
@@ -219,7 +256,7 @@
 ##
 { 'struct' : 'FirmwareFlashFile',
   'data'   : { 'filename' : 'str',
-               'format'   : 'BlockdevDriver' } }
+               'format'   : 'FirmwareFormat' } }
 
 
 ##
@@ -433,7 +470,7 @@
 #
 # Since: 3.0
 #
-# Examples:
+# .. qmp-example::
 #
 #     {
 #         "description": "SeaBIOS",
diff --git a/docs/interop/qemu-ga.rst b/docs/interop/qemu-ga.rst
index 72fb75a6f5..9c7380896a 100644
--- a/docs/interop/qemu-ga.rst
+++ b/docs/interop/qemu-ga.rst
@@ -28,11 +28,30 @@ configuration options on the command line. For the same key, the last
 option wins, but the lists accumulate (see below for configuration
 file format).
 
+If an allowed RPCs list is defined in the configuration, then all
+RPCs will be blocked by default, except for the allowed list.
+
+If a blocked RPCs list is defined in the configuration, then all
+RPCs will be allowed by default, except for the blocked list.
+
+If both allowed and blocked RPCs lists are defined in the configuration,
+then all RPCs will be blocked by default, then the allowed list will
+be applied, followed by the blocked list.
+
+While filesystems are frozen, all RPCs except for a designated safe set
+will be blocked, regardless of what the general configuration
+declares.
+
 Options
 -------
 
 .. program:: qemu-ga
 
+.. option:: -c, --config=PATH
+
+  Configuration file path (the default is |CONFDIR|\ ``/qemu-ga.conf``,
+  unless overridden by the QGA_CONF environment variable)
+
 .. option:: -m, --method=METHOD
 
   Transport method: one of ``unix-listen``, ``virtio-serial``, or
@@ -131,6 +150,7 @@ fsfreeze-hook  string
 statedir       string
 verbose        boolean
 block-rpcs     string list
+allow-rpcs     string list
 =============  ===========
 
 See also
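Reviewer note on docs/interop/qemu-ga.rst: the allow/block precedence described in the new paragraphs can be sketched as a small decision function. This is a sketch of one reading of the text, not the agent's actual code. The function name, its parameters and the contents of the example safe set are assumptions; ``None`` stands for "list not defined in the configuration".

```python
def rpc_permitted(name, allow=None, block=None, frozen=False,
                  safe_when_frozen=frozenset()):
    """Return True if RPC `name` may run under the documented rules."""
    # While filesystems are frozen, only the designated safe set runs,
    # regardless of the general configuration.
    if frozen:
        return name in safe_when_frozen

    # With an allow list, everything starts out blocked; otherwise
    # everything starts out allowed.
    permitted = name in allow if allow is not None else True

    # The block list is applied last, so it wins over the allow list.
    if block is not None and name in block:
        permitted = False
    return permitted


if __name__ == "__main__":
    allow = {"guest-ping", "guest-info"}
    block = {"guest-info"}
    for rpc in ("guest-ping", "guest-info", "guest-shutdown"):
        print(rpc, rpc_permitted(rpc, allow=allow, block=block))
```

Applying the block list last means an RPC named in both lists ends up blocked, matching "then the allowed list will be applied, followed by the blocked list".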
diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst b/docs/specs/acpi_hw_reduced_hotplug.rst
index 0bd3f9399f..3acd6fcd8b 100644
--- a/docs/specs/acpi_hw_reduced_hotplug.rst
+++ b/docs/specs/acpi_hw_reduced_hotplug.rst
@@ -64,7 +64,8 @@ GED IO interface (4 byte access)
        0: Memory hotplug event
        1: System power down event
        2: NVDIMM hotplug event
-    3-31: Reserved
+       3: CPU hotplug event
+    4-31: Reserved
 
 **write_access:**
 
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index 1484e3e760..be899b49c2 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -29,7 +29,9 @@ guest hardware that is specific to QEMU.
    edu
    ivshmem-spec
    pvpanic
+   spdm
    standard-vga
    virt-ctlr
    vmcoreinfo
    vmgenid
+   rapl-msr
diff --git a/docs/specs/rapl-msr.rst b/docs/specs/rapl-msr.rst
new file mode 100644
index 0000000000..1202ee89be
--- /dev/null
+++ b/docs/specs/rapl-msr.rst
@@ -0,0 +1,155 @@
+================
+RAPL MSR support
+================
+
+The RAPL interface (Running Average Power Limit) advertises the accumulated
+energy consumption of various power domains (e.g. CPU packages, DRAM, etc.).
+
+The consumption is reported via MSRs (model specific registers) like
+MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are 64-bit
+registers that represent the accumulated energy consumption in microjoules.
+
+Thanks to the MSR Filtering patch [#a]_, not all MSRs are handled by KVM. Some
+of them can now be handled by userspace (QEMU). With "MSR filtering", a list
+of MSRs is given to KVM when the VM is initialised so that accesses to them
+are forwarded to a callback in userspace. The design of this patch uses only
+this mechanism for handling the MSRs between guest and host.
+
+At the moment the following MSRs are involved:
+
+.. code:: C
+
+    #define MSR_RAPL_POWER_UNIT             0x00000606
+    #define MSR_PKG_POWER_LIMIT             0x00000610
+    #define MSR_PKG_ENERGY_STATUS           0x00000611
+    #define MSR_PKG_POWER_INFO              0x00000614
+
+The ``*_POWER_UNIT``, ``*_POWER_LIMIT`` and ``*_POWER_INFO`` MSRs are part of
+the RAPL spec. They specify the power limit of the package, provide the range
+of parameters (min power, max power, ...) and give the multiplier used to
+convert the energy counter into energy units. Those MSRs are populated once at
+the beginning by reading the host CPU MSRs and are given back to the guest 1:1
+when requested.
+
+The MSR_PKG_ENERGY_STATUS is a counter; it represents the total amount of
+energy consumed since the last time the register was cleared. If you multiply
+it by the UNIT provided above you'll get the energy in microjoules. This
+counter is always increasing, and it increases faster or slower depending on
+the consumption of the package. This counter is expected to overflow at some
+point.
+
+Each core belonging to the same package that reads MSR_PKG_ENERGY_STATUS
+(i.e. "rdmsr 0x611") will retrieve the same value, which represents the energy
+for the whole package. A core that belongs to PKG-0 will not be able to get
+the value of PKG-1 and vice-versa.
+
+High level implementation
+-------------------------
+
+In order to update the value of the virtual MSR, a QEMU thread is created.
+The thread is basically just an infinite loop that does:
+
+1. Snapshot of the time metrics of all QEMU threads (Time spent scheduled in
+   Userspace and System)
+
+2. Snapshot of the actual MSR_PKG_ENERGY_STATUS counter of all packages on
+   which the QEMU threads are running.
+
+3. Sleep for 1 second - During this pause the vcpu and other non-vcpu threads
+   will do what they have to do and so the energy counter will increase.
+
+4. Repeat 2. and 3. and calculate the delta of every metric representing the
+   time spent scheduled for each QEMU thread *and* the energy spent by the
+   packages during the pause.
+
+5. Filter the vcpu threads and the non-vcpu threads.
+
+6. Retrieve the topology of the Virtual Machine. This helps identify which
+   vCPU is running on which virtual package.
+
+7. The total energy spent by the non-vcpu threads is divided by the number
+   of vcpu threads so that each vcpu thread will get an equal part of the
+   energy spent by the QEMU workers.
+
+8. Calculate the ratio of energy spent per vcpu threads.
+
+9. Calculate the energy for each virtual package.
+
+10. The virtual MSRs are updated for each virtual package. Each vCPU that
+    belongs to the same package will return the same value when accessing the
+    MSR.
+
+11. Loop back to 1.
+
+Ratio calculation
+-----------------
+
+In Linux, a process has an execution time associated with it. The scheduler
+divides time into clock ticks. The number of clock ticks per second can be
+obtained with the sysconf function. A typical value is 100 clock ticks per
+second, so a core can run a process for at most 100 ticks per second. If a
+package has 4 cores, at most 400 ticks can be scheduled on all the cores
+of the package during a period of 1 second.
+
+The /proc/[pid]/stat [#b]_ file is a procfs file that gives the execution time
+of the process whose process ID is [pid]. It gives the number of ticks the
+process has been scheduled in userspace (utime) and kernel space (stime).
+
+By reading those metrics for a thread, one can calculate the ratio of time the
+package has spent executing the thread.
+
+Example:
+
+A 4-core package can schedule a maximum of 400 ticks per second with 100 ticks
+per second per core. If a thread was scheduled for 100 ticks during one second
+on this package, the thread has been scheduled for 1/4 of the whole
+package. The energy spent by the thread on this package during that second is
+therefore 1/4 of the total energy spent by the package.
+
+Usage
+-----
+
+Currently this feature only works on an Intel CPU that has the RAPL driver
+loaded and available in sysfs. If not, QEMU fails at start-up.
+
+This feature is activated with
+``-accel kvm,rapl=true,rapl-helper-socket=/path/sock.sock``.
+
+It is important that the socket path is the same as the one
+:program:`qemu-vmsr-helper` is listening to.
+
+qemu-vmsr-helper
+----------------
+
+The qemu-vmsr-helper works very much like the qemu-pr-helper. Instead of
+making persistent reservations, qemu-vmsr-helper is here to work around the fix
+for CVE-2020-8694, which removed user access to the RAPL MSR attributes.
+
+A socket communication is established between the QEMU processes that have the
+RAPL MSR support activated and the qemu-vmsr-helper. A systemd service and
+socket activation are provided in
+contrib/systemd/qemu-vmsr-helper.(service/socket).
+
+The systemd socket uses mode 600, like contrib/systemd/qemu-pr-helper.socket.
+The socket can be passed via SCM_RIGHTS by libvirt, or its permissions can be
+changed (e.g. 660 and root:kvm on a Debian system). Libvirt could
+also start a separate helper if needed. All in all, the policy is left to the
+user.
+
+See the qemu-pr-helper documentation or manpage for further details.
+
+Current Limitations
+-------------------
+
+- Works only on Intel host CPUs because AMD CPUs are using different MSR
+  addresses.
+
+- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at the
+  moment.
+
+References
+----------
+
+.. [#a] https://patchwork.kernel.org/project/kvm/patch/20200916202951.23760-7-graf@amazon.com/
+.. [#b] https://man7.org/linux/man-pages/man5/proc.5.html
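Reviewer note on docs/specs/rapl-msr.rst: the "Ratio calculation" section can be illustrated with a short sketch. It is not the QEMU implementation. The function names and the sample stat line are hypothetical; the field positions (utime is field 14, stime is field 15) follow the proc(5) man page cited above.

```python
import os


def parse_stat_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/[pid]/stat line."""
    # The comm field (field 2) may contain spaces and parentheses, so split
    # after the last ')' instead of splitting the whole line on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), hence field 14 is rest[11], field 15 rest[12].
    return int(rest[11]), int(rest[12])


def thread_energy_share(delta_ticks, cores_in_package, ticks_per_second,
                        interval_seconds, package_energy_delta):
    """Energy attributed to a thread: its share of the package's tick budget."""
    max_ticks = cores_in_package * ticks_per_second * interval_seconds
    return (delta_ticks / max_ticks) * package_energy_delta


if __name__ == "__main__":
    # On Linux, the tick rate comes from sysconf; 100 is a typical value.
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    # Made-up stat line: utime=70 and stime=30 ticks.
    sample = "1234 (qemu (vcpu 0)) S 1 1234 1234 0 -1 4194304 100 0 0 0 70 30 0 0"
    utime, stime = parse_stat_ticks(sample)
    # Hypothetical package energy delta of 2000 units over one second.
    print(thread_energy_share(utime + stime, 4, ticks_per_second, 1, 2000))
```

In the document's worked example (4 cores, 100 ticks per second, a thread scheduled for 100 ticks in one second) the ratio is 100/400, so the thread is charged a quarter of the package's energy delta.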
diff --git a/docs/specs/spdm.rst b/docs/specs/spdm.rst
new file mode 100644
index 0000000000..f7de080ff0
--- /dev/null
+++ b/docs/specs/spdm.rst
@@ -0,0 +1,134 @@
+====================================================
+QEMU Security Protocol and Data Model (SPDM) Support
+====================================================
+
+SPDM enables authentication, attestation and key exchange to assist in
+providing infrastructure security enablement. It's a standard published
+by the `DMTF`_.
+
+QEMU supports connecting to an SPDM responder implementation. This allows an
+external application to emulate the SPDM responder logic for an SPDM device.
+
+Setting up an SPDM server
+=========================
+
+When using QEMU with SPDM devices, QEMU will connect to a server which
+implements the SPDM functionality.
+
+SPDM-Utils
+----------
+
+You can use `SPDM Utils`_ to emulate a responder. This is the simplest method.
+
+SPDM-Utils is a Linux application to manage, test and develop devices
+supporting DMTF Security Protocol and Data Model (SPDM). It is written in Rust
+and utilises libspdm.
+
+To use SPDM-Utils you will need to do the following steps. Details are included
+in the SPDM-Utils README.
+
+ 1. `Build libspdm`_
+ 2. `Build SPDM Utils`_
+ 3. `Run it as a server`_
+
+spdm-emu
+--------
+
+You can use `spdm emu`_ to model the
+SPDM responder.
+
+.. code-block:: shell
+
+    $ cd spdm-emu
+    $ git submodule init; git submodule update --recursive
+    $ mkdir build; cd build
+    $ cmake -DARCH=x64 -DTOOLCHAIN=GCC -DTARGET=Debug -DCRYPTO=openssl ..
+    $ make -j32
+    $ make copy_sample_key # Build certificates, required for SPDM authentication.
+
+It is worth noting that the certificates should be in compliance with
+PCIe r6.1 sec 6.31.3. This means you will need to add the following to
+openssl.cnf:
+
+.. code-block::
+
+    subjectAltName = otherName:2.23.147;UTF8:Vendor=1b36:Device=0010:CC=010802:REV=02:SSVID=1af4:SSID=1100
+    2.23.147 = ASN1:OID:2.23.147
+
+and then manually regenerate some certificates with:
+
+.. code-block:: shell
+
+    $ openssl req -nodes -newkey ec:param.pem -keyout end_responder.key \
+        -out end_responder.req -sha384 -batch \
+        -subj "/CN=DMTF libspdm ECP384 responder cert"
+
+    $ openssl x509 -req -in end_responder.req -out end_responder.cert \
+        -CA inter.cert -CAkey inter.key -sha384 -days 3650 -set_serial 3 \
+        -extensions v3_end -extfile ../openssl.cnf
+
+    $ openssl asn1parse -in end_responder.cert -out end_responder.cert.der
+
+    $ cat ca.cert.der inter.cert.der end_responder.cert.der > bundle_responder.certchain.der
+
+You can use SPDM-Utils instead as it will generate the correct certificates
+automatically.
+
+The responder can then be launched with:
+
+.. code-block:: shell
+
+    $ cd bin
+    $ ./spdm_responder_emu --trans PCI_DOE
+
+Connecting an SPDM NVMe device
+==============================
+
+Once an SPDM server is running we can start QEMU and connect to the server.
+
+For an NVMe device, first set up a backing image to use:
+
+.. code-block:: shell
+
+    $ cd qemu-spdm/linux/image
+    $ dd if=/dev/zero of=blknvme bs=1M count=2096 # 2GB NVMe Drive
+
+Then you can add this to your QEMU command line:
+
+.. code-block:: shell
+
+    -drive file=blknvme,if=none,id=mynvme,format=raw \
+        -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323
+
+QEMU will then try to connect to the SPDM server.
+
+Note that if using x86-64 you will want to use the q35 machine instead
+of the default. So the entire QEMU command might look like this:
+
+.. code-block:: shell
+
+    qemu-system-x86_64 -M q35 \
+        --kernel bzImage \
+        -drive file=rootfs.ext2,if=virtio,format=raw \
+        -append "root=/dev/vda console=ttyS0" \
+        -net none -nographic \
+        -drive file=blknvme,if=none,id=mynvme,format=raw \
+        -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323
+
+.. _DMTF:
+   https://www.dmtf.org/standards/SPDM
+
+.. _SPDM Utils:
+   https://github.com/westerndigitalcorporation/spdm-utils
+
+.. _spdm emu:
+   https://github.com/dmtf/spdm-emu
+
+.. _Build libspdm:
+   https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-libspdm
+
+.. _Build SPDM Utils:
+   https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-the-binary
+
+.. _Run it as a server:
+   https://github.com/westerndigitalcorporation/spdm-utils#qemu-spdm-device-emulation
diff --git a/docs/system/index.rst b/docs/system/index.rst
index c21065e519..718e9d3c56 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -39,3 +39,4 @@ or Hypervisor.Framework.
    multi-process
    confidential-guest-support
    vm-templating
+   sriov
diff --git a/docs/system/sriov.rst b/docs/system/sriov.rst
new file mode 100644
index 0000000000..a851a66a4b
--- /dev/null
+++ b/docs/system/sriov.rst
@@ -0,0 +1,36 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Composable SR-IOV device
+========================
+
+SR-IOV (Single Root I/O Virtualization) is an optional extended capability of a
+PCI Express device. It allows a single physical function (PF) to appear as
+multiple virtual functions (VFs) for the main purpose of eliminating software
+overhead in I/O from virtual machines.
+
+There are devices with predefined SR-IOV configurations, but it is also possible
+to compose an SR-IOV device yourself. Composing an SR-IOV device is currently
+only supported by virtio-net-pci.
+
+Users can configure an SR-IOV-capable virtio-net device by adding
+virtio-net-pci functions to a bus. Below is a command line example:
+
+.. code-block:: shell
+
+    -netdev user,id=n -netdev user,id=o
+    -netdev user,id=p -netdev user,id=q
+    -device pcie-root-port,id=b
+    -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
+
+The VFs specify the paired PF with the ``sriov-pf`` property. The PF must be
+added after all VFs. It is the user's responsibility to ensure that VFs have
+function numbers larger than that of the PF, and that the function numbers
+have a consistent stride.
+
+You may also need to perform additional steps to activate the SR-IOV feature on
+your guest. For Linux, refer to [1]_.
+
+.. [1] https://docs.kernel.org/PCI/pci-iov-howto.html
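Reviewer note on docs/system/sriov.rst: the two constraints the user is responsible for (VF function numbers above the PF's, and a consistent stride) can be checked before building a command line. This is a minimal sketch under stated assumptions. The function name is hypothetical, and "consistent stride" is interpreted here as equal spacing between consecutive VF function numbers; QEMU's own validation may differ.

```python
def check_vf_layout(pf_function, vf_functions):
    """Return True if the VF function numbers satisfy the documented rules."""
    vfs = sorted(vf_functions)
    if not vfs:
        return True  # nothing to validate
    if len(set(vfs)) != len(vfs):
        return False  # duplicate function numbers
    if vfs[0] <= pf_function:
        return False  # every VF must sit above the PF
    # Assumed meaning of "consistent stride": all gaps between
    # consecutive VF function numbers are equal.
    strides = {b - a for a, b in zip(vfs, vfs[1:])}
    return len(strides) <= 1


if __name__ == "__main__":
    # Layout from the example command line: PF at function 0, VFs at 1..3.
    print(check_vf_layout(0, [3, 2, 1]))
```

Sorting first means the order in which ``-device`` options list the VFs does not affect the check.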
diff --git a/docs/tools/index.rst b/docs/tools/index.rst
index 8e65ce0dfc..33ad438e86 100644
--- a/docs/tools/index.rst
+++ b/docs/tools/index.rst
@@ -16,3 +16,4 @@ command line utilities and other standalone programs.
    qemu-pr-helper
    qemu-trace-stap
    virtfs-proxy-helper
+   qemu-vmsr-helper
diff --git a/docs/tools/qemu-vmsr-helper.rst b/docs/tools/qemu-vmsr-helper.rst
new file mode 100644
index 0000000000..6ec87b49d9
--- /dev/null
+++ b/docs/tools/qemu-vmsr-helper.rst
@@ -0,0 +1,89 @@
+==================================
+QEMU virtual RAPL MSR helper
+==================================
+
+Synopsis
+--------
+
+**qemu-vmsr-helper** [*OPTION*]
+
+Description
+-----------
+
+Implements the virtual RAPL MSR helper for QEMU.
+
+Accessing the RAPL (Running Average Power Limit) MSR enables the RAPL powercap
+driver to advertise and monitor the power consumption or accumulated energy
+consumption of different power domains, such as CPU packages, DRAM, and other
+components when available.
+
+However, those registers are only accessible with privileged access
+(CAP_SYS_RAWIO). QEMU can use an external helper to access those privileged
+registers.
+
+:program:`qemu-vmsr-helper` is that external helper; it creates a listener
+socket which will accept incoming connections for communication with QEMU.
+
+If you want to run VMs in a setup like this, this helper should be started as a
+system service, and you should read the QEMU manual section on "RAPL MSR
+support" to find out how to configure QEMU to connect to the socket created by
+:program:`qemu-vmsr-helper`.
+
+After connecting to the socket, :program:`qemu-vmsr-helper` can
+optionally drop root privileges, except for those capabilities that
+are needed for its operation.
+
+:program:`qemu-vmsr-helper` can also use the systemd socket activation
+protocol.  In this case, the systemd socket unit should specify a
+Unix stream socket, like this::
+
+    [Socket]
+    ListenStream=/var/run/qemu-vmsr-helper.sock
+
+Options
+-------
+
+.. program:: qemu-vmsr-helper
+
+.. option:: -d, --daemon
+
+  run in the background (and create a PID file)
+
+.. option:: -q, --quiet
+
+  decrease verbosity
+
+.. option:: -v, --verbose
+
+  increase verbosity
+
+.. option:: -f, --pidfile=PATH
+
+  PID file when running as a daemon. By default the PID file
+  is created in the system runtime state directory, for example
+  :file:`/var/run/qemu-vmsr-helper.pid`.
+
+.. option:: -k, --socket=PATH
+
+  path to the socket. By default the socket is created in
+  the system runtime state directory, for example
+  :file:`/var/run/qemu-vmsr-helper.sock`.
+
+.. option:: -T, --trace [[enable=]PATTERN][,events=FILE][,file=FILE]
+
+  .. include:: ../qemu-option-trace.rst.inc
+
+.. option:: -u, --user=USER
+
+  user to drop privileges to
+
+.. option:: -g, --group=GROUP
+
+  group to drop privileges to
+
+.. option:: -h, --help
+
+  Display a help message and exit.
+
+.. option:: -V, --version
+
+  Display version information and exit.