Diffstat (limited to 'docs')
| -rw-r--r-- | docs/about/emulation.rst | 2 |
| -rw-r--r-- | docs/devel/crypto.rst | 10 |
| -rw-r--r-- | docs/devel/index-internals.rst | 1 |
| -rw-r--r-- | docs/devel/luks-detached-header.rst | 182 |
| -rw-r--r-- | docs/interop/firmware.json | 47 |
| -rw-r--r-- | docs/interop/qemu-ga.rst | 20 |
| -rw-r--r-- | docs/specs/acpi_hw_reduced_hotplug.rst | 3 |
| -rw-r--r-- | docs/specs/index.rst | 2 |
| -rw-r--r-- | docs/specs/rapl-msr.rst | 155 |
| -rw-r--r-- | docs/specs/spdm.rst | 134 |
| -rw-r--r-- | docs/system/index.rst | 1 |
| -rw-r--r-- | docs/system/sriov.rst | 36 |
| -rw-r--r-- | docs/tools/index.rst | 1 |
| -rw-r--r-- | docs/tools/qemu-vmsr-helper.rst | 89 |
14 files changed, 676 insertions, 7 deletions
diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
index b5ff9c5f69..3bfe8cc14a 100644
--- a/docs/about/emulation.rst
+++ b/docs/about/emulation.rst
@@ -42,7 +42,7 @@ depending on the guest architecture.
     - :ref:`Yes<QEMU-PC-System-emulator>`
     - Yes
     - The ubiquitous desktop PC CPU architecture, 32 and 64 bit.
-  * - Loongarch
+  * - LoongArch
     - Yes
     - Yes
     - A MIPS-like 64bit RISC architecture developed in China
diff --git a/docs/devel/crypto.rst b/docs/devel/crypto.rst
new file mode 100644
index 0000000000..39b1c910e7
--- /dev/null
+++ b/docs/devel/crypto.rst
@@ -0,0 +1,10 @@
+.. _crypto-ref:
+
+====================
+Cryptography in QEMU
+====================
+
+.. toctree::
+   :maxdepth: 2
+
+   luks-detached-header
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 5636e9cf1d..4ac7725d72 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -20,3 +20,4 @@ Details about QEMU's various subsystems including how to add features to them.
    vfio-iommufd
    writing-monitor-commands
    virtio-backends
+   crypto
diff --git a/docs/devel/luks-detached-header.rst b/docs/devel/luks-detached-header.rst
new file mode 100644
index 0000000000..94ec285c27
--- /dev/null
+++ b/docs/devel/luks-detached-header.rst
@@ -0,0 +1,182 @@
+================================
+LUKS volume with detached header
+================================
+
+Introduction
+============
+
+This document gives an overview of the design of LUKS volume with detached
+header and how to use it.
+
+Background
+==========
+
+The LUKS format has the ability to store the header in a separate volume from
+the payload. We could extend the LUKS driver in QEMU to support this use
+case.
+
+Normally a LUKS volume has a layout:
+
+::
+
+         +-----------------------------------------------+
+         |         |                |                    |
+ disk    | header  |  key material  |  disk payload data |
+         |         |                |                    |
+         +-----------------------------------------------+
+
+With a detached LUKS header, you need 2 disks so getting:
+
+::
+
+         +--------------------------+
+ disk1   |   header  | key material |
+         +--------------------------+
+         +---------------------+
+ disk2   |  disk payload data  |
+         +---------------------+
+
+There are a variety of benefits to doing this:
+
+ * Secrecy - the disk2 cannot be identified as containing LUKS
+             volume since there's no header
+ * Control - if access to the disk1 is restricted, then even
+             if someone has access to disk2 they can't unlock
+             it. Might be useful if you have disks on NFS but
+             want to restrict which host can launch a VM
+             instance from it, by dynamically providing access
+             to the header to a designated host
+ * Flexibility - your application data volume may be a given
+                 size and it is inconvenient to resize it to
+                 add encryption. You can store the LUKS header
+                 separately and use the existing storage
+                 volume for payload
+ * Recovery - corruption of a bit in the header may make the
+              entire payload inaccessible. It might be
+              convenient to take backups of the header. If
+              your primary disk header becomes corrupt, you
+              can unlock the data still by pointing to the
+              backup detached header
+
+Architecture
+============
+
+Take the qcow2 encryption, for example. The architecture of the
+LUKS volume with detached header is shown in the diagram below.
+
+There are two children of the root node: a file and a header.
+Data from the disk payload is stored in the file node. The
+LUKS header and key material are located in the header node,
+as previously mentioned.
+
+::
+
+                       +-----------------------------+
+  Root node            |          foo[luks]          |
+                       +-----------------------------+
+                          |                        |
+                     file |                 header |
+                          |                        |
+               +---------------------+    +------------------+
+  Child node   |payload-format[qcow2]|    |header-format[raw]|
+               +---------------------+    +------------------+
+                          |                        |
+                     file |                   file |
+                          |                        |
+               +----------------------+  +---------------------+
+  Child node   |payload-protocol[file]|  |header-protocol[file]|
+               +----------------------+  +---------------------+
+                          |                        |
+                          |                        |
+                          |                        |
+                    Host storage             Host storage
+
+Usage
+=====
+
+Create a LUKS disk with a detached header using qemu-img
+--------------------------------------------------------
+
+Shell commandline::
+
+  # qemu-img create --object secret,id=sec0,data=abc123 -f luks \
+    -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0 \
+    -o detached-header=true test-header.img
+  # qemu-img create -f qcow2 test-payload.qcow2 200G
+  # qemu-img info 'json:{"driver":"luks","file":{"filename": \
+    "test-payload.qcow2"},"header":{"filename":"test-header.img"}}'
+
+Set up a VM's LUKS volume with a detached header
+------------------------------------------------
+
+QEMU commandline::
+
+  # qemu-system-x86_64 ... \
+    -object '{"qom-type":"secret","id":"libvirt-3-format-secret", \
+    "data":"abc123"}' \
+    -blockdev '{"driver":"file","filename":"/path/to/test-header.img", \
+    "node-name":"libvirt-1-storage"}' \
+    -blockdev '{"node-name":"libvirt-1-format","read-only":false, \
+    "driver":"raw","file":"libvirt-1-storage"}' \
+    -blockdev '{"driver":"file","filename":"/path/to/test-payload.qcow2", \
+    "node-name":"libvirt-2-storage"}' \
+    -blockdev '{"node-name":"libvirt-2-format","read-only":false, \
+    "driver":"qcow2","file":"libvirt-2-storage"}' \
+    -blockdev '{"node-name":"libvirt-3-format","driver":"luks", \
+    "file":"libvirt-2-format","header":"libvirt-1-format","key-secret": \
+    "libvirt-3-format-secret"}' \
+    -device '{"driver":"virtio-blk-pci","bus":XXX,"addr":YYY,"drive": \
+    "libvirt-3-format","id":"virtio-disk1"}'
+
+Add LUKS volume to a VM with a detached header
+----------------------------------------------
+
+1. object-add the secret for decrypting the cipher stored in
+   LUKS header above::
+
+    # virsh qemu-monitor-command vm '{"execute":"object-add", \
+      "arguments":{"qom-type":"secret", "id": \
+      "libvirt-4-format-secret", "data":"abc123"}}'
+
+2. blockdev-add the protocol node for LUKS header::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-1-storage", "driver":"file", \
+      "filename": "/path/to/test-header.img" }}'
+
+3. blockdev-add the raw-driven node for LUKS header::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-1-format", "driver":"raw", \
+      "file":"libvirt-1-storage"}}'
+
+4. blockdev-add the protocol node for disk payload image::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-2-storage", "driver":"file", \
+      "filename":"/path/to/test-payload.qcow2"}}'
+
+5. blockdev-add the qcow2-driven format node for disk payload data::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-2-format", "driver":"qcow2", \
+      "file":"libvirt-2-storage"}}'
+
+6. blockdev-add the luks-driven format node to link the qcow2 disk
+   with the LUKS header by specifying the field "header"::
+
+    # virsh qemu-monitor-command vm '{"execute":"blockdev-add", \
+      "arguments":{"node-name":"libvirt-3-format", "driver":"luks", \
+      "file":"libvirt-2-format", "header":"libvirt-1-format", \
+      "key-secret":"libvirt-4-format-secret"}}'
+
+7. hot-plug the virtio-blk device finally::
+
+    # virsh qemu-monitor-command vm '{"execute":"device_add", \
+      "arguments": {"driver":"virtio-blk-pci", \
+      "drive": "libvirt-3-format", "id":"virtio-disk2"}}'
+
+TODO
+====
+
+1. Support the shared detached LUKS header within the VM.
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 54a1fc6c10..57f55f6c54 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -14,8 +14,10 @@
 # = Firmware
 ##
 
-{ 'include' : 'machine.json' }
-{ 'include' : 'block-core.json' }
+{ 'pragma': {
+    'member-name-exceptions': [
+        'FirmwareArchitecture' # x86_64
+    ] } }
 
 ##
 # @FirmwareOSInterface:
@@ -61,6 +63,27 @@
   'data' : [ 'flash', 'kernel', 'memory' ] }
 
 ##
+# @FirmwareArchitecture:
+#
+# Enumeration of architectures for which QEMU uses additional
+# firmware files.
+#
+# @aarch64: 64-bit Arm.
+#
+# @arm: 32-bit Arm.
+#
+# @i386: 32-bit x86.
+#
+# @loongarch64: 64-bit LoongArch. (since: 7.1)
+#
+# @x86_64: 64-bit x86.
+#
+# Since: 3.0
+##
+{ 'enum' : 'FirmwareArchitecture',
+  'data' : [ 'aarch64', 'arm', 'i386', 'loongarch64', 'x86_64' ] }
+
+##
 # @FirmwareTarget:
 #
 # Defines the machine types that firmware may execute on.
@@ -81,7 +104,7 @@
 # Since: 3.0
 ##
 { 'struct' : 'FirmwareTarget',
-  'data'   : { 'architecture' : 'SysEmuTarget',
+  'data'   : { 'architecture' : 'FirmwareArchitecture',
                'machines'     : [ 'str' ] } }
 
 ##
@@ -201,6 +224,20 @@
              'verbose-dynamic', 'verbose-static' ] }
 
 ##
+# @FirmwareFormat:
+#
+# Formats that are supported for firmware images.
+#
+# @raw: Raw disk image format.
+#
+# @qcow2: The QCOW2 image format.
+#
+# Since: 3.0
+##
+{ 'enum': 'FirmwareFormat',
+  'data': [ 'raw', 'qcow2' ] }
+
+##
 # @FirmwareFlashFile:
 #
 # Defines common properties that are necessary for loading a firmware
@@ -219,7 +256,7 @@
 ##
 { 'struct' : 'FirmwareFlashFile',
   'data'   : { 'filename' : 'str',
-               'format'   : 'BlockdevDriver' } }
+               'format'   : 'FirmwareFormat' } }
 
 
 ##
@@ -433,7 +470,7 @@
 #
 # Since: 3.0
 #
-# Examples:
+# .. qmp-example::
 #
 #     {
 #         "description": "SeaBIOS",
diff --git a/docs/interop/qemu-ga.rst b/docs/interop/qemu-ga.rst
index 72fb75a6f5..9c7380896a 100644
--- a/docs/interop/qemu-ga.rst
+++ b/docs/interop/qemu-ga.rst
@@ -28,11 +28,30 @@ configuration options on the command line. For the same key, the last
 option wins, but the lists accumulate (see below for configuration
 file format).
 
+If an allowed RPCs list is defined in the configuration, then all
+RPCs will be blocked by default, except for the allowed list.
+
+If a blocked RPCs list is defined in the configuration, then all
+RPCs will be allowed by default, except for the blocked list.
+
+If both allowed and blocked RPCs lists are defined in the configuration,
+then all RPCs will be blocked by default, then the allowed list will
+be applied, followed by the blocked list.
+
+While filesystems are frozen, all except for a designated safe set
+of RPCs will be blocked, regardless of what the general configuration
+declares.
+
 Options
 -------
 
 .. program:: qemu-ga
 
+.. option:: -c, --config=PATH
+
+  Configuration file path (the default is |CONFDIR|\ ``/qemu-ga.conf``,
+  unless overridden by the QGA_CONF environment variable)
+
 .. option:: -m, --method=METHOD
 
   Transport method: one of ``unix-listen``, ``virtio-serial``, or
@@ -131,6 +150,7 @@ fsfreeze-hook  string
 statedir       string
 verbose        boolean
 block-rpcs     string list
+allow-rpcs     string list
 =============  ===========
 
 See also
diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst b/docs/specs/acpi_hw_reduced_hotplug.rst
index 0bd3f9399f..3acd6fcd8b 100644
--- a/docs/specs/acpi_hw_reduced_hotplug.rst
+++ b/docs/specs/acpi_hw_reduced_hotplug.rst
@@ -64,7 +64,8 @@ GED IO interface (4 byte access)
        0: Memory hotplug event
        1: System power down event
        2: NVDIMM hotplug event
-    3-31: Reserved
+       3: CPU hotplug event
+    4-31: Reserved
 
 **write_access:**
 
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index 1484e3e760..be899b49c2 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -29,7 +29,9 @@ guest hardware that is specific to QEMU.
    edu
    ivshmem-spec
    pvpanic
+   spdm
    standard-vga
    virt-ctlr
    vmcoreinfo
    vmgenid
+   rapl-msr
diff --git a/docs/specs/rapl-msr.rst b/docs/specs/rapl-msr.rst
new file mode 100644
index 0000000000..1202ee89be
--- /dev/null
+++ b/docs/specs/rapl-msr.rst
@@ -0,0 +1,155 @@
+================
+RAPL MSR support
+================
+
+The RAPL interface (Running Average Power Limit) advertises the accumulated
+energy consumption of various power domains (e.g. CPU packages, DRAM, etc.).
+
+The consumption is reported via MSRs (model specific registers) like
+MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are 64-bit
+registers that represent the accumulated energy consumption in micro Joules.
+
+Thanks to the MSR Filtering patch [#a]_ not all MSRs are handled by KVM. Some
+of them can now be handled by the userspace (QEMU). It uses a mechanism called
+"MSR filtering" where a list of MSRs is given at init time of a VM to KVM so
+that a callback is put in place. The design of this patch uses only this
+mechanism for handling the MSRs between guest/host.
+
+At the moment the following MSRs are involved:
+
+.. code:: C
+
+    #define MSR_RAPL_POWER_UNIT             0x00000606
+    #define MSR_PKG_POWER_LIMIT             0x00000610
+    #define MSR_PKG_ENERGY_STATUS           0x00000611
+    #define MSR_PKG_POWER_INFO              0x00000614
+
+The ``*_POWER_UNIT``, ``*_POWER_LIMIT``, ``*_POWER_INFO`` are part of the RAPL
+spec and specify the power limit of the package, provide a range of parameters
+(min power, max power, ...) and also the information of the multiplier for the
+energy counter to calculate the power. Those MSRs are populated once at the
+beginning by reading the host CPU MSRs and are given back to the guest 1:1
+when requested.
+
+The MSR_PKG_ENERGY_STATUS is a counter; it represents the total amount of
+energy consumed since the last time the register was cleared. If you multiply
+it with the UNIT provided above you'll get the energy in micro-joules. This
+counter is always increasing, and it increases faster or slower depending on
+the consumption of the package. This counter is supposed to overflow at some
+point.
+
+Each core belonging to the same Package reading the MSR_PKG_ENERGY_STATUS (i.e.
+"rdmsr 0x611") will retrieve the same value. The value represents the energy
+for the whole package. Whatever Core reading it will get the same value and a
+core that belongs to PKG-0 will not be able to get the value of PKG-1 and
+vice-versa.
+
+High level implementation
+-------------------------
+
+In order to update the value of the virtual MSR, a QEMU thread is created.
+The thread is basically just an infinite loop that does:
+
+1. Snapshot of the time metrics of all QEMU threads (Time spent scheduled in
+   Userspace and System)
+
+2. Snapshot of the actual MSR_PKG_ENERGY_STATUS counter of all packages where
+   the QEMU threads are running on.
+
+3. Sleep for 1 second - During this pause the vcpu and other non-vcpu threads
+   will do what they have to do and so the energy counter will increase.
+
+4. Repeat 2. and 3. and calculate the delta of every metric representing the
+   time spent scheduled for each QEMU thread *and* the energy spent by the
+   packages during the pause.
+
+5. Filter the vcpu threads and the non-vcpu threads.
+
+6. Retrieve the topology of the Virtual Machine. This helps identify which
+   vCPU is running on which virtual package.
+
+7. The total energy spent by the non-vcpu threads is divided by the number
+   of vcpu threads so that each vcpu thread will get an equal part of the
+   energy spent by the QEMU workers.
+
+8. Calculate the ratio of energy spent per vcpu threads.
+
+9. Calculate the energy for each virtual package.
+
+10. The virtual MSRs are updated for each virtual package. Each vCPU that
+    belongs to the same package will return the same value when accessing the
+    MSR.
+
+11. Loop back to 1.
+
+Ratio calculation
+-----------------
+
+In Linux, a process has an execution time associated with it. The scheduler is
+dividing the time in clock ticks. The number of clock ticks per second can be
+found by the sysconf system call. A typical value of clock ticks per second is
+100. So a core can run a process at the maximum of 100 ticks per second. If a
+package has 4 cores, 400 ticks maximum can be scheduled on all the cores
+of the package for a period of 1 second.
+
+The /proc/[pid]/stat [#b]_ is a procfs file that can give the executed time of
+a process with the [pid] as the process ID. It gives the amount of ticks the
+process has been scheduled in userspace (utime) and kernel space (stime).
+
+By reading those metrics for a thread, one can calculate the ratio of time the
+package has spent executing the thread.
+
+Example:
+
+A 4 cores package can schedule a maximum of 400 ticks per second with 100 ticks
+per second per core. If a thread was scheduled for 100 ticks during one second
+on this package, that means the thread has been scheduled for 1/4 of the whole
+package. With that, the calculation of the energy spent by the thread on this
+package during this whole second is 1/4 of the total energy spent by the
+package.
+
+Usage
+-----
+
+Currently this feature is only working on an Intel CPU that has the RAPL driver
+mounted and available in the sysfs. If not, QEMU fails at start-up.
+
+This feature is activated with -accel
+kvm,rapl=true,rapl-helper-socket=/path/sock.sock
+
+It is important that the socket path is the same as the one
+:program:`qemu-vmsr-helper` is listening to.
+
+qemu-vmsr-helper
+----------------
+
+The qemu-vmsr-helper is working very much like the qemu-pr-helper. Instead of
+making persistent reservation, qemu-vmsr-helper is here to overcome the
+CVE-2020-8694 which removed user access to the rapl msr attributes.
+
+A socket communication is established between QEMU processes that have the RAPL
+MSR support activated and the qemu-vmsr-helper. A systemd service and socket
+activation is provided in contrib/systemd/qemu-vmsr-helper.(service/socket).
+
+The systemd socket uses 600, like contrib/systemd/qemu-pr-helper.socket. The
+socket can be passed via SCM_RIGHTS by libvirt, or its permissions can be
+changed (e.g. 660 and root:kvm for a Debian system). Libvirt could
+also start a separate helper if needed. All in all, the policy is left to the
+user.
+
+See the qemu-pr-helper documentation or manpage for further details.
+
+Current Limitations
+-------------------
+
+- Works only on Intel host CPUs because AMD CPUs are using different MSR
+  addresses.
+
+- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at the
+  moment.
+
+References
+----------
+
+.. [#a] https://patchwork.kernel.org/project/kvm/patch/20200916202951.23760-7-graf@amazon.com/
+.. [#b] https://man7.org/linux/man-pages/man5/proc.5.html
diff --git a/docs/specs/spdm.rst b/docs/specs/spdm.rst
new file mode 100644
index 0000000000..f7de080ff0
--- /dev/null
+++ b/docs/specs/spdm.rst
@@ -0,0 +1,134 @@
+======================================================
+QEMU Security Protocols and Data Models (SPDM) Support
+======================================================
+
+SPDM enables authentication, attestation and key exchange to assist in
+providing infrastructure security enablement. It's a standard published
+by the `DMTF`_.
+
+QEMU supports connecting to a SPDM responder implementation. This allows an
+external application to emulate the SPDM responder logic for an SPDM device.
+
+Setting up a SPDM server
+========================
+
+When using QEMU with SPDM devices QEMU will connect to a server which
+implements the SPDM functionality.
+
+SPDM-Utils
+----------
+
+You can use `SPDM Utils`_ to emulate a responder. This is the simplest method.
+
+SPDM-Utils is a Linux application to manage, test and develop devices
+supporting DMTF Security Protocol and Data Model (SPDM). It is written in Rust
+and utilises libspdm.
+
+To use SPDM-Utils you will need to do the following steps. Details are included
+in the SPDM-Utils README.
+
+ 1. `Build libspdm`_
+ 2. `Build SPDM Utils`_
+ 3. `Run it as a server`_
+
+spdm-emu
+--------
+
+You can use `spdm emu`_ to model the
+SPDM responder.
+
+.. code-block:: shell
+
+    $ cd spdm-emu
+    $ git submodule init; git submodule update --recursive
+    $ mkdir build; cd build
+    $ cmake -DARCH=x64 -DTOOLCHAIN=GCC -DTARGET=Debug -DCRYPTO=openssl ..
+    $ make -j32
+    $ make copy_sample_key # Build certificates, required for SPDM authentication.
+
+It is worth noting that the certificates should be in compliance with
+PCIe r6.1 sec 6.31.3. This means you will need to add the following to
+openssl.cnf
+
+.. code-block::
+
+    subjectAltName = otherName:2.23.147;UTF8:Vendor=1b36:Device=0010:CC=010802:REV=02:SSVID=1af4:SSID=1100
+    2.23.147 = ASN1:OID:2.23.147
+
+and then manually regenerate some certificates with:
+
+.. code-block:: shell
+
+    $ openssl req -nodes -newkey ec:param.pem -keyout end_responder.key \
+        -out end_responder.req -sha384 -batch \
+        -subj "/CN=DMTF libspdm ECP384 responder cert"
+
+    $ openssl x509 -req -in end_responder.req -out end_responder.cert \
+        -CA inter.cert -CAkey inter.key -sha384 -days 3650 -set_serial 3 \
+        -extensions v3_end -extfile ../openssl.cnf
+
+    $ openssl asn1parse -in end_responder.cert -out end_responder.cert.der
+
+    $ cat ca.cert.der inter.cert.der end_responder.cert.der > bundle_responder.certchain.der
+
+You can use SPDM-Utils instead as it will generate the correct certificates
+automatically.
+
+The responder can then be launched with
+
+.. code-block:: shell
+
+    $ cd bin
+    $ ./spdm_responder_emu --trans PCI_DOE
+
+Connecting an SPDM NVMe device
+==============================
+
+Once a SPDM server is running we can start QEMU and connect to the server.
+
+For an NVMe device first let's set up a block we can use
+
+.. code-block:: shell
+
+    $ cd qemu-spdm/linux/image
+    $ dd if=/dev/zero of=blknvme bs=1M count=2096 # 2GB NVMe Drive
+
+Then you can add this to your QEMU command line:
+
+.. code-block:: shell
+
+    -drive file=blknvme,if=none,id=mynvme,format=raw \
+        -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323
+
+At which point QEMU will try to connect to the SPDM server.
+
+Note that if using x86-64 you will want to use the q35 machine instead
+of the default. So the entire QEMU command might look like this
+
+.. code-block:: shell
+
+    qemu-system-x86_64 -M q35 \
+        --kernel bzImage \
+        -drive file=rootfs.ext2,if=virtio,format=raw \
+        -append "root=/dev/vda console=ttyS0" \
+        -net none -nographic \
+        -drive file=blknvme,if=none,id=mynvme,format=raw \
+        -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323
+
+.. _DMTF:
+   https://www.dmtf.org/standards/SPDM
+
+.. _SPDM Utils:
+   https://github.com/westerndigitalcorporation/spdm-utils
+
+.. _spdm emu:
+   https://github.com/dmtf/spdm-emu
+
+.. _Build libspdm:
+   https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-libspdm
+
+.. _Build SPDM Utils:
+   https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-the-binary
+
+.. _Run it as a server:
+   https://github.com/westerndigitalcorporation/spdm-utils#qemu-spdm-device-emulation
diff --git a/docs/system/index.rst b/docs/system/index.rst
index c21065e519..718e9d3c56 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -39,3 +39,4 @@ or Hypervisor.Framework.
    multi-process
    confidential-guest-support
    vm-templating
+   sriov
diff --git a/docs/system/sriov.rst b/docs/system/sriov.rst
new file mode 100644
index 0000000000..a851a66a4b
--- /dev/null
+++ b/docs/system/sriov.rst
@@ -0,0 +1,36 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Composable SR-IOV device
+========================
+
+SR-IOV (Single Root I/O Virtualization) is an optional extended capability of a
+PCI Express device. It allows a single physical function (PF) to appear as
+multiple virtual functions (VFs) for the main purpose of eliminating software
+overhead in I/O from virtual machines.
+
+There are devices with predefined SR-IOV configurations, but it is also possible
+to compose an SR-IOV device yourself. Composing an SR-IOV device is currently
+only supported by virtio-net-pci.
+
+Users can configure an SR-IOV-capable virtio-net device by adding
+virtio-net-pci functions to a bus. Below is a command line example:
+
+.. code-block:: shell
+
+    -netdev user,id=n -netdev user,id=o
+    -netdev user,id=p -netdev user,id=q
+    -device pcie-root-port,id=b
+    -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
+    -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
+
+The VFs specify the paired PF with the ``sriov-pf`` property. The PF must be
+added after all VFs. It is the user's responsibility to ensure that VFs have
+function numbers larger than that of the PF, and that the function numbers
+have a consistent stride.
+
+You may also need to perform additional steps to activate the SR-IOV feature on
+your guest. For Linux, refer to [1]_.
+
+.. [1] https://docs.kernel.org/PCI/pci-iov-howto.html
diff --git a/docs/tools/index.rst b/docs/tools/index.rst
index 8e65ce0dfc..33ad438e86 100644
--- a/docs/tools/index.rst
+++ b/docs/tools/index.rst
@@ -16,3 +16,4 @@ command line utilities and other standalone programs.
    qemu-pr-helper
    qemu-trace-stap
    virtfs-proxy-helper
+   qemu-vmsr-helper
diff --git a/docs/tools/qemu-vmsr-helper.rst b/docs/tools/qemu-vmsr-helper.rst
new file mode 100644
index 0000000000..6ec87b49d9
--- /dev/null
+++ b/docs/tools/qemu-vmsr-helper.rst
@@ -0,0 +1,89 @@
+==================================
+QEMU virtual RAPL MSR helper
+==================================
+
+Synopsis
+--------
+
+**qemu-vmsr-helper** [*OPTION*]
+
+Description
+-----------
+
+Implements the virtual RAPL MSR helper for QEMU.
+
+Accessing the RAPL (Running Average Power Limit) MSR enables the RAPL powercap
+driver to advertise and monitor the power consumption or accumulated energy
+consumption of different power domains, such as CPU packages, DRAM, and other
+components when available.
+
+However, those registers are accessible only with privileged access
+(CAP_SYS_RAWIO). QEMU can use an external helper to access those privileged
+registers.
+
+:program:`qemu-vmsr-helper` is that external helper; it creates a listener
+socket which will accept incoming connections for communication with QEMU.
+
+If you want to run VMs in a setup like this, this helper should be started as a
+system service, and you should read the QEMU manual section on "RAPL MSR
+support" to find out how to configure QEMU to connect to the socket created by
+:program:`qemu-vmsr-helper`.
+
+After connecting to the socket, :program:`qemu-vmsr-helper` can
+optionally drop root privileges, except for those capabilities that
+are needed for its operation.
+
+:program:`qemu-vmsr-helper` can also use the systemd socket activation
+protocol. In this case, the systemd socket unit should specify a
+Unix stream socket, like this::
+
+    [Socket]
+    ListenStream=/var/run/qemu-vmsr-helper.sock
+
+Options
+-------
+
+.. program:: qemu-vmsr-helper
+
+.. option:: -d, --daemon
+
+  run in the background (and create a PID file)
+
+.. option:: -q, --quiet
+
+  decrease verbosity
+
+.. option:: -v, --verbose
+
+  increase verbosity
+
+.. option:: -f, --pidfile=PATH
+
+  PID file when running as a daemon. By default the PID file
+  is created in the system runtime state directory, for example
+  :file:`/var/run/qemu-vmsr-helper.pid`.
+
+.. option:: -k, --socket=PATH
+
+  path to the socket. By default the socket is created in
+  the system runtime state directory, for example
+  :file:`/var/run/qemu-vmsr-helper.sock`.
+
+.. option:: -T, --trace [[enable=]PATTERN][,events=FILE][,file=FILE]
+
+  .. include:: ../qemu-option-trace.rst.inc
+
+.. option:: -u, --user=USER
+
+  user to drop privileges to
+
+.. option:: -g, --group=GROUP
+
+  group to drop privileges to
+
+.. option:: -h, --help
+
+  Display a help message and exit.
+
+.. option:: -V, --version
+
+  Display version information and exit.
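
Note on the "Ratio calculation" section of docs/specs/rapl-msr.rst in the patch above: the arithmetic it describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not QEMU's implementation (which is written in C and may differ). The function names, the sample stat lines and the 8.0 J package figure are hypothetical; the field positions for utime and stime follow proc(5).

```python
"""Illustrative sketch of the per-thread energy share from rapl-msr.rst.

All names below are hypothetical; none of them are QEMU APIs.
"""


def parse_stat_ticks(stat_line):
    """Return utime + stime (clock ticks) from a /proc/[pid]/stat line."""
    # The comm field may contain spaces or parentheses, so split on the
    # last ')' and index the remaining fields starting at field 3 (state).
    fields = stat_line.rsplit(")", 1)[1].split()
    utime = int(fields[11])  # field 14 in proc(5)
    stime = int(fields[12])  # field 15 in proc(5)
    return utime + stime


def thread_energy_share(delta_ticks, cores, package_energy,
                        ticks_per_second=100, interval_s=1.0):
    """Energy attributed to one thread over the sampling interval."""
    # Maximum ticks the whole package can schedule during the interval.
    max_ticks = ticks_per_second * cores * interval_s
    ratio = delta_ticks / max_ticks
    return ratio * package_energy


if __name__ == "__main__":
    # Two made-up snapshots of the same thread, taken one second apart.
    before = "1234 (qemu vcpu) S 1 1234 1234 0 -1 4194560 10 0 0 0 0 0 0 0 20 0"
    after = "1234 (qemu vcpu) S 1 1234 1234 0 -1 4194560 10 0 0 0 70 30 0 0 20 0"
    delta = parse_stat_ticks(after) - parse_stat_ticks(before)
    # 100 ticks on a 4-core package is 1/4 of the package's 400 ticks.
    print(thread_energy_share(delta, cores=4, package_energy=8.0))  # prints 2.0
```

The default of 100 ticks per second matches the typical value quoted in the document; on a Linux host the actual value can be read with `os.sysconf("SC_CLK_TCK")`.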