diff options
Diffstat (limited to 'docs/specs')
| -rw-r--r-- | docs/specs/acpi_hw_reduced_hotplug.rst | 3 | ||||
| -rw-r--r-- | docs/specs/index.rst | 2 | ||||
| -rw-r--r-- | docs/specs/rapl-msr.rst | 155 | ||||
| -rw-r--r-- | docs/specs/spdm.rst | 134 |
4 files changed, 293 insertions, 1 deletions
diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst b/docs/specs/acpi_hw_reduced_hotplug.rst index 0bd3f9399f..3acd6fcd8b 100644 --- a/docs/specs/acpi_hw_reduced_hotplug.rst +++ b/docs/specs/acpi_hw_reduced_hotplug.rst @@ -64,7 +64,8 @@ GED IO interface (4 byte access) 0: Memory hotplug event 1: System power down event 2: NVDIMM hotplug event - 3-31: Reserved + 3: CPU hotplug event + 4-31: Reserved **write_access:** diff --git a/docs/specs/index.rst b/docs/specs/index.rst index 1484e3e760..be899b49c2 100644 --- a/docs/specs/index.rst +++ b/docs/specs/index.rst @@ -29,7 +29,9 @@ guest hardware that is specific to QEMU. edu ivshmem-spec pvpanic + spdm standard-vga virt-ctlr vmcoreinfo vmgenid + rapl-msr diff --git a/docs/specs/rapl-msr.rst b/docs/specs/rapl-msr.rst new file mode 100644 index 0000000000..1202ee89be --- /dev/null +++ b/docs/specs/rapl-msr.rst @@ -0,0 +1,155 @@ +================ +RAPL MSR support +================ + +The RAPL interface (Running Average Power Limit) is advertising the accumulated +energy consumption of various power domains (e.g. CPU packages, DRAM, etc.). + +The consumption is reported via MSRs (model specific registers) like +MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are 64 bits +registers that represent the accumulated energy consumption in micro Joules. + +Thanks to the MSR Filtering patch [#a]_ not all MSRs are handled by KVM. Some +of them can now be handled by the userspace (QEMU). It uses a mechanism called +"MSR filtering" where a list of MSRs is given at init time of a VM to KVM so +that a callback is put in place. The design of this patch uses only this +mechanism for handling the MSRs between guest/host. + +At the moment the following MSRs are involved: + +.. code:: C + + #define MSR_RAPL_POWER_UNIT 0x00000606 + #define MSR_PKG_POWER_LIMIT 0x00000610 + #define MSR_PKG_ENERGY_STATUS 0x00000611 + #define MSR_PKG_POWER_INFO 0x00000614 + +The ``*_POWER_UNIT``, ``*_POWER_LIMIT``, ``*_POWER INFO`` are part of the RAPL +spec and specify the power limit of the package, provide range of parameter(min +power, max power,..) and also the information of the multiplier for the energy +counter to calculate the power. Those MSRs are populated once at the beginning +by reading the host CPU MSRs and are given back to the guest 1:1 when +requested. + +The MSR_PKG_ENERGY_STATUS is a counter; it represents the total amount of +energy consumed since the last time the register was cleared. If you multiply +it with the UNIT provided above you'll get the power in micro-joules. This +counter is always increasing and it increases more or less faster depending on +the consumption of the package. This counter is supposed to overflow at some +point. + +Each core belonging to the same Package reading the MSR_PKG_ENERGY_STATUS (i.e +"rdmsr 0x611") will retrieve the same value. The value represents the energy +for the whole package. Whatever Core reading it will get the same value and a +core that belongs to PKG-0 will not be able to get the value of PKG-1 and +vice-versa. + +High level implementation +------------------------- + +In order to update the value of the virtual MSR, a QEMU thread is created. +The thread is basically just an infinity loop that does: + +1. Snapshot of the time metrics of all QEMU threads (Time spent scheduled in + Userspace and System) + +2. Snapshot of the actual MSR_PKG_ENERGY_STATUS counter of all packages where + the QEMU threads are running on. + +3. Sleep for 1 second - During this pause the vcpu and other non-vcpu threads + will do what they have to do and so the energy counter will increase. + +4. Repeat 2. and 3. and calculate the delta of every metrics representing the + time spent scheduled for each QEMU thread *and* the energy spent by the + packages during the pause. + +5. Filter the vcpu threads and the non-vcpu threads. + +6. Retrieve the topology of the Virtual Machine. This helps identify which + vCPU is running on which virtual package. + +7. The total energy spent by the non-vcpu threads is divided by the number + of vcpu threads so that each vcpu thread will get an equal part of the + energy spent by the QEMU workers. + +8. Calculate the ratio of energy spent per vcpu threads. + +9. Calculate the energy for each virtual package. + +10. The virtual MSRs are updated for each virtual package. Each vCPU that + belongs to the same package will return the same value when accessing the + the MSR. + +11. Loop back to 1. + +Ratio calculation +----------------- + +In Linux, a process has an execution time associated with it. The scheduler is +dividing the time in clock ticks. The number of clock ticks per second can be +found by the sysconf system call. A typical value of clock ticks per second is +100. So a core can run a process at the maximum of 100 ticks per second. If a +package has 4 cores, 400 ticks maximum can be scheduled on all the cores +of the package for a period of 1 second. + +The /proc/[pid]/stat [#b]_ is a sysfs file that can give the executed time of a +process with the [pid] as the process ID. It gives the amount of ticks the +process has been scheduled in userspace (utime) and kernel space (stime). + +By reading those metrics for a thread, one can calculate the ratio of time the +package has spent executing the thread. + +Example: + +A 4 cores package can schedule a maximum of 400 ticks per second with 100 ticks +per second per core. If a thread was scheduled for 100 ticks between a second +on this package, that means my thread has been scheduled for 1/4 of the whole +package. With that, the calculation of the energy spent by the thread on this +package during this whole second is 1/4 of the total energy spent by the +package. + +Usage +----- + +Currently this feature is only working on an Intel CPU that has the RAPL driver +mounted and available in the sysfs. if not, QEMU fails at start-up. + +This feature is activated with -accel +kvm,rapl=true,rapl-helper-socket=/path/sock.sock + +It is important that the socket path is the same as the one +:program:`qemu-vmsr-helper` is listening to. + +qemu-vmsr-helper +---------------- + +The qemu-vmsr-helper is working very much like the qemu-pr-helper. Instead of +making persistent reservation, qemu-vmsr-helper is here to overcome the +CVE-2020-8694 which remove user access to the rapl msr attributes. + +A socket communication is established between QEMU processes that has the RAPL +MSR support activated and the qemu-vmsr-helper. A systemd service and socket +activation is provided in contrib/systemd/qemu-vmsr-helper.(service/socket). + +The systemd socket uses 600, like contrib/systemd/qemu-pr-helper.socket. The +socket can be passed via SCM_RIGHTS by libvirt, or its permissions can be +changed (e.g. 660 and root:kvm for a Debian system for example). Libvirt could +also start a separate helper if needed. All in all, the policy is left to the +user. + +See the qemu-pr-helper documentation or manpage for further details. + +Current Limitations +------------------- + +- Works only on Intel host CPUs because AMD CPUs are using different MSR + addresses. + +- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at the + moment. + +References +---------- + +.. [#a] https://patchwork.kernel.org/project/kvm/patch/20200916202951.23760-7-graf@amazon.com/ +.. [#b] https://man7.org/linux/man-pages/man5/proc.5.html diff --git a/docs/specs/spdm.rst b/docs/specs/spdm.rst new file mode 100644 index 0000000000..f7de080ff0 --- /dev/null +++ b/docs/specs/spdm.rst @@ -0,0 +1,134 @@ +====================================================== +QEMU Security Protocols and Data Models (SPDM) Support +====================================================== + +SPDM enables authentication, attestation and key exchange to assist in +providing infrastructure security enablement. It's a standard published +by the `DMTF`_. + +QEMU supports connecting to a SPDM responder implementation. This allows an +external application to emulate the SPDM responder logic for an SPDM device. + +Setting up a SPDM server +======================== + +When using QEMU with SPDM devices QEMU will connect to a server which +implements the SPDM functionality. + +SPDM-Utils +---------- + +You can use `SPDM Utils`_ to emulate a responder. This is the simplest method. + +SPDM-Utils is a Linux applications to manage, test and develop devices +supporting DMTF Security Protocol and Data Model (SPDM). It is written in Rust +and utilises libspdm. + +To use SPDM-Utils you will need to do the following steps. Details are included +in the SPDM-Utils README. + + 1. `Build libspdm`_ + 2. `Build SPDM Utils`_ + 3. `Run it as a server`_ + +spdm-emu +-------- + +You can use `spdm emu`_ to model the +SPDM responder. + +.. code-block:: shell + + $ cd spdm-emu + $ git submodule init; git submodule update --recursive + $ mkdir build; cd build + $ cmake -DARCH=x64 -DTOOLCHAIN=GCC -DTARGET=Debug -DCRYPTO=openssl .. + $ make -j32 + $ make copy_sample_key # Build certificates, required for SPDM authentication. + +It is worth noting that the certificates should be in compliance with +PCIe r6.1 sec 6.31.3. This means you will need to add the following to +openssl.cnf + +.. code-block:: + + subjectAltName = otherName:2.23.147;UTF8:Vendor=1b36:Device=0010:CC=010802:REV=02:SSVID=1af4:SSID=1100 + 2.23.147 = ASN1:OID:2.23.147 + +and then manually regenerate some certificates with: + +.. code-block:: shell + + $ openssl req -nodes -newkey ec:param.pem -keyout end_responder.key \ + -out end_responder.req -sha384 -batch \ + -subj "/CN=DMTF libspdm ECP384 responder cert" + + $ openssl x509 -req -in end_responder.req -out end_responder.cert \ + -CA inter.cert -CAkey inter.key -sha384 -days 3650 -set_serial 3 \ + -extensions v3_end -extfile ../openssl.cnf + + $ openssl asn1parse -in end_responder.cert -out end_responder.cert.der + + $ cat ca.cert.der inter.cert.der end_responder.cert.der > bundle_responder.certchain.der + +You can use SPDM-Utils instead as it will generate the correct certificates +automatically. + +The responder can then be launched with + +.. code-block:: shell + + $ cd bin + $ ./spdm_responder_emu --trans PCI_DOE + +Connecting an SPDM NVMe device +============================== + +Once a SPDM server is running we can start QEMU and connect to the server. + +For an NVMe device first let's setup a block we can use + +.. code-block:: shell + + $ cd qemu-spdm/linux/image + $ dd if=/dev/zero of=blknvme bs=1M count=2096 # 2GB NNMe Drive + +Then you can add this to your QEMU command line: + +.. code-block:: shell + + -drive file=blknvme,if=none,id=mynvme,format=raw \ + -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323 + +At which point QEMU will try to connect to the SPDM server. + +Note that if using x64-64 you will want to use the q35 machine instead +of the default. So the entire QEMU command might look like this + +.. code-block:: shell + + qemu-system-x86_64 -M q35 \ + --kernel bzImage \ + -drive file=rootfs.ext2,if=virtio,format=raw \ + -append "root=/dev/vda console=ttyS0" \ + -net none -nographic \ + -drive file=blknvme,if=none,id=mynvme,format=raw \ + -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323 + +.. _DMTF: + https://www.dmtf.org/standards/SPDM + +.. _SPDM Utils: + https://github.com/westerndigitalcorporation/spdm-utils + +.. _spdm emu: + https://github.com/dmtf/spdm-emu + +.. _Build libspdm: + https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-libspdm + +.. _Build SPDM Utils: + https://github.com/westerndigitalcorporation/spdm-utils?tab=readme-ov-file#build-the-binary + +.. _Run it as a server: + https://github.com/westerndigitalcorporation/spdm-utils#qemu-spdm-device-emulation |