path: root/target/i386/machine.c
Commit message | Author | Date | Files | Lines
* target/i386: Add support for save/load of exception error code | Xin Wang | 2025-08-20 | 1 | -0/+19
Currently, QEMU saves/loads CPU exception info (such as exception_nr and has_error_code), while the exception error_code is ignored. This causes the destination hypervisor to reinject a vCPU exception with error_code 0, potentially causing a guest kernel panic. For instance, if the source VM stopped with a user-mode write #PF (error_code 6), the destination hypervisor will reinject a #PF with error_code 0 when the vCPU resumes, and the guest kernel panics with:

    BUG: unable to handle page fault for address: 00007f80319cb010
    #PF: supervisor read access in user mode
    #PF: error_code(0x0000) - not-present page
    RIP: 0033:0x40115d

To fix this, save/load the exception error_code as well.

Signed-off-by: Xin Wang <wangxinxin.wang@huawei.com>
Link: https://lore.kernel.org/r/20250819145834.3998-1-wangxinxin.wang@huawei.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
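In target/i386/machine.c, state like this is usually carried in an optional VMState subsection that is only emitted when it holds something worth sending. The sketch below shows the general shape such a subsection takes; the function name, section name, and env fields are illustrative assumptions, not necessarily what this commit uses.

    /* Hypothetical sketch of a VMState subsection for the exception error code.
     * Names (exception_error_code_needed, "cpu/exception_error_code",
     * env.error_code) are assumptions for illustration. */
    static bool exception_error_code_needed(void *opaque)
    {
        X86CPU *cpu = opaque;
        CPUX86State *env = &cpu->env;

        /* Only send the subsection when an exception carrying an error code is pending. */
        return env->has_error_code;
    }

    static const VMStateDescription vmstate_exception_error_code = {
        .name = "cpu/exception_error_code",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = exception_error_code_needed,
        .fields = (const VMStateField[]) {
            VMSTATE_INT32(env.error_code, X86CPU),
            VMSTATE_END_OF_LIST()
        }
    };
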
* hw/i386/x86: Remove X86MachineClass::save_tsc_khz field | Philippe Mathieu-Daudé | 2025-05-30 | 1 | -3/+2
The X86MachineClass::save_tsc_khz boolean was only used by the pc-q35-2.5 and pc-i440fx-2.5 machines, which got removed. Remove it and simplify tsc_khz_needed().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-11-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
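tsc_khz_needed() is the .needed callback of the TSC-frequency subsection. With the machine-class knob gone, the check presumably collapses to just looking at the value itself; a minimal sketch of that idea (not necessarily the exact resulting code):

    /* Sketch: once X86MachineClass::save_tsc_khz is gone, the decision
     * depends only on whether a TSC frequency was actually set. */
    static bool tsc_khz_needed(void *opaque)
    {
        X86CPU *cpu = opaque;
        CPUX86State *env = &cpu->env;

        return env->tsc_khz != 0;
    }
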
* include/exec: Split out watchpoint.h | Richard Henderson | 2025-04-23 | 1 | -1/+1
Relatively few objects in qemu care about watchpoints, so split out to a new header. Removes an instance of CONFIG_USER_ONLY from hw/core/cpu.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* exec: Declare tlb_flush*() in 'exec/cputlb.h' | Philippe Mathieu-Daudé | 2025-03-08 | 1 | -1/+1
Move CPU TLB related methods to "exec/cputlb.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20241114011310.3615-19-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* include: Rename sysemu/ -> system/ | Philippe Mathieu-Daudé | 2024-12-20 | 1 | -3/+3
Headers in include/sysemu/ are not only related to system *emulation*; they are also used by virtualization. Rename the directory to system/, which is clearer. Files were renamed manually, then the mechanical change was done with the sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* target/i386: Add support save/load HWCR MSR | Gao Shiyuan | 2024-10-17 | 1 | -0/+20
KVM commit 191c8137a939 ("x86/kvm: Implement HWCR support") introduced support for emulating HWCR MSR. Add support for QEMU to save/load this MSR for migration purposes.

Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Signed-off-by: Wang Liang <wangliang44@baidu.com>
Link: https://lore.kernel.org/r/20241009095109.66843-1-gaoshiyuan@baidu.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
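Migrating a new MSR through machine.c typically has two parts: the subsection definition itself and wiring it into the subsection list of vmstate_x86_cpu so it travels with the "cpu" vmstate. A rough sketch of the wiring step, with vmstate_msr_hwcr assumed as the subsection name for illustration:

    /* Sketch of hooking a new subsection into the x86 CPU vmstate.
     * "vmstate_msr_hwcr" is an assumed name for the new subsection. */
    const VMStateDescription vmstate_x86_cpu = {
        .name = "cpu",
        /* ... version ids, pre/post hooks, main field list ... */
        .subsections = (const VMStateDescription * const []) {
            /* ... existing subsections ... */
            &vmstate_msr_hwcr,
            NULL
        }
    };
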
* target/i386: Add get/set/migrate support for FRED MSRs | Xin Li | 2024-06-08 | 1 | -0/+28
FRED CPU state is managed in 9 new FRED MSRs, in addition to a few existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP. Save/restore/migrate the FRED MSRs if FRED is exposed to the guest.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: Constify VMState in machine.c | Richard Henderson | 2023-12-29 | 1 | -64/+64
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231221031652.119827-9-richard.henderson@linaro.org>
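The -64/+64 diffstat suggests a mechanical change: the VMState descriptions pick up const qualifiers so they can live in read-only data. A representative before/after sketch, using a made-up subsection name and placeholder field:

    /* Before (illustrative): mutable description and field array. */
    static VMStateDescription vmstate_example = {
        .name = "cpu/example",
        .fields = (VMStateField[]) {
            VMSTATE_UINT64(env.some_msr, X86CPU),   /* placeholder field */
            VMSTATE_END_OF_LIST()
        }
    };

    /* After (illustrative): the description and field array are const. */
    static const VMStateDescription vmstate_example = {
        .name = "cpu/example",
        .fields = (const VMStateField[]) {
            VMSTATE_UINT64(env.some_msr, X86CPU),
            VMSTATE_END_OF_LIST()
        }
    };
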
* i386: spelling fixes | Michael Tokarev | 2023-09-20 | 1 | -2/+2
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
* i386/xen: handle PV timer hypercalls | Joao Martins | 2023-03-01 | 1 | -0/+1
Introduce support for the one-shot and periodic modes of Xen PV timers, whereby timer interrupts come through a special virq event channel, with deadlines being set through:

1) the set_timer_op hypercall (oneshot only)
2) the vcpu_op hypercall for the {set,stop}_{singleshot,periodic}_timer hypercalls

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* hw/xen: Implement EVTCHNOP_bind_virq | David Woodhouse | 2023-03-01 | 1 | -0/+2
Add the array of virq ports to each vCPU so that we can deliver timers, debug ports, etc. Global virqs are allocated against vCPU 0 initially, but can be migrated to other vCPUs (when we implement that).

The kernel needs to know about VIRQ_TIMER in order to accelerate timers, so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value of the singleshot timer across migration, as the kernel will handle the hypercalls automatically now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* i386/xen: implement HVMOP_set_evtchn_upcall_vector | Ankur Arora | 2023-03-01 | 1 | -0/+1
The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall vector, to be delivered to the local APIC just like an MSI (with an EOI). This takes precedence over the system-wide delivery method set by the HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by Windows and Xen (PV shim) guests but normally not by Linux.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* i386/xen: handle VCPUOP_register_runstate_memory_area | Joao Martins | 2023-03-01 | 1 | -0/+1
Allow the guest to set up the vCPU runstate area, which is used as the steal clock.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* i386/xen: handle VCPUOP_register_vcpu_time_info | Joao Martins | 2023-03-01 | 1 | -0/+1
This is needed in order to support the Linux vDSO under Xen.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* i386/xen: handle VCPUOP_register_vcpu_info | Joao Martins | 2023-03-01 | 1 | -0/+19
Handle the hypercall to set the per-vCPU vcpu_info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM, a vCPU thread must set its *own* vcpu_info rather than it being set from the context in which the hypercall is invoked.

Add the vcpu_info (and default) GPA to vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
* i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault | Chenyi Qiang | 2022-10-10 | 1 | -0/+20
For direct triple faults, i.e. hardware-detected and morphed into a VM-Exit by KVM, KVM will never lose them. But for triple faults synthesized by KVM, e.g. the RSM path, if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault.

A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the event.triple_fault_pending field contains a valid state if the KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled.

Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
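On the QEMU side this plugs into the kvm_vcpu_events exchange. A rough sketch of the save direction, assuming QEMU caches the state in an env->triple_fault_pending flag and probes the capability into a has_triple_fault_event variable (both names, and the exact uapi member layout, are assumptions to check against the headers):

    /* Sketch of a fragment inside the "put vcpu events" path. */
    if (has_triple_fault_event) {
        events.flags |= KVM_VCPUEVENT_VALID_TRIPLE_FAULT;
        events.triple_fault.pending = env->triple_fault_pending;
    }
    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
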
* target/i386: Enable Arch LBR migration states in vmstate | Yang Weijiang | 2022-05-14 | 1 | -0/+38
The Arch LBR record MSRs and control MSRs will be migrated to destination guest if the vcpus were running with Arch LBR active.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220215195258.29149-8-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: Support XFD and AMX xsave data migration | Zeng Guang | 2022-03-15 | 1 | -0/+46
XFD (eXtended Feature Disable) allows a feature to be enabled in xsave state while preventing specific user threads from using it. Save and restore the XFD MSRs if CPUID.D.1.EAX[4] enumerates them as valid, and likewise migrate the MSRs and the related xsave state as needed.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220217060434.52460-8-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: use KVM_{GET|SET}_SREGS2 when supported. | Maxim Levitsky | 2022-01-12 | 1 | -0/+29
This allows making the PDPTRs part of the migration stream, so they do not have to be reloaded after migration, which would be against the x86 spec.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
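KVM_GET_SREGS2 returns the PDPTRs alongside the regular special registers, so QEMU can carry them verbatim instead of re-deriving them from CR3 after migration. A rough sketch of reading them, with env->pdptrs[] assumed as the QEMU-side storage and the uapi names taken from the kernel headers (worth double-checking):

    /* Sketch: fetch PDPTRs via KVM_GET_SREGS2 so they can be migrated. */
    struct kvm_sregs2 sregs2;
    int i, ret;

    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS2, &sregs2);
    if (ret < 0) {
        return ret;
    }
    if (sregs2.flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
        for (i = 0; i < 4; i++) {
            env->pdptrs[i] = sregs2.pdptrs[i];   /* assumed QEMU field name */
        }
    }
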
* KVM: SVM: add migration support for nested TSC scaling | Maxim Levitsky | 2021-11-02 | 1 | -0/+22
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* i386: Add get/set/migrate support for SGX_LEPUBKEYHASH MSRs | Sean Christopherson | 2021-09-30 | 1 | -0/+20
On real hardware, on systems that support SGX Launch Control, those MSRs are initialized to the digest of Intel's signing key; on systems that don't support SGX Launch Control, those MSRs are not available, but hardware always uses the digest of Intel's signing key in EINIT.

KVM advertises SGX LC via CPUID if and only if the MSRs are writable. Unconditionally initialize those MSRs to the digest of Intel's signing key when the CPU is realized and reset, to reflect that fact. This avoids a potential bug in case kvm_arch_put_registers() is called before kvm_arch_get_registers(): in that case the guest's virtual SGX_LEPUBKEYHASH MSRs would be set to 0, even though KVM initializes them to the digest of Intel's signing key by default, since KVM allows those MSRs to be updated by QEMU to support live migration.

Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they are writable by the guest.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-11-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: Moved int_ctl into CPUX86State structure | Lara Lazier | 2021-09-13 | 1 | -1/+21
Moved int_ctl into the CPUX86State structure. It removes some unnecessary stores and loads, and prepares for tracking the vIRQ state even when it is masked due to vGIF.

Signed-off-by: Lara Lazier <laramglazier@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* vmstate: Constify some VMStateDescriptions | Keqian Zhu | 2021-05-02 | 1 | -1/+1
Constify vmstate_ecc_state and vmstate_x86_cpu.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210408140706.23412-1-zhukeqian1@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* i386: Make migration fail when Hyper-V reenlightenment was enabled but 'user_tsc_khz' is unset | Vitaly Kuznetsov | 2021-03-19 | 1 | -0/+20
KVM doesn't fully support Hyper-V reenlightenment notifications on migration. In particular, it doesn't support emulating the TSC frequency of the source host by trapping all TSC accesses, so unless TSC scaling is supported on the destination host and KVM_SET_TSC_KHZ succeeds, it is unsafe to proceed with migration.

KVM_SET_TSC_KHZ is called from two sites: kvm_arch_init_vcpu() and kvm_arch_put_registers(). The latter (intentionally) doesn't propagate errors, allowing migrations to succeed even when TSC scaling is not supported on the destination. This doesn't suit the 're-enlightenment' use-case, as we have to guarantee that the TSC frequency stays constant. Require the 'tsc-frequency=' command line option to be specified for successful migration when re-enlightenment was enabled by the guest.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210319123801.1111090-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* i386: Fix 'hypercall_hypercall' typo | Vitaly Kuznetsov | 2021-03-19 | 1 | -2/+2
Given that the name of this section is 'cpu/msr_hyperv_hypercall', 'hypercall_hypercall' is clearly a typo.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210318160249.1084178-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Various spelling fixes | Michael Tokarev | 2021-03-09 | 1 | -1/+1
An assorted set of spelling fixes in various places.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210309111510.79495-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* target/i386: implement PKS | Paolo Bonzini | 2021-02-08 | 1 | -4/+20
Protection Keys for Supervisor-mode pages is a simple extension of the PKU feature that QEMU already implements. For supervisor-mode pages, protection key restrictions come from a new MSR. The MSR has no XSAVE state associated with it.

PKS is only respected in long mode. However, in principle it is possible to set the MSR even outside long mode, and in fact even the XSAVE state for PKRU could be set outside long mode using XRSTOR. So do not limit the migration subsections for PKRU and PKRS to long mode.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* i386: move kvm accel files into kvm/ | Claudio Fontana | 2020-12-16 | 1 | -2/+2
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201212155530.23098-2-cfontana@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* target/i386: support KVM_FEATURE_ASYNC_PF_INT | Vitaly Kuznetsov | 2020-09-30 | 1 | -0/+19
Linux 5.8 introduced an interrupt-based mechanism for 'page ready' event delivery and disabled the old, #PF-based one (see commit 2635b5c4a0e4 "KVM: x86: interrupt based APF 'page ready' event delivery"). The Linux guest switches to using it in 5.9 (see commit b1d405751cd5 "KVM: x86: Switch KVM guest to using interrupts for page ready APF delivery"). The feature has a new KVM_FEATURE_ASYNC_PF_INT bit assigned, and the interrupt vector is set in the MSR_KVM_ASYNC_PF_INT MSR. Support this in QEMU.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200908141206.357450-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: add support for AMD nested live migration | Paolo Bonzini | 2020-07-10 | 1 | -1/+30
Support for nested guest live migration is part of Linux 5.8; add the corresponding code to QEMU. The migration format consists of a few flags and an opaque 4k blob. The blob is in VMCB format (the control area represents the L1 VMCB control fields, the save area represents the pre-vmentry state; KVM does not use the host save area since the AMD manual allows that), but QEMU does not really care about that. However, the flags need to be copied to hflags/hflags2 and back.

In addition, support for retrieving and setting the AMD nested virtualization states allows the L1 guest to be reset while running a nested guest, but a small bug in CPU reset needs to be fixed for that to work.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix some comment spelling errors. | Cameron Esfahani | 2019-12-18 | 1 | -4/+4
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <086c197db928384b8697edfa64755e2cb46c8100.1575685843.git.dirty@apple.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: move more x86-generic functions out of PC files | Paolo Bonzini | 2019-12-17 | 1 | -1/+1
These are needed by microvm too, so move them outside of PC-specific files. With this patch, microvm.c need not include pc.h anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: add support for MSR_IA32_TSX_CTRL | Paolo Bonzini | 2019-11-21 | 1 | -0/+20
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the Trusty Side-channel Extension). By virtualizing the MSR, KVM guests can disable TSX and avoid paying the price of mitigating TSX-based attacks on microarchitectural side channels.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* hw/i386: Move save_tsc_khz from PCMachineClass to X86MachineClass | Liam Merwick | 2019-11-19 | 1 | -2/+2
Attempting to migrate a VM using the microvm machine class results in the source QEMU aborting with the following message/backtrace:

    target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an instance of type generic-pc-machine
    abort()
    object_class_dynamic_cast_assert()
    vmstate_save_state_v()
    vmstate_save_state()
    vmstate_save()
    qemu_savevm_state_complete_precopy()
    migration_thread()
    migration_thread()
    migration_thread()
    qemu_thread_start()
    start_thread()
    clone()

The access to the machine class returned by MACHINE_GET_CLASS() in tsc_khz_needed() crashes because it tries to dereference a different type of machine class object (TYPE_PC_MACHINE) than that of this microVM. This can be resolved by extending the changes in commit f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it") and moving the save_tsc_khz field in PCMachineClass to X86MachineClass.

Fixes: f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it")
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: Add support for save/load IA32_UMWAIT_CONTROL MSR | Tao Xu | 2019-10-23 | 1 | -0/+20
The UMWAIT and TPAUSE instructions use the 32-bit IA32_UMWAIT_CONTROL MSR at index 0xE1 to determine the maximum time in TSC quanta that the processor can reside in either C0.1 or C0.2. This patch adds support for saving/loading the IA32_UMWAIT_CONTROL MSR in the guest.

Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-3-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
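Besides the vmstate subsection in machine.c, an MSR like this is normally added to the list that kvm_put_msrs()/kvm_get_msrs() exchanges with the kernel. A rough sketch of the save direction, with env->umwait assumed as the name of QEMU's cached copy (the feature-bit gate and MSR index are taken from QEMU's cpu.h definitions, but worth verifying):

    /* Sketch: add the MSR to the buffer that kvm_put_msrs() hands to KVM,
     * only when the guest actually has WAITPKG. */
    if (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_WAITPKG) {
        kvm_msr_entry_add(cpu, MSR_IA32_UMWAIT_CONTROL, env->umwait);
    }
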
* kvm: i386: halt poll control MSR support | Marcelo Tosatti | 2019-08-20 | 1 | -0/+20
Add support for halt poll control MSR: save/restore, migration and new feature name. The purpose of this MSR is to allow the guest to disable host halt poll.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Include hw/boards.h a bit less | Markus Armbruster | 2019-08-16 | 1 | -1/+0
hw/boards.h pulls in almost 60 headers. The less we include it into headers, the better. As a first step, drop superfluous inclusions, and downgrade some more to what's actually needed. Gets rid of just one inclusion into a header.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
* Include hw/hw.h exactly where needed | Markus Armbruster | 2019-08-16 | 1 | -1/+0
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
* target/i386: skip KVM_GET/SET_NESTED_STATE if VMX disabled, or for SVM | Paolo Bonzini | 2019-07-19 | 1 | -20/+1
Do not allocate env->nested_state unless we later need to migrate the nested virtualization state. With this change, nested_state_needed() will return false if the VMX flag is not included in the virtual machine. KVM_GET/SET_NESTED_STATE is also disabled for SVM which is safer (we know that at least the NPT root and paging mode have to be saved/loaded), and thus the corresponding subsection can go away as well.

Inspired by a patch from Liran Alon.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX | Liran Alon | 2019-07-19 | 1 | -4/+20
Prior to this change, a vCPU exposed with VMX running on a kernel without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in adding a migration blocker. This was because, when the code was written, it was thought there was no way to reliably know at runtime whether a vCPU is utilising VMX or not.

However, it turns out that this can be known to some extent: in order for a vCPU to enter VMX operation it must have CR4.VMXE set. Since it was set, CR4.VMXE must remain set as long as the vCPU is in VMX operation, because CR4.VMXE is one of the bits set in MSR_IA32_VMX_CR4_FIXED1.

There is one exception to the above statement when the vCPU enters SMM mode. When a vCPU enters SMM mode, it temporarily exits VMX operation and may also reset CR4.VMXE during execution in SMM mode. When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation and CR4.VMXE is restored to its original state of being set. Therefore, when the vCPU is not in SMM mode, we can infer whether VMX is being used by examining CR4.VMXE. Otherwise, we cannot know for certain, but assume the worst: that the vCPU may be utilising VMX.

Summarizing the above, a vCPU may have enabled VMX if CR4.VMXE is set or the vCPU is in SMM mode. Therefore, remove the migration blocker and instead check before migration (in cpu_pre_save()) whether the vCPU may have enabled VMX. Only if true, require the relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in guest-mode and there is a pending/injected exception. Otherwise, this kernel capability is not required for proper migration.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
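The "may have enabled VMX" test described above boils down to checking CR4.VMXE unless the vCPU is in SMM. A minimal sketch of that predicate (the helper name is an assumption; QEMU's actual helper and mask macros may differ):

    /* Sketch of the predicate described in the commit message.
     * CR4.VMXE is architecturally bit 13; HF_SMM_MASK is QEMU's
     * "in SMM" hflag. */
    static bool cpu_vmx_maybe_enabled(CPUX86State *env)
    {
        return (env->cr[4] & (1ULL << 13)) ||   /* CR4.VMXE set */
               (env->hflags & HF_SMM_MASK);     /* or can't tell: in SMM */
    }
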
* target/i386: kvm: Fix when nested state is needed for migration | Liran Alon | 2019-07-05 | 1 | -3/+2
When a vCPU is in VMX operation and enters SMM mode, it temporarily exits VMX operation, but the KVM-maintained nested state still stores the VMXON region physical address, i.e. even when the vCPU is in SMM mode, (nested_state->hdr.vmx.vmxon_pa != -1ull). Therefore, there is no need to explicitly check for KVM_STATE_NESTED_SMM_VMXON to determine if it is necessary to save nested state as part of the migration stream.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190624230514.53326-1-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities | Liran Alon | 2019-06-21 | 1 | -1/+1
Previous commits have added support for migration of nested virtualization workloads. This was done by utilising two new KVM capabilities: KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD, both of which are required in order to correctly migrate such workloads. Therefore, change the code to add a migration blocker for vCPUs exposed with Intel VMX or AMD SVM in case one of these kernel capabilities is missing.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Message-Id: <20190619162140.133674-11-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD | Liran Alon | 2019-06-21 | 1 | -1/+83
Kernel commit c4f55198c7c2 ("kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD") introduced a new KVM capability which allows userspace to correctly distinguish between pending and injected exceptions. This distinction is important in nested virtualization scenarios because an L2 pending exception can still be intercepted by the L1 hypervisor, while an L2 injected exception cannot.

Furthermore, when an exception is to be injected by QEMU, QEMU should specify the exception payload (CR2 in case of #PF or DR6 in case of #DB) instead of having the payload already delivered in the respective vCPU register. This is because, if the exception is injected into an L2 guest and is intercepted by the L1 hypervisor, the payload needs to be reported to the L1 intercept (VM-exit handler) while the respective vCPU register is preserved unchanged.

This commit adds support for QEMU to properly utilise this new KVM capability (KVM_CAP_EXCEPTION_PAYLOAD).

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-10-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: kvm: Add support for save and restore nested state | Liran Alon | 2019-06-21 | 1 | -0/+198
Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") introduced new IOCTLs to extract and restore vCPU state related to Intel VMX & AMD SVM. Utilize these IOCTLs to add support for migration of VMs which are running nested hypervisors.

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-9-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* i386/kvm: convert hyperv enlightenments properties from bools to bits | Vitaly Kuznetsov | 2019-06-21 | 1 | -1/+1
Representing Hyper-V properties as bits will allow us to check features and dependencies between them in a natural way.

Suggested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2019-06-11-v3' into staging | Peter Maydell | 2019-06-12 | 1 | -1/+1
Miscellaneous patches for 2019-06-11

# gpg: Signature made Wed 12 Jun 2019 12:20:41 BST
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2019-06-11-v3:
  MAINTAINERS: Polish headline decorations
  MAINTAINERS: Improve section headlines
  MAINTAINERS: Remove duplicate entries of qemu-devel@nongnu.org
  Clean up a header guard symbols (again)
  Supply missing header guards
  Clean up a few header guard symbols
  scripts/clean-header-guards: Fix handling of trailing comments
  Normalize position of header guard
  Include qemu-common.h exactly where needed
  Include qemu/module.h where needed, drop it from qemu-common.h
  qemu-common: Move qemu_isalnum() etc. to qemu/ctype.h
  qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  * qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h | Markus Armbruster | 2019-06-11 | 1 | -1/+1
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in qemu-common.h. This necessitates inclusion of qemu-common.h into headers, which is against the rules spelled out in qemu-common.h's file comment. Move tcg_enabled() & friends into their own header sysemu/tcg.h, and adjust #include directives.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for accel/tcg/tcg-all.c]
* i386: Save EFER for 32-bit targets | Pavel Dovgalyuk | 2019-06-11 | 1 | -0/+24
i386 (32-bit) emulation uses EFER in wrmsr and in MMU fault processing, but EFER is not included in VMState, because the "efer" field in the VMState description is only built for 64-bit targets. This patch adds a section for 32-bit targets which saves EFER when its value is non-zero.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <155913371654.8429.1659082639780315242.stgit@pasha-Precision-3630-Tower>
Reviewed-by: Peter Xu <peterx@redhat.com>
[ehabkost: indentation fix]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
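The added section presumably follows the usual subsection pattern (compare the exception error_code sketch near the top of this log), guarded so it only exists in 32-bit builds. A rough sketch of the distinguishing parts, with the names assumed for illustration:

    #ifndef TARGET_X86_64
    /* Sketch: only 32-bit targets need this; 64-bit targets already
     * migrate EFER in the main field list. Names are assumptions. */
    static bool efer32_needed(void *opaque)
    {
        X86CPU *cpu = opaque;

        return cpu->env.efer != 0;   /* skip the subsection when EFER is 0 */
    }
    #endif
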
* hyperv: qom-ify SynIC | Roman Kagan | 2018-10-19 | 1 | -0/+9
Make the Hyper-V SynIC a device which is attached as a child to a CPU. For now it only makes SynIC visible in the QOM hierarchy, and maintains its internal fields in sync with the respective MSRs of the parent CPU (the fields will be used in followup patches).

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* i386: do not migrate MSR_SMI_COUNT on machine types <2.12 | Paolo Bonzini | 2018-07-30 | 1 | -1/+1
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it on older machine types, or the subsection causes a load failure for guests that use SMM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>