hypervisor: 0.913 debug: 0.901 x86: 0.888 operating system: 0.737 kernel: 0.672 KVM: 0.636 virtual: 0.626 TCG: 0.371 user-level: 0.216 register: 0.110 files: 0.107 socket: 0.102 PID: 0.097 performance: 0.063 device: 0.051 network: 0.047 VMM: 0.035 ppc: 0.032 permissions: 0.024 boot: 0.021 risc-v: 0.020 vnc: 0.019 architecture: 0.016 semantic: 0.014 alpha: 0.013 assembly: 0.010 peripherals: 0.008 i386: 0.004 arm: 0.003 graphic: 0.001 mistranslation: 0.001 [BUG] Migration hv_time rollback Hi, We are experiencing timestamp rollbacks during live-migration of Windows 10 guests with the following qemu configuration (linux 5.4.46 and qemu master): ``` $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] ``` I have tracked the bug to the fact that `kvmclock` is not exposed and disabled from qemu PoV but is in fact used by `hv-time` (in KVM). I think we should enable the `kvmclock` (qemu device) if `hv-time` is present and add Hyper-V support for the `kvmclock_current_nsec` function. I'm asking for advice because I am unsure this is the _right_ approach and how to keep migration compatibility between qemu versions. Thank you all, -- Antoine 'xdbob' Damhet signature.asc Description: PGP signature cc'ing in Vitaly who knows about the hv stuff. * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > Hi, > > We are experiencing timestamp rollbacks during live-migration of > Windows 10 guests with the following qemu configuration (linux 5.4.46 > and qemu master): > ``` > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > ``` How big a jump are you seeing, and how did you notice it in the guest? Dave > I have tracked the bug to the fact that `kvmclock` is not exposed and > disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > > I think we should enable the `kvmclock` (qemu device) if `hv-time` is > present and add Hyper-V support for the `kvmclock_current_nsec` > function. > > I'm asking for advice because I am unsure this is the _right_ approach > and how to keep migration compatibility between qemu versions. > > Thank you all, > > -- > Antoine 'xdbob' Damhet -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK "Dr. David Alan Gilbert" writes: > cc'ing in Vitaly who knows about the hv stuff. > cc'ing Marcelo who knows about clocksources :-) > * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > > Hi, > > > > We are experiencing timestamp rollbacks during live-migration of > > Windows 10 guests Are you migrating to the same hardware (with the same TSC frequency)? Is TSC used as the clocksource on the host? > > with the following qemu configuration (linux 5.4.46 > > and qemu master): > > ``` > > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > > ``` Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is not going to check for KVM identification anyway so we pretend we're Hyper-V. Also, have you tried adding more Hyper-V enlightenments? > > How big a jump are you seeing, and how did you notice it in the guest? > > Dave > > > I have tracked the bug to the fact that `kvmclock` is not exposed and > > disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > > > > I think we should enable the `kvmclock` (qemu device) if `hv-time` is > > present and add Hyper-V support for the `kvmclock_current_nsec` > > function. AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by the guest: if (!(env->system_time_msr & 1ULL)) { /* KVM clock not active */ return 0; } and this is (and way) always false for Windows guests. > > > > I'm asking for advice because I am unsure this is the _right_ approach > > and how to keep migration compatibility between qemu versions. > > > > Thank you all, > > > > -- > > Antoine 'xdbob' Damhet -- Vitaly On Wed, Sep 16, 2020 at 01:59:43PM +0200, Vitaly Kuznetsov wrote: > "Dr. David Alan Gilbert" writes: > > > cc'ing in Vitaly who knows about the hv stuff. > > > > cc'ing Marcelo who knows about clocksources :-) > > > * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > >> Hi, > >> > >> We are experiencing timestamp rollbacks during live-migration of > >> Windows 10 guests > > Are you migrating to the same hardware (with the same TSC frequency)? Is > TSC used as the clocksource on the host? Yes we are migrating to the exact same hardware. And yes TSC is used as a clocksource in the host (but the bug is still happening with `hpet` as a clocksource). > > >> with the following qemu configuration (linux 5.4.46 > >> and qemu master): > >> ``` > >> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > >> ``` > > Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is > not going to check for KVM identification anyway so we pretend we're > Hyper-V. Some softwares explicitly checks for the presence of KVM and then crash if they find it in CPUID :/ > > Also, have you tried adding more Hyper-V enlightenments? Yes, I published a stripped-down command-line for a minimal reproducer but even `hv-frequencies` and `hv-reenlightenment` don't help. > > > > > How big a jump are you seeing, and how did you notice it in the guest? > > > > Dave > > > >> I have tracked the bug to the fact that `kvmclock` is not exposed and > >> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > >> > >> I think we should enable the `kvmclock` (qemu device) if `hv-time` is > >> present and add Hyper-V support for the `kvmclock_current_nsec` > >> function. > > AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by > the guest: > > if (!(env->system_time_msr & 1ULL)) { > /* KVM clock not active */ > return 0; > } > > and this is (and way) always false for Windows guests. Hooo, I missed this piece. When is `clock_is_reliable` expected to be false ? Because if it is I still think we should be able to query at least `HV_X64_MSR_REFERENCE_TSC` > > >> > >> I'm asking for advice because I am unsure this is the _right_ approach > >> and how to keep migration compatibility between qemu versions. > >> > >> Thank you all, > >> > >> -- > >> Antoine 'xdbob' Damhet > > -- > Vitaly > -- Antoine 'xdbob' Damhet signature.asc Description: PGP signature On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: > cc'ing in Vitaly who knows about the hv stuff. Thanks > > * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > > Hi, > > > > We are experiencing timestamp rollbacks during live-migration of > > Windows 10 guests with the following qemu configuration (linux 5.4.46 > > and qemu master): > > ``` > > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > > ``` > > How big a jump are you seeing, and how did you notice it in the guest? I'm seeing jumps of about the guest uptime (indicating a reset of the counter). It's expected because we won't call `KVM_SET_CLOCK` to restore any value. We first noticed it because after some migrations `dwm.exe` crashes with the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in the past." error code. I can also confirm the following hack makes the behavior disappear: ``` diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c index 64283358f9..f334bdf35f 100644 --- a/hw/i386/kvm/clock.c +++ b/hw/i386/kvm/clock.c @@ -332,11 +332,7 @@ void kvmclock_create(void) { X86CPU *cpu = X86_CPU(first_cpu); - if (kvm_enabled() && - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); - } + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); } static void kvmclock_register_types(void) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 32b1453e6a..11d980ba85 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -158,9 +158,7 @@ static void pc_init1(MachineState *machine, x86_cpus_init(x86ms, pcmc->default_cpu_version); - if (kvm_enabled() && pcmc->kvmclock_enabled) { - kvmclock_create(); - } + kvmclock_create(); if (pcmc->pci_enabled) { pci_memory = g_new(MemoryRegion, 1); ``` > > Dave > > > I have tracked the bug to the fact that `kvmclock` is not exposed and > > disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > > > > I think we should enable the `kvmclock` (qemu device) if `hv-time` is > > present and add Hyper-V support for the `kvmclock_current_nsec` > > function. > > > > I'm asking for advice because I am unsure this is the _right_ approach > > and how to keep migration compatibility between qemu versions. > > > > Thank you all, > > > > -- > > Antoine 'xdbob' Damhet > > > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > -- Antoine 'xdbob' Damhet signature.asc Description: PGP signature Antoine Damhet writes: > On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: > > cc'ing in Vitaly who knows about the hv stuff. > > Thanks > > > > > * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > > > Hi, > > > > > > We are experiencing timestamp rollbacks during live-migration of > > > Windows 10 guests with the following qemu configuration (linux 5.4.46 > > > and qemu master): > > > ``` > > > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > > > ``` > > > > How big a jump are you seeing, and how did you notice it in the guest? > > I'm seeing jumps of about the guest uptime (indicating a reset of the > counter). It's expected because we won't call `KVM_SET_CLOCK` to > restore any value. > > We first noticed it because after some migrations `dwm.exe` crashes with > the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in > the past." error code. > > I can also confirm the following hack makes the behavior disappear: > > ``` > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > index 64283358f9..f334bdf35f 100644 > --- a/hw/i386/kvm/clock.c > +++ b/hw/i386/kvm/clock.c > @@ -332,11 +332,7 @@ void kvmclock_create(void) > { > X86CPU *cpu = X86_CPU(first_cpu); > > - if (kvm_enabled() && > - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | > - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { > - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); > - } > + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); > } > Oh, I think I see what's going on. When you add 'kvm=off' cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on migration. In case we really want to support 'kvm=off' I think we can add Hyper-V features check here along with KVM, this should do the job. -- Vitaly Vitaly Kuznetsov writes: > Antoine Damhet writes: > > > On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: > >> cc'ing in Vitaly who knows about the hv stuff. > > > > Thanks > > > >> > >> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: > >> > Hi, > >> > > >> > We are experiencing timestamp rollbacks during live-migration of > >> > Windows 10 guests with the following qemu configuration (linux 5.4.46 > >> > and qemu master): > >> > ``` > >> > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] > >> > ``` > >> > >> How big a jump are you seeing, and how did you notice it in the guest? > > > > I'm seeing jumps of about the guest uptime (indicating a reset of the > > counter). It's expected because we won't call `KVM_SET_CLOCK` to > > restore any value. > > > > We first noticed it because after some migrations `dwm.exe` crashes with > > the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in > > the past." error code. > > > > I can also confirm the following hack makes the behavior disappear: > > > > ``` > > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > > index 64283358f9..f334bdf35f 100644 > > --- a/hw/i386/kvm/clock.c > > +++ b/hw/i386/kvm/clock.c > > @@ -332,11 +332,7 @@ void kvmclock_create(void) > > { > > X86CPU *cpu = X86_CPU(first_cpu); > > > > - if (kvm_enabled() && > > - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | > > - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) > > { > > - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); > > - } > > + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); > > } > > > > > Oh, I think I see what's going on. When you add 'kvm=off' > cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so > kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on > migration. > > In case we really want to support 'kvm=off' I think we can add Hyper-V > features check here along with KVM, this should do the job. Does the untested diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c index 64283358f91d..e03b2ca6d8f6 100644 --- a/hw/i386/kvm/clock.c +++ b/hw/i386/kvm/clock.c @@ -333,8 +333,9 @@ void kvmclock_create(void) X86CPU *cpu = X86_CPU(first_cpu); if (kvm_enabled() && - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { + ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | + (1ULL << KVM_FEATURE_CLOCKSOURCE2))) || + (cpu->env.features[FEAT_HYPERV_EAX] & HV_TIME_REF_COUNT_AVAILABLE))) { sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); } } help? (I don't think we need to remove all 'if (kvm_enabled())' checks from machine types as 'kvm=off' should not be related). -- Vitaly On Wed, Sep 16, 2020 at 02:50:56PM +0200, Vitaly Kuznetsov wrote: [...] > >> > > > > > > Oh, I think I see what's going on. When you add 'kvm=off' > > cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so > > kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on > > migration. > > > > In case we really want to support 'kvm=off' I think we can add Hyper-V > > features check here along with KVM, this should do the job. > > Does the untested > > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > index 64283358f91d..e03b2ca6d8f6 100644 > --- a/hw/i386/kvm/clock.c > +++ b/hw/i386/kvm/clock.c > @@ -333,8 +333,9 @@ void kvmclock_create(void) > X86CPU *cpu = X86_CPU(first_cpu); > > if (kvm_enabled() && > - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | > - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { > + ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | > + (1ULL << > KVM_FEATURE_CLOCKSOURCE2))) || > + (cpu->env.features[FEAT_HYPERV_EAX] & > HV_TIME_REF_COUNT_AVAILABLE))) { > sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); > } > } > > help? It appears to work :) > > (I don't think we need to remove all 'if (kvm_enabled())' checks from > machine types as 'kvm=off' should not be related). Indeed (I didn't look at the macro, it was just quick & dirty). > > -- > Vitaly > > -- Antoine 'xdbob' Damhet signature.asc Description: PGP signature On 16/09/20 13:29, Dr. David Alan Gilbert wrote: > > I have tracked the bug to the fact that `kvmclock` is not exposed and > > disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > > > > I think we should enable the `kvmclock` (qemu device) if `hv-time` is > > present and add Hyper-V support for the `kvmclock_current_nsec` > > function. Yes, this seems correct. I would have to check but it may even be better to _always_ send kvmclock data in the live migration stream. Paolo Paolo Bonzini writes: > On 16/09/20 13:29, Dr. David Alan Gilbert wrote: > >> I have tracked the bug to the fact that `kvmclock` is not exposed and > >> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). > >> > >> I think we should enable the `kvmclock` (qemu device) if `hv-time` is > >> present and add Hyper-V support for the `kvmclock_current_nsec` > >> function. > > Yes, this seems correct. I would have to check but it may even be > better to _always_ send kvmclock data in the live migration stream. > The question I have is: with 'kvm=off', do we actually restore TSC reading on migration? (and I guess the answer is 'no' or Hyper-V TSC page would 'just work' I guess). So yea, maybe dropping the 'cpu->env.features[FEAT_KVM]' check is the right fix. -- Vitaly