diff options
Diffstat (limited to 'results/classifier/001/other/64571620')
| -rw-r--r-- | results/classifier/001/other/64571620 | 785 |
1 files changed, 785 insertions, 0 deletions
diff --git a/results/classifier/001/other/64571620 b/results/classifier/001/other/64571620 new file mode 100644 index 00000000..17e9f6ea --- /dev/null +++ b/results/classifier/001/other/64571620 @@ -0,0 +1,785 @@ +other: 0.922 +mistranslation: 0.917 +semantic: 0.903 +instruction: 0.894 + +[BUG] Migration hv_time rollback + +Hi, + +We are experiencing timestamp rollbacks during live-migration of +Windows 10 guests with the following qemu configuration (linux 5.4.46 +and qemu master): +``` +$ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +``` + +I have tracked the bug to the fact that `kvmclock` is not exposed and +disabled from qemu PoV but is in fact used by `hv-time` (in KVM). + +I think we should enable the `kvmclock` (qemu device) if `hv-time` is +present and add Hyper-V support for the `kvmclock_current_nsec` +function. + +I'm asking for advice because I am unsure this is the _right_ approach +and how to keep migration compatibility between qemu versions. + +Thank you all, + +-- +Antoine 'xdbob' Damhet +signature.asc +Description: +PGP signature + +cc'ing in Vitaly who knows about the hv stuff. + +* Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +Hi, +> +> +We are experiencing timestamp rollbacks during live-migration of +> +Windows 10 guests with the following qemu configuration (linux 5.4.46 +> +and qemu master): +> +``` +> +$ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +``` +How big a jump are you seeing, and how did you notice it in the guest? + +Dave + +> +I have tracked the bug to the fact that `kvmclock` is not exposed and +> +disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +> +I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +present and add Hyper-V support for the `kvmclock_current_nsec` +> +function. +> +> +I'm asking for advice because I am unsure this is the _right_ approach +> +and how to keep migration compatibility between qemu versions. +> +> +Thank you all, +> +> +-- +> +Antoine 'xdbob' Damhet +-- +Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK + +"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: + +> +cc'ing in Vitaly who knows about the hv stuff. +> +cc'ing Marcelo who knows about clocksources :-) + +> +* Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +> Hi, +> +> +> +> We are experiencing timestamp rollbacks during live-migration of +> +> Windows 10 guests +Are you migrating to the same hardware (with the same TSC frequency)? Is +TSC used as the clocksource on the host? + +> +> with the following qemu configuration (linux 5.4.46 +> +> and qemu master): +> +> ``` +> +> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +> ``` +Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is +not going to check for KVM identification anyway so we pretend we're +Hyper-V. + +Also, have you tried adding more Hyper-V enlightenments? + +> +> +How big a jump are you seeing, and how did you notice it in the guest? +> +> +Dave +> +> +> I have tracked the bug to the fact that `kvmclock` is not exposed and +> +> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +> +> +> I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +> present and add Hyper-V support for the `kvmclock_current_nsec` +> +> function. +AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by +the guest: + + if (!(env->system_time_msr & 1ULL)) { + /* KVM clock not active */ + return 0; + } + +and this is (and way) always false for Windows guests. + +> +> +> +> I'm asking for advice because I am unsure this is the _right_ approach +> +> and how to keep migration compatibility between qemu versions. +> +> +> +> Thank you all, +> +> +> +> -- +> +> Antoine 'xdbob' Damhet +-- +Vitaly + +On Wed, Sep 16, 2020 at 01:59:43PM +0200, Vitaly Kuznetsov wrote: +> +"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: +> +> +> cc'ing in Vitaly who knows about the hv stuff. +> +> +> +> +cc'ing Marcelo who knows about clocksources :-) +> +> +> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +>> Hi, +> +>> +> +>> We are experiencing timestamp rollbacks during live-migration of +> +>> Windows 10 guests +> +> +Are you migrating to the same hardware (with the same TSC frequency)? Is +> +TSC used as the clocksource on the host? +Yes we are migrating to the exact same hardware. And yes TSC is used as +a clocksource in the host (but the bug is still happening with `hpet` as +a clocksource). + +> +> +>> with the following qemu configuration (linux 5.4.46 +> +>> and qemu master): +> +>> ``` +> +>> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +>> ``` +> +> +Out of pure curiosity, what's the purpose of doing 'kvm=off'? Windows is +> +not going to check for KVM identification anyway so we pretend we're +> +Hyper-V. +Some softwares explicitly checks for the presence of KVM and then crash +if they find it in CPUID :/ + +> +> +Also, have you tried adding more Hyper-V enlightenments? +Yes, I published a stripped-down command-line for a minimal reproducer +but even `hv-frequencies` and `hv-reenlightenment` don't help. + +> +> +> +> +> How big a jump are you seeing, and how did you notice it in the guest? +> +> +> +> Dave +> +> +> +>> I have tracked the bug to the fact that `kvmclock` is not exposed and +> +>> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +>> +> +>> I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +>> present and add Hyper-V support for the `kvmclock_current_nsec` +> +>> function. +> +> +AFAICT kvmclock_current_nsec() checks whether kvmclock was enabled by +> +the guest: +> +> +if (!(env->system_time_msr & 1ULL)) { +> +/* KVM clock not active */ +> +return 0; +> +} +> +> +and this is (and way) always false for Windows guests. +Hooo, I missed this piece. When is `clock_is_reliable` expected to be +false ? Because if it is I still think we should be able to query at +least `HV_X64_MSR_REFERENCE_TSC` + +> +> +>> +> +>> I'm asking for advice because I am unsure this is the _right_ approach +> +>> and how to keep migration compatibility between qemu versions. +> +>> +> +>> Thank you all, +> +>> +> +>> -- +> +>> Antoine 'xdbob' Damhet +> +> +-- +> +Vitaly +> +-- +Antoine 'xdbob' Damhet +signature.asc +Description: +PGP signature + +On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: +> +cc'ing in Vitaly who knows about the hv stuff. +Thanks + +> +> +* Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +> Hi, +> +> +> +> We are experiencing timestamp rollbacks during live-migration of +> +> Windows 10 guests with the following qemu configuration (linux 5.4.46 +> +> and qemu master): +> +> ``` +> +> $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +> ``` +> +> +How big a jump are you seeing, and how did you notice it in the guest? +I'm seeing jumps of about the guest uptime (indicating a reset of the +counter). It's expected because we won't call `KVM_SET_CLOCK` to +restore any value. + +We first noticed it because after some migrations `dwm.exe` crashes with +the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in +the past." error code. + +I can also confirm the following hack makes the behavior disappear: + +``` +diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c +index 64283358f9..f334bdf35f 100644 +--- a/hw/i386/kvm/clock.c ++++ b/hw/i386/kvm/clock.c +@@ -332,11 +332,7 @@ void kvmclock_create(void) + { + X86CPU *cpu = X86_CPU(first_cpu); + +- if (kvm_enabled() && +- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { +- sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +- } ++ sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); + } + + static void kvmclock_register_types(void) +diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c +index 32b1453e6a..11d980ba85 100644 +--- a/hw/i386/pc_piix.c ++++ b/hw/i386/pc_piix.c +@@ -158,9 +158,7 @@ static void pc_init1(MachineState *machine, + + x86_cpus_init(x86ms, pcmc->default_cpu_version); + +- if (kvm_enabled() && pcmc->kvmclock_enabled) { +- kvmclock_create(); +- } ++ kvmclock_create(); + + if (pcmc->pci_enabled) { + pci_memory = g_new(MemoryRegion, 1); +``` + +> +> +Dave +> +> +> I have tracked the bug to the fact that `kvmclock` is not exposed and +> +> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +> +> +> I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +> present and add Hyper-V support for the `kvmclock_current_nsec` +> +> function. +> +> +> +> I'm asking for advice because I am unsure this is the _right_ approach +> +> and how to keep migration compatibility between qemu versions. +> +> +> +> Thank you all, +> +> +> +> -- +> +> Antoine 'xdbob' Damhet +> +> +> +-- +> +Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK +> +-- +Antoine 'xdbob' Damhet +signature.asc +Description: +PGP signature + +Antoine Damhet <antoine.damhet@blade-group.com> writes: + +> +On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: +> +> cc'ing in Vitaly who knows about the hv stuff. +> +> +Thanks +> +> +> +> +> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +> > Hi, +> +> > +> +> > We are experiencing timestamp rollbacks during live-migration of +> +> > Windows 10 guests with the following qemu configuration (linux 5.4.46 +> +> > and qemu master): +> +> > ``` +> +> > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +> > ``` +> +> +> +> How big a jump are you seeing, and how did you notice it in the guest? +> +> +I'm seeing jumps of about the guest uptime (indicating a reset of the +> +counter). It's expected because we won't call `KVM_SET_CLOCK` to +> +restore any value. +> +> +We first noticed it because after some migrations `dwm.exe` crashes with +> +the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in +> +the past." error code. +> +> +I can also confirm the following hack makes the behavior disappear: +> +> +``` +> +diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c +> +index 64283358f9..f334bdf35f 100644 +> +--- a/hw/i386/kvm/clock.c +> ++++ b/hw/i386/kvm/clock.c +> +@@ -332,11 +332,7 @@ void kvmclock_create(void) +> +{ +> +X86CPU *cpu = X86_CPU(first_cpu); +> +> +- if (kvm_enabled() && +> +- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +> +- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { +> +- sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +> +- } +> ++ sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +> +} +> +Oh, I think I see what's going on. When you add 'kvm=off' +cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so +kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on +migration. + +In case we really want to support 'kvm=off' I think we can add Hyper-V +features check here along with KVM, this should do the job. + +-- +Vitaly + +Vitaly Kuznetsov <vkuznets@redhat.com> writes: + +> +Antoine Damhet <antoine.damhet@blade-group.com> writes: +> +> +> On Wed, Sep 16, 2020 at 12:29:56PM +0100, Dr. David Alan Gilbert wrote: +> +>> cc'ing in Vitaly who knows about the hv stuff. +> +> +> +> Thanks +> +> +> +>> +> +>> * Antoine Damhet (antoine.damhet@blade-group.com) wrote: +> +>> > Hi, +> +>> > +> +>> > We are experiencing timestamp rollbacks during live-migration of +> +>> > Windows 10 guests with the following qemu configuration (linux 5.4.46 +> +>> > and qemu master): +> +>> > ``` +> +>> > $ qemu-system-x86_64 -enable-kvm -cpu host,kvm=off,hv_time [...] +> +>> > ``` +> +>> +> +>> How big a jump are you seeing, and how did you notice it in the guest? +> +> +> +> I'm seeing jumps of about the guest uptime (indicating a reset of the +> +> counter). It's expected because we won't call `KVM_SET_CLOCK` to +> +> restore any value. +> +> +> +> We first noticed it because after some migrations `dwm.exe` crashes with +> +> the "(NTSTATUS) 0x8898009b - QueryPerformanceCounter returned a time in +> +> the past." error code. +> +> +> +> I can also confirm the following hack makes the behavior disappear: +> +> +> +> ``` +> +> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c +> +> index 64283358f9..f334bdf35f 100644 +> +> --- a/hw/i386/kvm/clock.c +> +> +++ b/hw/i386/kvm/clock.c +> +> @@ -332,11 +332,7 @@ void kvmclock_create(void) +> +> { +> +> X86CPU *cpu = X86_CPU(first_cpu); +> +> +> +> - if (kvm_enabled() && +> +> - cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +> +> - (1ULL << KVM_FEATURE_CLOCKSOURCE2))) +> +> { +> +> - sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +> +> - } +> +> + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +> +> } +> +> +> +> +> +Oh, I think I see what's going on. When you add 'kvm=off' +> +cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so +> +kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on +> +migration. +> +> +In case we really want to support 'kvm=off' I think we can add Hyper-V +> +features check here along with KVM, this should do the job. +Does the untested + +diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c +index 64283358f91d..e03b2ca6d8f6 100644 +--- a/hw/i386/kvm/clock.c ++++ b/hw/i386/kvm/clock.c +@@ -333,8 +333,9 @@ void kvmclock_create(void) + X86CPU *cpu = X86_CPU(first_cpu); + + if (kvm_enabled() && +- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { ++ ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | ++ (1ULL << KVM_FEATURE_CLOCKSOURCE2))) +|| ++ (cpu->env.features[FEAT_HYPERV_EAX] & HV_TIME_REF_COUNT_AVAILABLE))) { + sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); + } + } + +help? + +(I don't think we need to remove all 'if (kvm_enabled())' checks from +machine types as 'kvm=off' should not be related). + +-- +Vitaly + +On Wed, Sep 16, 2020 at 02:50:56PM +0200, Vitaly Kuznetsov wrote: +[...] + +> +>> +> +> +> +> +> +> Oh, I think I see what's going on. When you add 'kvm=off' +> +> cpu->env.features[FEAT_KVM] is reset (see x86_cpu_expand_features()) so +> +> kvmclock QEMU device is not created and nobody calls KVM_SET_CLOCK on +> +> migration. +> +> +> +> In case we really want to support 'kvm=off' I think we can add Hyper-V +> +> features check here along with KVM, this should do the job. +> +> +Does the untested +> +> +diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c +> +index 64283358f91d..e03b2ca6d8f6 100644 +> +--- a/hw/i386/kvm/clock.c +> ++++ b/hw/i386/kvm/clock.c +> +@@ -333,8 +333,9 @@ void kvmclock_create(void) +> +X86CPU *cpu = X86_CPU(first_cpu); +> +> +if (kvm_enabled() && +> +- cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +> +- (1ULL << KVM_FEATURE_CLOCKSOURCE2))) { +> ++ ((cpu->env.features[FEAT_KVM] & ((1ULL << KVM_FEATURE_CLOCKSOURCE) | +> ++ (1ULL << +> +KVM_FEATURE_CLOCKSOURCE2))) || +> ++ (cpu->env.features[FEAT_HYPERV_EAX] & +> +HV_TIME_REF_COUNT_AVAILABLE))) { +> +sysbus_create_simple(TYPE_KVM_CLOCK, -1, NULL); +> +} +> +} +> +> +help? +It appears to work :) + +> +> +(I don't think we need to remove all 'if (kvm_enabled())' checks from +> +machine types as 'kvm=off' should not be related). +Indeed (I didn't look at the macro, it was just quick & dirty). + +> +> +-- +> +Vitaly +> +> +-- +Antoine 'xdbob' Damhet +signature.asc +Description: +PGP signature + +On 16/09/20 13:29, Dr. David Alan Gilbert wrote: +> +> I have tracked the bug to the fact that `kvmclock` is not exposed and +> +> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +> +> +> I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +> present and add Hyper-V support for the `kvmclock_current_nsec` +> +> function. +Yes, this seems correct. I would have to check but it may even be +better to _always_ send kvmclock data in the live migration stream. + +Paolo + +Paolo Bonzini <pbonzini@redhat.com> writes: + +> +On 16/09/20 13:29, Dr. David Alan Gilbert wrote: +> +>> I have tracked the bug to the fact that `kvmclock` is not exposed and +> +>> disabled from qemu PoV but is in fact used by `hv-time` (in KVM). +> +>> +> +>> I think we should enable the `kvmclock` (qemu device) if `hv-time` is +> +>> present and add Hyper-V support for the `kvmclock_current_nsec` +> +>> function. +> +> +Yes, this seems correct. I would have to check but it may even be +> +better to _always_ send kvmclock data in the live migration stream. +> +The question I have is: with 'kvm=off', do we actually restore TSC +reading on migration? (and I guess the answer is 'no' or Hyper-V TSC +page would 'just work' I guess). So yea, maybe dropping the +'cpu->env.features[FEAT_KVM]' check is the right fix. + +-- +Vitaly + |