Diffstat (limited to 'results/classifier/zero-shot/108/other/1013714')
| -rw-r--r-- | results/classifier/zero-shot/108/other/1013714 | 68 |
1 file changed, 68 insertions, 0 deletions
diff --git a/results/classifier/zero-shot/108/other/1013714 b/results/classifier/zero-shot/108/other/1013714
new file mode 100644
index 000000000..cf78b5558
--- /dev/null
+++ b/results/classifier/zero-shot/108/other/1013714
@@ -0,0 +1,68 @@
+semantic: 0.859
+permissions: 0.835
+other: 0.828
+debug: 0.826
+graphic: 0.817
+device: 0.801
+vnc: 0.793
+KVM: 0.790
+performance: 0.762
+PID: 0.759
+network: 0.748
+boot: 0.741
+files: 0.723
+socket: 0.538
+
+Data corruption after block migration (LV->LV)
+
+We quite frequently use the live block migration feature to move VMs between nodes without downtime. These migrations sometimes result in data corruption on the receiving end. It only happens if the VM is actually doing I/O (it doesn't have to be much to trigger the issue).
+
+We use logical volumes, and each VM has two disks. We use cache=none for all VM disks.
+
+All guests use virtio (a mix of various Linux distros and Windows 2008R2).
+
+We currently have two stacks in use and have seen the issue on both of them:
+
+Fedora - qemu-kvm 0.13
+Scientific Linux 6.2 (RHEL derived) - qemu-kvm package 0.12.1.2
+
+Even though we don't run the most recent versions of KVM, I strongly suspect this issue is still unreported and that filing a bug is therefore appropriate. (There doesn't seem to be any similar bug report in Launchpad or Red Hat's Bugzilla, and nothing related in changelogs, release notes, or git commit logs.)
+
+I have no idea where to look or where to start debugging this issue, but if there is any way I can provide useful debug information, please let me know.
+
+Hi, I suggest that you try a newer version. There were several fixes that I think went only into 0.14, in particular commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e, commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e, commit 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e. RHEL 6.2 doesn't have them. With the fixes, it's much less likely that live block migration will eat your data.
+
+However, we were also thinking of deprecating block migration, so we are interested in hearing about your setup. The replacement would be more powerful (it would allow migrating storage separately from the VM), more efficient (the storage and RAM streams would run in parallel on different TCP ports), and easier for us to test and maintain.
+
+However, it would be more complicated to set the new mechanism up for migration without shared storage. This is what live block migration does, and it sounds like your use case requires migration without shared storage. A true replacement for live block migration would likely not be ready in time for the next release (1.2), so its removal would also be delayed.
+
+Hello Paolo,
+
+Thank you for your quick response!
+
+Did you intend to mention 3 different commits or did you accidentally paste the same commit thrice? ;) I came across that commit but somehow thought it was already included in 0.13. Thanks!
+
+We're of course in no position to ask, but I'll do it anyway: would you be in a position to add patches for these commits to the qemu-kvm package for RHEL6 (assuming they apply at all)? Or perhaps ask one of the RH package maintainers to do so? We'd be very grateful!
+
+A little bit of background (our use case for live block migration): we are an ISP and provide virtual private servers on KVM.
+
+The way we see it, traditional centralized shared storage introduces one big, expensive and complicated SPOF into a VM platform.
+
+We actually have no problems dealing with the limitations of local storage. For example, we have automated (offline) VM migrations to other hosts when customers need to upgrade and the current host doesn't have enough resources. It would be great if live block migration were stable enough to do this online and reduce downtime for customers.
+
+We sometimes use live block migration to reduce server load by migrating off a busy VM. It doesn't really matter if the migration takes a while to complete. We also use it to migrate all VMs off a host when the hardware is being retired or we need to reinstall.
+
+Live block migration is just not very useful for generic system maintenance, like a reboot for a kernel or firmware update. In that case we simply reboot the host (and most customers don't mind that once in a while).
+
+We would appreciate it if live block migration were not removed until its superior replacement is ready. We don't mind if it's more complex to work with, as long as it's well documented ;)
+
+Hello Paolo,
+
+We backported most of the block migration fixes from upstream to the RHEL6 qemu-kvm package ourselves and are now unable to reproduce the disk corruption problem. So I guess the issues are (mostly) fixed upstream.
+
+You can close this bug, but I would still appreciate it if you could fix this in the RHEL6 package (other RH customers might appreciate that as well ;). We could even provide the patches if you like.
+
+
+
+Closing according to comment #3
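For readers less familiar with the setup the reporter describes (raw logical volumes attached via virtio with cache=none), a minimal sketch of such a guest invocation follows. The memory size, volume group and LV names (vg0, vm42-root, vm42-data) are invented for illustration; the bug only states that each VM has two LV-backed disks, uses virtio, and runs with cache=none.

    # Hypothetical example only: disk paths and sizing are made up.
    qemu-kvm -m 2048 -smp 2 \
        -drive file=/dev/vg0/vm42-root,if=virtio,cache=none,format=raw \
        -drive file=/dev/vg0/vm42-data,if=virtio,cache=none,format=raw \
        -vnc :1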
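The live block migration under discussion is the monitor-driven mechanism of the qemu-kvm 0.12/0.13 era, in which the storage is copied over the same channel as the RAM stream. As a hedged sketch of how it is typically driven (the reporters' exact procedure is not given in the bug; host name and port below are placeholders): the destination is started with -incoming, and the source is told to migrate with the -b flag for a full block copy.

    # Destination host: start a VM with the same configuration, waiting for the stream.
    qemu-kvm <same options as on the source> -incoming tcp:0:4444

    # Source host, in the QEMU monitor:
    #   -d returns the monitor prompt immediately, -b requests full block migration
    (qemu) migrate -d -b tcp:dest-host:4444
    (qemu) info migrate    # poll until the migration has completed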
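The closing comments mention backporting the upstream block migration fixes to the RHEL6 qemu-kvm package. The bug names only one upstream commit and does not list the actual patch set, so the following is only a sketch of how such a backport is commonly prepared from an upstream checkout.

    # In a checkout of upstream qemu-kvm, export the fix named in the bug:
    git format-patch -1 62155e2b51e3c282ddc30adbb6d7b8d3bb7c386e

    # Apply it to the distribution source tree (resolving conflicts by hand),
    # then rebuild the package:
    cd qemu-kvm-0.12.1.2/
    patch -p1 < ../0001-*.patch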