summary refs log tree commit diff stats
path: root/results/classifier/105/semantic/1368815
diff options
context:
space:
mode:
Diffstat (limited to 'results/classifier/105/semantic/1368815')
-rw-r--r--results/classifier/105/semantic/1368815288
1 files changed, 288 insertions, 0 deletions
diff --git a/results/classifier/105/semantic/1368815 b/results/classifier/105/semantic/1368815
new file mode 100644
index 00000000..9552ecb7
--- /dev/null
+++ b/results/classifier/105/semantic/1368815
@@ -0,0 +1,288 @@
+semantic: 0.710
+graphic: 0.688
+assembly: 0.639
+device: 0.637
+mistranslation: 0.635
+other: 0.620
+instruction: 0.601
+vnc: 0.444
+network: 0.336
+boot: 0.309
+socket: 0.275
+KVM: 0.245
+
+qemu-img convert intermittently corrupts output images
+
+-- Found in releases qemu-2.0.0, qemu-2.0.2, qemu-2.1.0. Tested on Ubuntu 14.04 using Ext4 filesystems.
+
+The command
+
+  qemu-img convert -O raw inputimage.qcow2 outputimage.raw
+
+intermittently creates corrupted output images, when the input image is not yet fully synchronized to disk. While the issue has actually been discovered in operation of of OpenStack nova, it can be reproduced "easily" on command line using
+
+  cat $SRC_PATH > $TMP_PATH && $QEMU_IMG_PATH convert -O raw $TMP_PATH $DST_PATH && cksum $DST_PATH
+
+on filesystems exposing this behavior. (The difficult part of this exercise is to prepare a filesystem to reliably trigger this race. On my test machine some filesystems are affected while other aren't, and unfortunately I haven't found the relevant difference between them, yet. Possible it's timing issues completely out of userspace control ...)
+
+The root cause, however, is the same as in
+
+  http://lists.gnu.org/archive/html/coreutils/2011-04/msg00069.html
+
+and it can be solved the same way as suggested in
+
+  http://lists.gnu.org/archive/html/coreutils/2011-04/msg00102.html
+
+In qemu, file block/raw-posix.c use the FIEMAP_FLAG_SYNC, i.e change 
+
+    f.fm.fm_flags = 0;
+
+to
+
+    f.fm.fm_flags = FIEMAP_FLAG_SYNC;
+
+As discussed in the thread mentioned above, retrieving a page cache coherent map of file extents is possible only after fsync on that file.
+
+See also
+
+  https://bugs.launchpad.net/nova/+bug/1350766
+
+In that bug report filed against nova, fsync had been suggested to be performed by the framework invoking qemu-img. However, as the choice of fiemap -- implying this otherwise unneeded fsync of a temporary file  -- is not made by the caller but by qemu-img, I agree with the nova bug reviewer's objection to put it into nova. The fsync should instead be triggered by qemu-img utilizing the FIEMAP_FLAG_SYNC, specifically intended for that purpose.
+
+Is there a minimum version of qemu that would be required to use the FIEMAP_FLAG_SYNC flag?
+
+The affected code was introduced with version 1.2.0. However, due to https://bugs.launchpad.net/qemu/+bug/1193628 I can't build these old releases to verify whether they actually expose the same behaviour.
+
+It seems the dust settles a bit: Found the relevant difference between my various filesystems, and how to reproduce the failure: Susceptible filesystems don't have the extent feature of ext4 enabled.
+
+You can create such a filesystem using
+
+  mke2fs -t ext4 -O ^extent /dev/...
+  mount /mnt /dev/...
+ 
+Adapting the command line example provided above you can see
+
+  rm -f /mnt/tmp.qcow2
+  cat $SRC_PATH > /mnt/tmp.qcow2 && qemu-img convert -O raw  /mnt/tmp.qcow /mnt/tmp.qcow
+  cksum  /mnt/tmp.qcow
+
+creating corrupt (usually nullified) result images. By inserting a sleep of at least 33 seconds between the cat command and the qemu-img invocation I'm getting proper output.
+
+To me it's unclear now, where the actual defect is located. Creating ext4 filesystems with certain features disabled (such as the exetent tree) is apparently supported and ok. Is the fiemap ioctl supposed to handle this gracefully, for example by assuming FIEMAP_FLAG_SYNC in absence of an extent tree? Or are clients such as qemu-img supposed to always FIEMAP_FLAG_SYNC to be safe?
+
+I see seek hole is supported in the latest qemu-img so I would reorder so that's tried first like:
+
+    if lseek(SEEK_HOLE) == ENOTSUP
+        use_that
+        if fiemap(FIEMAP_FLAG_SYNC)
+            use_that
+
+The fallback cascade Pádraig mentions is already implemented in qemu-2.1.0, in function raw_co_get_block_status. Just swap
+
+  ret = try_fiemap( ... )
+
+and
+
+  ret = try_seek_hole( ... )
+
+to reverse the order. I can confirm that it works just fine on 3.13 kernel (all version since 3.1, according to lseek(2)), while older versions will fall back to fiemap, which needs to be protected with FIEMAP_FLAG_SYNC in try_fiemap, to be safe.
+
+This should work under all conditions, and avoid redundant syncs where possible, right?
+
+  
+
+Marking as High since duplicate bug 1350766 was marked High.
+
+openstack review at:
+  https://review.openstack.org/#/c/123957/
+
+Qemu patches at:
+  http://patchwork.ozlabs.org/patch/393494/ ; and
+  http://patchwork.ozlabs.org/patch/393495/
+
+FWIW the following 2 commits in qemu master resolve the issue for qemu-img.
+
+  http://git.qemu.org/?p=qemu.git;a=commit;h=38c4d0aea3e1264c86e282d99560330adf2b6e25
+  http://git.qemu.org/?p=qemu.git;a=commit;h=7c15903789953ead14a417882657d52dc0c19a24
+
+If possible they should be back ported to trusty and utopic.
+
+You'll also need something like:
+
+  http://git.qemu.org/?p=qemu.git;a=commit;h=4f11aa8a40351b28c0e67c7276e0003b38cc46ac
+
+before my 2 patches.
+
+Thanks for the information.  Looks like we can apply these in debian too.
+
+Status changed to 'Confirmed' because the bug affects multiple users.
+
+This bug was fixed in the package qemu - 2.1+dfsg-4ubuntu7
+
+---------------
+qemu (2.1+dfsg-4ubuntu7) vivid; urgency=medium
+
+  * Apply two patches to fix intermittent qemu-img corruption
+    (LP: #1368815)
+    - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap
+    - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap
+ -- Serge Hallyn <email address hidden>   Wed, 29 Oct 2014 22:31:43 -0500
+
+Hi Serge,
+
+
+Is there any chance these fixes will go into trusty?
+
+Hi Tony,
+
+yes, I've uploaded a proposed fix for trusty-proposed earlier today.  It should be available for testing as soon as it is accepted.
+
+Awesome.
+
+Thanks!
+
+Hello Michael, or anyone else affected,
+
+Accepted qemu into utopic-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu/2.1+dfsg-4ubuntu6.2 in a few hours, and then in the -proposed repository.
+
+Please help us by testing this new package.  See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.  Your feedback will aid us getting this update out to other Ubuntu users.
+
+If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed.  In either case, details of your testing will help us make a better decision.
+
+Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in advance!
+
+Hello Michael, or anyone else affected,
+
+Accepted qemu into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/qemu/2.0.0+dfsg-2ubuntu1.8 in a few hours, and then in the -proposed repository.
+
+Please help us by testing this new package.  See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.  Your feedback will aid us getting this update out to other Ubuntu users.
+
+If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed.  In either case, details of your testing will help us make a better decision.
+
+Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in advance!
+
+Tested qemu-utils  2.0.0+dfsg-2ubuntu1.8. Successful.
+
+The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates.  Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report.  In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
+
+This bug was fixed in the package qemu - 2.0.0+dfsg-2ubuntu1.8
+
+---------------
+qemu (2.0.0+dfsg-2ubuntu1.8) trusty-proposed; urgency=medium
+
+  * debian/qemu-system-x86.qemu-kvm.upstart: create /dev/kvm in a
+    container. (LP: #1370199)
+  * Cherrypick upstream patch to fix intermittent qemu-img corruption
+    (LP: #1368815)
+    - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap
+    - (note - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap (which was
+      also needed in utopic) appears to be unneeded here as the code being
+      changed has not yet been switched to using try_fiemap)
+ -- Serge Hallyn <email address hidden>   Thu, 20 Nov 2014 11:24:51 -0600
+
+@Michael,
+
+by any chance would you be albe to test on utopic?
+
+I couldn't reproduce the bug on the old qemu myself, however Michael has verified the (same) fix on trusty, and the full qa-regression-test passed for me on utopic-proposed.  So I would request that we call this verification-done.
+
+Looking at the fixes, I also see the following commits remove the above changes, which could mean we might encounter this again:
+c4875e5 raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
+d1f06fe raw-posix: The SEEK_HOLE code is flawed, rewrite it
+
+Note there is also a related issue:
+bug 1292234
+So far testing with the proposed qemu version or upstream I still encounter issues on ext4 w/ ^extent and ext3 filesystems.
+
+Filed a separate issue for MOS https://bugs.launchpad.net/mos/+bug/1401261
+
+Hi Chris,
+Markus' rework will not reintroduce this bug as it completely removes all fiemap code.
+
+bug 129224 is a different issue, I'll comment on that bug.
+
+You say: you encounter issues with upstream with ^extent and ext3 filesystems.  Just to be clear: Are you saying that *this* bug is still a problem for you?
+
+if it's a different bug then I write it up and I'll take a look.
+
+Tony,
+
+Yea, its a different bug. I tested with the above patched package and upstream qemu from git, and I can still hit bug 129224. I was hoping this also fixed my issue, but unfortunately it seems to be a different issue that occurs when using the same types of filesystems. I have a solid reproducer on my desk so let me know which experiments / areas of code / etc I should look at.
+
+Just to clarify it's bug 1292234 in the previous comment.
+
+Chris,
+I've read through 1292234 and I'll have a play with your reproducer locally and see if I can gain any insight.
+
+I'm sorry my fix didn't help 1292234, but glad you can't hit 1368815 with upstream, I was kinda having kittens here ;P
+
+This bug was fixed in the package qemu - 2.1+dfsg-4ubuntu6.2
+
+---------------
+qemu (2.1+dfsg-4ubuntu6.2) utopic-proposed; urgency=medium
+
+  * Apply two patches to fix intermittent qemu-img corruption
+    (LP: #1368815)
+    - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap
+    - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap
+ -- Serge Hallyn <email address hidden>   Thu, 20 Nov 2014 16:33:09 -0600
+
+I'm happy to tackle to also fix cinder with a version of the nova fix (for consistency).  I propose waiting until the nova fix lands
+
+I'd elevate this to high so it matches nova and ubuntu but I don't have permissions to do so.
+
+Fix proposed to branch: master
+Review: https://review.openstack.org/141259
+
+> - 501-block-raw-posix-fix-disk-corruption-in-try-fiemap
+>   - (note - 502-block-raw-posic-use-seek-hole-ahead-of-fiemap (which was
+>     also needed in utopic) appears to be unneeded here as the code being
+>      changed has not yet been switched to using try_fiemap)
+
+Actually such a enforces fsync and drastically reduces the performance of conversion.
+I propose to use seek_hole instead of FIEMAP (which is basically what 
+ 502-block-raw-posic-use-seek-hole-ahead-of-fiemap does). 
+
+
+The second part of the fix (which does not reduce the performance) for qemu 2.0 (apparently uploading two patches at once is not so easy)
+
+Patchg 0500-block-raw-posix-Try-both-FIEMAP-and-SEEK_HOLE.patch appears to be part of a bigger re-write of the related code.   and is ON TOP of the patches already applied in this bug.
+
+
+No doubt the rewirtten code is "better" but backporting it contains more risk than the 2 simple fixes I already nominated.
+
+> Patch 0500-block-raw-posix-Try-both-FIEMAP-and-SEEK_HOLE.patch appears to be part of a bigger re-write
+> of the related code. and is ON TOP of the patches already applied in this bug.
+
+Yep, sorry for not mentioning this. As far as I understand qemu-2.1 package contains this partially rewritten
+code too (without any recent changes like disabling FIEMAP completely and rewriting the code using SEEK_HOLE).
+
+> No doubt the rewirtten code is "better" but backporting it contains more risk than the 2 simple fixes I already nominated.
+
+Can we completely disable the FIEMAP code and pretend that all blocks are allocated? I'm afraid fsync'ing 100+ GB
+files might be even slower than ignoring the sparseness.
+
+Change abandoned by John Griffith (<email address hidden>) on branch: master
+Review: https://review.openstack.org/141259
+
+Fix proposed to branch: master
+Review: https://review.openstack.org/143575
+
+Change abandoned by Mike Perez (<email address hidden>) on branch: master
+Review: https://review.openstack.org/143575
+Reason: 1 month, no update.
+
+Change abandoned by Tony Breeds (<email address hidden>) on branch: master
+Review: https://review.openstack.org/123957
+Reason: The main distros we care about have landed or are in progress.
+
+Marking as Wont-Fix.
+
+Change abandoned by Mike Perez (<email address hidden>) on branch: master
+Review: https://review.openstack.org/143575
+Reason: No activity for over a month.
+
+Closing based on the assumption that a working qemu-img is available now.
+
+According to comment #8 the fixes have been included in the upstream QEMU repository, so setting the status to "Fix released" now.
+