diff options
Diffstat (limited to 'gitlab/issues_text/target_missing/host_missing/accel_missing/814')
| -rw-r--r-- | gitlab/issues_text/target_missing/host_missing/accel_missing/814 | 37 |
1 files changed, 0 insertions, 37 deletions
diff --git a/gitlab/issues_text/target_missing/host_missing/accel_missing/814 b/gitlab/issues_text/target_missing/host_missing/accel_missing/814 deleted file mode 100644 index 4aaa94721..000000000 --- a/gitlab/issues_text/target_missing/host_missing/accel_missing/814 +++ /dev/null @@ -1,37 +0,0 @@ -On Windows, qcow2 is corrupted on expansion -Description of problem: -On Windows, the qcow2 loses blocks on account of which the filesystem withing is corrupted as data is copied to it, just the same way as in #727 VHDX is corrupted on expansion on both Linux/Windows. - -After filing a bug for WNBD https://github.com/cloudbase/wnbd/issues/63 , I was suggested to try raw and qcow2. In the process I found that qcow2 is also affected. But it is also true that the kernel-5.15.4 ... 5.15.13 series have also been buggy https://bugzilla.kernel.org/show_bug.cgi?id=215460 . -On Linux, qcow2 never showed any signs of corruption. -On Windows, however, qcow2 does corrupt. - -It is possible that, as Linux is so much more efficient at files and disk-IO, the kernel-block-code, qemu-block-code and qemu-qcow2-code do not hit the bug, and so the corruption does not show up as easily in Linux. Windows, being a little slower at this, might be causing the bug to show up in this qcow2 test. Possibly, the issue more likely to show up on slower machines. I am using an 2013-era intel-4rth gen i7-4700mq Haswell machine. - -It is possible that, the resolution for this issue and that for #727 could be the same or very closely related. The bug may not be in qcow2.c or vhdx.c but maybe in the qemu/block subsystem. If the data-block that arrives from the VM-interface/nbd-interface which has to be written to file, but never gets to the virtual-disk code, not allocated and written to, then the data-block is lost. -Steps to reproduce: -1. Prepare virtual-disk1 as empty qcow2. In my-setup, the qcow2 file resides on an 150 GiB ExFAT partition on 512 GiB SSD. I use ExFAT as the ExFAT-filesystem does not have a concept of sparse files, eliminating that factor from troubleshooting. - ```qemu-img.exe create -f qcow2 H:\gkpics01.qcow2 99723771904``` -2. Prepare virtual-disk2 VHDX with synthetic generated data (sgdata). Scriptlets to recreate sgdata are described in https://gitlab.com/qemu-project/qemu/-/issues/727#note_739930694 . In my-setup, the vhdx file resides on an 1 TiB NTFS partition on a 2 TiB HDD. -3. Start qemu with arguments as given above. -4. Inside VM, boot and bringup livecd desktop, close the installer and open a terminal -5. Use gdisk to put an ext4 partition on /dev/sda -6. Put ext4 partition on sda1 ```mkfs.ext4 -L fs_gkpics01 /dev/sda1``` -7. Create mount directories ```mkdir /mnt/a /mnt/b``` -8. Mount the empty partition from virtual-disk-1 ```mount -t ext4 /dev/sda1 /mnt/a``` -9. Mount the sgdata partition from virtual-disk-2 ```mount.ntfs-3g /dev/sdb2 /mnt/b``` or ```mount -t ntfs3 /dev/sdb2 /mnt/b``` -10. Keep a terminal tab open with ```dmesg -w``` running -11. Rsync sgdata ```( sdate=`date` ; cd /mnt/b ; rsync -avH ./photos001 /mnt/a | tee /tmp/rst.txt ; echo $sdate ; date )``` -12. Check sha256sum ```( sdate=`date` ; cd /mnt/a/photos001 ; shas256sum -c ./find.CHECKSUM --quiet ; echo $sdate ; date )``` - corruption will show even without needing to unmount-remount or reboot-remount. - -- About 1.4 GiB free-space left on the ext4 partition. -- Compared to #727, The number of files corrupted are less ``` sha256sum: WARNING: 31 computed checksums did not match ``` -- After, VM guest OS warm reboot, a recheck of the sha256sum shows the same 31 files as corrupted -- After, qemu poweroff, restart qemu, VM guest OS cold boot, a recheck of the sha256sum shows the same 31 files as corrupted -- df shows: sda1 has 95271336 1k-blocks, of which 88840860 are used, 1544820 available, 99% used. The numbers don't add up. Either file-blocks are lost in lost-clusters or the ext4-filesystem has a large journal or the file-system-metadata is too large, or the ext4-filesystem has large cluster-size which results in inefficient space usage. -- An ```unmount /dev/sda1 ; fsck -y /dev/sda1 ; mount -t ext4 /dev/sda1 /mnt/a``` did not find any lost clusters. - -The reason I don't think this is a kernel bug, is because the raw-file as virtual-disk-1 doesn't show this issue. Also, it happens regardless of whether sgdata is on ntfs-3g or ntfs3-paragon. -Additional information: - |