summary refs log tree commit diff stats
path: root/gitlab/issues/target_missing/host_missing/accel_missing/814.toml
diff options
context:
space:
mode:
Diffstat (limited to 'gitlab/issues/target_missing/host_missing/accel_missing/814.toml')
-rw-r--r--gitlab/issues/target_missing/host_missing/accel_missing/814.toml45
1 files changed, 45 insertions, 0 deletions
diff --git a/gitlab/issues/target_missing/host_missing/accel_missing/814.toml b/gitlab/issues/target_missing/host_missing/accel_missing/814.toml
new file mode 100644
index 00000000..f26534c6
--- /dev/null
+++ b/gitlab/issues/target_missing/host_missing/accel_missing/814.toml
@@ -0,0 +1,45 @@
+id = 814
+title = "On Windows, qcow2 is corrupted on expansion"
+state = "opened"
+created_at = "2022-01-12T18:34:54.797Z"
+closed_at = "n/a"
+labels = ["Storage", "hostos: Windows"]
+url = "https://gitlab.com/qemu-project/qemu/-/issues/814"
+host-os = "Windows 10 21H2"
+host-arch = "x86_64"
+qemu-version = "`"
+guest-os = "Fedora Rawhide (Fedora-Workstation-Live-x86_64-Rawhide-20220111.n.1.iso)"
+guest-arch = "x86_64"
+description = """On Windows, the qcow2 loses blocks on account of which the filesystem withing is corrupted as data is copied to it, just the same way as in #727 VHDX is corrupted on expansion on both Linux/Windows.  
+
+After filing a bug for WNBD https://github.com/cloudbase/wnbd/issues/63 , I was suggested to try raw and qcow2. In the process I found that qcow2 is also affected. But it is also true that the kernel-5.15.4 ... 5.15.13 series have also been buggy https://bugzilla.kernel.org/show_bug.cgi?id=215460 . 
+On Linux, qcow2 never showed any signs of corruption.
+On Windows, however, qcow2 does corrupt.
+
+It is possible that, as Linux is so much more efficient at files and disk-IO, the kernel-block-code, qemu-block-code and qemu-qcow2-code do not hit the bug, and so the corruption does not show up as easily in Linux. Windows, being a little slower at this, might be causing the bug to show up in this qcow2 test. Possibly, the issue more likely to show up on slower machines. I am using an 2013-era intel-4rth gen i7-4700mq Haswell machine. 
+
+It is possible that, the resolution for this issue and that for #727 could be the same or very closely related. The bug may not be in qcow2.c or vhdx.c but maybe in the qemu/block subsystem. If the data-block that arrives from the VM-interface/nbd-interface which has to be written to file, but never gets to the virtual-disk code, not allocated and written to, then the data-block is lost."""
+reproduce = """1. Prepare virtual-disk1 as empty qcow2. In my-setup, the qcow2 file resides on an 150 GiB ExFAT partition on 512 GiB SSD. I use ExFAT as the ExFAT-filesystem does not have a concept of sparse files, eliminating that factor from troubleshooting.
+   ```qemu-img.exe create -f qcow2 H:\\gkpics01.qcow2  99723771904```
+2. Prepare virtual-disk2 VHDX with synthetic generated data (sgdata). Scriptlets to recreate sgdata are described in https://gitlab.com/qemu-project/qemu/-/issues/727#note_739930694 . In my-setup, the vhdx file resides on an 1 TiB NTFS partition on a 2 TiB HDD.
+3. Start qemu with arguments as given above. 
+4. Inside VM, boot and bringup livecd desktop, close the installer and open a terminal
+5. Use gdisk to put an ext4 partition on /dev/sda
+6. Put ext4 partition on sda1 ```mkfs.ext4 -L fs_gkpics01 /dev/sda1```
+7. Create mount directories ```mkdir /mnt/a /mnt/b```
+8. Mount the empty partition from virtual-disk-1 ```mount -t ext4 /dev/sda1 /mnt/a```
+9. Mount the sgdata partition from virtual-disk-2 ```mount.ntfs-3g /dev/sdb2 /mnt/b```  or ```mount -t ntfs3 /dev/sdb2 /mnt/b```
+10. Keep a terminal tab open with ```dmesg -w``` running
+11. Rsync sgdata ```( sdate=`date` ; cd /mnt/b ; rsync -avH ./photos001 /mnt/a | tee /tmp/rst.txt ; echo $sdate ; date )```
+12. Check sha256sum ```( sdate=`date` ; cd /mnt/a/photos001 ; shas256sum -c ./find.CHECKSUM --quiet ; echo $sdate ; date )```  
+    corruption will show even without needing to unmount-remount or reboot-remount. 
+
+- About 1.4 GiB free-space left on the ext4 partition. 
+- Compared to #727, The number of files corrupted are less ``` sha256sum: WARNING: 31 computed checksums did not match ```
+- After, VM guest OS warm reboot, a recheck of the sha256sum shows the same 31 files as corrupted
+- After, qemu poweroff, restart qemu, VM guest OS cold boot, a recheck of the sha256sum shows the same 31 files as corrupted
+- df shows: sda1 has 95271336 1k-blocks, of which 88840860 are used, 1544820 available, 99% used. The numbers don't add up. Either file-blocks are lost in lost-clusters or the ext4-filesystem has a large journal or the file-system-metadata is too large, or the ext4-filesystem has large cluster-size which results in inefficient space usage.
+- An ```unmount /dev/sda1 ; fsck -y /dev/sda1 ; mount -t ext4 /dev/sda1 /mnt/a``` did not find any lost clusters.
+
+The reason I don't think this is a kernel bug, is because the raw-file as virtual-disk-1 doesn't show this issue. Also, it happens regardless of whether sgdata is on ntfs-3g or ntfs3-paragon."""
+additional = """"""