From dee4dcba78baf712cab403d47d9db319ab7f95d6 Mon Sep 17 00:00:00 2001
From: Christian Krinitsin
Date: Thu, 3 Jul 2025 19:39:53 +0200
Subject: restructure results

---
 results/classifier/105/assembly/1098729 | 177 -------
 results/classifier/105/assembly/1396497 | 100 ----
 results/classifier/105/assembly/1402755 |  89 ---
 results/classifier/105/assembly/1490611 | 356 --------------
 results/classifier/105/assembly/1520    |  62 ---
 results/classifier/105/assembly/1548170 | 152 ------
 results/classifier/105/assembly/1605    |  51 --
 results/classifier/105/assembly/1612    |  64 ---
 results/classifier/105/assembly/1620    | 107 -----
 results/classifier/105/assembly/1649    |  30 --
 results/classifier/105/assembly/1662050 | 373 ---------------
 results/classifier/105/assembly/1724570 |  95 ----
 results/classifier/105/assembly/1772262 |  67 ---
 results/classifier/105/assembly/1787002 |  37 --
 results/classifier/105/assembly/1806114 |  93 ----
 results/classifier/105/assembly/1847793 | 265 -----------
 results/classifier/105/assembly/1850000 | 156 -------
 results/classifier/105/assembly/1852196 |  95 ----
 results/classifier/105/assembly/1862167 |  32 --
 results/classifier/105/assembly/1877136 |  77 ---
 results/classifier/105/assembly/1882671 | 797 --------------------------------
 results/classifier/105/assembly/1883784 |  61 ---
 results/classifier/105/assembly/2013    |  91 ----
 results/classifier/105/assembly/2180    |  49 --
 results/classifier/105/assembly/2186    |  47 --
 results/classifier/105/assembly/2303    |  84 ----
 results/classifier/105/assembly/2463    |  22 -
 results/classifier/105/assembly/2677    |  14 -
 results/classifier/105/assembly/2871    |  14 -
 results/classifier/105/assembly/494     |  14 -
 results/classifier/105/assembly/536     |  14 -
 results/classifier/105/assembly/710     |  14 -
 results/classifier/105/assembly/811683  | 319 -------------
 results/classifier/105/assembly/884401  |  83 ---
 results/classifier/105/assembly/904     |  29 --
 results/classifier/105/assembly/968     | 108 -----
 36 files changed, 4238 deletions(-)
 delete mode 100644 results/classifier/105/assembly/1098729
 delete mode 100644 results/classifier/105/assembly/1396497
 delete mode 100644 results/classifier/105/assembly/1402755
 delete mode 100644 results/classifier/105/assembly/1490611
 delete mode 100644 results/classifier/105/assembly/1520
 delete mode 100644 results/classifier/105/assembly/1548170
 delete mode 100644 results/classifier/105/assembly/1605
 delete mode 100644 results/classifier/105/assembly/1612
 delete mode 100644 results/classifier/105/assembly/1620
 delete mode 100644 results/classifier/105/assembly/1649
 delete mode 100644 results/classifier/105/assembly/1662050
 delete mode 100644 results/classifier/105/assembly/1724570
 delete mode 100644 results/classifier/105/assembly/1772262
 delete mode 100644 results/classifier/105/assembly/1787002
 delete mode 100644 results/classifier/105/assembly/1806114
 delete mode 100644 results/classifier/105/assembly/1847793
 delete mode 100644 results/classifier/105/assembly/1850000
 delete mode 100644 results/classifier/105/assembly/1852196
 delete mode 100644 results/classifier/105/assembly/1862167
 delete mode 100644 results/classifier/105/assembly/1877136
 delete mode 100644 results/classifier/105/assembly/1882671
 delete mode 100644 results/classifier/105/assembly/1883784
 delete mode 100644 results/classifier/105/assembly/2013
 delete mode 100644 results/classifier/105/assembly/2180
 delete mode 100644 results/classifier/105/assembly/2186
 delete mode 100644 results/classifier/105/assembly/2303
 delete mode 100644 results/classifier/105/assembly/2463
 delete mode 100644 results/classifier/105/assembly/2677
 delete mode 100644 results/classifier/105/assembly/2871
 delete mode 100644 results/classifier/105/assembly/494
 delete mode 100644 results/classifier/105/assembly/536
 delete mode 100644 results/classifier/105/assembly/710
 delete mode 100644 results/classifier/105/assembly/811683
 delete mode 100644 results/classifier/105/assembly/884401
 delete mode 100644
results/classifier/105/assembly/904 delete mode 100644 results/classifier/105/assembly/968 (limited to 'results/classifier/105/assembly') diff --git a/results/classifier/105/assembly/1098729 b/results/classifier/105/assembly/1098729 deleted file mode 100644 index 13a8a87d..00000000 --- a/results/classifier/105/assembly/1098729 +++ /dev/null @@ -1,177 +0,0 @@ -assembly: 0.688 -device: 0.653 -boot: 0.646 -vnc: 0.643 -semantic: 0.636 -instruction: 0.636 -graphic: 0.631 -other: 0.591 -KVM: 0.563 -socket: 0.558 -network: 0.446 -mistranslation: 0.302 - -qemu-user-static for armhf: segfault in threaded code - - -Currently running QEMU from git (fedf2de31023) and running the armhf version of qemu-user-static which I have renamed qemu-armhf-static to follow the naming convention used in Debian. - -The host systems is a Debian testing x86_64-linux and I have an Debian testing armhf chroot which I invoke using schroot. - -Majority of program in the armhf chroot run fine, but I'm getting qemu segfaults in multi-threaded programs. - -As an example, I've grabbed the threads demo program here: - - https://computing.llnl.gov/tutorials/pthreads/samples/dotprod_mutex.c - -and changed NUMTHRDS from 4 to 10. I compile it as (same compile command on both x86_64 host and armhf guest): - - gcc -Wall -lpthread dotprod_mutex.c -o dotprod_mutex - -When compiled for x86_64 host it runs perfectly and even under Valgrind displays no errors whatsoever. - -However, when I compile the program in my armhs chroot and run it it usually (but not always) segaults or hangs or crashes. Example output: - - - (armhf) $ ./dotprod_mutex - Thread 1 did 100000 to 200000: mysum=100000.000000 global sum=100000.000000 - Thread 0 did 0 to 100000: mysum=100000.000000 global sum=200000.000000 - TCG temporary leak before f6731ca0 - qemu-arm-static: /home/erikd/Git/qemu-posix-timer-hacking/Upstream/tcg/tcg-op.h:2371: - tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) == 0' failed. 
- - - (armhf) $ ./dotprod_mutex - qemu: uncaught target signal 11 (Segmentation fault) - core dumped - Segmentation fault - - (armhf) $ ./dotprod_mutex - qemu-arm-static: /home/erikd/Git/qemu-posix-timer-hacking/Upstream/tcg/tcg.c:519: - tcg_temp_free_internal: Assertion `idx >= s->nb_globals && idx < s->nb_temps' failed. - - - (armhf) $ ./dotprod_mutex - Thread 1 did 100000 to 200000: mysum=100000.000000 global sum=100000.000000 - qemu: uncaught target signal 11 (Segmentation fault) - core dumped - Segmentation fault - -I can also comple a purely static version of the test program in the armhf chroot using: - - gcc -Wall -static -pthread dotprod_mutex.c -o dotprod-mutex-static - -and then run it simply using: - - qemu-arm-static dotprod-mutex-static - -which fails just like it does in the chroot. - -Begining to think this is memory corruption because of the number of different failure modes. In addition to the crashes in the initial report I have also seen the following: - - - qemu: uncaught target signal 4 (Illegal instruction) - core dumped - - More temporaries freed than allocated! - TCG temporary leak before 0001d1dc - - qemu-arm-static: /home/erikd/Git/qemu-pthread-hacking/tcg/tcg.c:1888: tcg_reg_alloc_op: - Assertion `ts->val_type == 1' failed. - - /home/erikd/Git/qemu-pthread-hacking/tcg/tcg.c:149: tcg fatal error - - -What's the best way to debug the qemu user space emulation? I read this: - - http://wiki.qemu.org/Documentation/Debugging - -but that seems to mainly refer to the qemu machine emulation. - -I added -ggdb to QEMU_CFLAGS in config-host.mak so it builds with debug symbols but gdb still doesn't provide any useful information beyond the following: - - Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 
- [New Thread 0x7ffefdb6b700 (LWP 11210)] - [New Thread 0x7ffefdaf5700 (LWP 11211)] - [New Thread 0x7ffefda7f700 (LWP 11212)] - [New Thread 0x7ffefda09700 (LWP 11213)] - [New Thread 0x7ffefd993700 (LWP 11214)] - - Program received signal SIGSEGV, Segmentation fault. - [Switching to Thread 0x7ffefdaf5700 (LWP 11211)] - 0x0000000060363b58 in static_code_gen_buffer () - (gdb) bt - #0 0x0000000060363b58 in static_code_gen_buffer () - #1 0x00000000f50ba518 in ?? () - #2 0x00000000624a9360 in ?? () - #3 0x00007ffefdaf4b80 in ?? () - #4 0x326cebdf4a8e4700 in ?? () - #5 0x00007ffe00000000 in ?? () - #6 0x0000000000000000 in ?? () - -and valgrind doesn't help either. - -Yes, multithreaded guests are liable to crash; where they work they generally work more by luck than design. There is some discussion in LP:668799 of one of the known problems (whose symptoms are usually crashes or hangs). - - -At the top of function cpu_unlink_tb() in translate-all.c: - - /* FIXME: TB unchaining isn't SMP safe. For now just ignore the - problem and hope the cpu will stop of its own accord. For userspace - emulation this often isn't actually as bad as it sounds. Often - signals are used primarily to interrupt blocking syscalls. */ - - -The class of bugs exemplified by the symptoms described here are those where the multithreaded guest program causes QEMU to misbehave because we are sharing the code-translation globals (eg the generated code buffer) between multiple threads and they trod on each others' toes. - -(The race described in the comment in cpu_unlink_tb() has been fixed under LP:668799.) - -I also experimented the bug. -It may SIGSEGV or hang. Or it may work, very rarely. 
- -But I cannot reproduce it at all if change my app to stay on a single CPU: - -int -main(int argc, char * argv[] ) -{ - -#ifdef QEMU - cpu_set_t cpuSet; - CPU_ZERO(&cpuSet); - CPU_SET(0,&cpuSet); - if (sched_setaffinity(getpid(), sizeof(cpu_set_t), &cpuSet) !=0) - cerr << "sched_setaffinity failed" << endl; -#endif /* QEMU */ - -./build/buildd/qemu-linaro-1.5.0-2013.06/tcg/tcg.c:149: tcg fatal error -/build/buildd/qemu-linaro-1.5.0-2013.06/tcg/tcg.c:149: tcg fatal error - -same for me - -Same problem for me when executing msgmerge in qemu-arm-static. - -https://launchpadlibrarian.net/181070813/buildlog_ubuntu-utopic-armhf.hedgewars_0.9.21~alpha~7716~ubuntu14.10.1_FAILEDTOBUILD.txt.gz - -this started happening after the new deploy of trusty in buildds - -Also happening when manually built from the 2.1.2 release codebase. In my case it impacts the llvm-3.5.0 "make check" testsuite running an an armhf-emulated chroot--it immediately gets SIGSEGV and SIGILL as soon as it starts running tests. - -I cannot make dotprod_mutex.c to crash with the current master (git 8ffe756d). I've tried both linux-arm and linux-arm-static, the latter running under chroot. - -I've tried on three different machines, and have tested with different thread counts: 4, 10, 16, 64 (one of the machines has 64 cores). -I completed 1000 successive runs on each. - -Can you please retest on the current master? I certainly could trigger the bug on the qemu-arm-static that is packaged with Ubuntu 14.04, so it is possible that since then changes in qemu have at least made it harder to trigger the bug. - - - -I can confirm that building QEMU 2.5.0 from source, all the multi-thread issues seem to be fixed. - -Specifically, the mentioned dotprod_mutex.c example, even when modified to use 100 threads, is always running in the qemu-arm User mode emulator. - -Tested in Ubuntu 14.04 x86_64, with all the updates installed. 
- -Note that instead the QEMU 2.0.0 from the Ubuntu 14.04 repository is having issues even when using workarounds like running it with "taskset 0x1" to force the execution to a single CPU. - - - -We think we've fixed the multithreading issues in QEMU linux-user (in particular the test case that started this bug report works). If there are still problems with a QEMU version later than 2.10, please open fresh bug reports for specific guest programs that fail, giving detailed how-to-reproduce instructions. - - diff --git a/results/classifier/105/assembly/1396497 b/results/classifier/105/assembly/1396497 deleted file mode 100644 index 59c68c8f..00000000 --- a/results/classifier/105/assembly/1396497 +++ /dev/null @@ -1,100 +0,0 @@ -assembly: 0.467 -device: 0.444 -other: 0.422 -instruction: 0.377 -graphic: 0.272 -socket: 0.243 -boot: 0.232 -mistranslation: 0.218 -semantic: 0.196 -vnc: 0.196 -KVM: 0.181 -network: 0.116 - -'qemu-img snapshot' allows new snapshot to be created with the name of an existing snapshot - -qemu-img _may_ be working as designed, but it feels like this could be a bug. I'd certainly prefer to only allow unique snapshot names (unless maybe something like a "--force-non-unique-snapshot-names" was also specified). - -If this really is correct behaviour, it should be documented as qemu-img(1) currently specifies no details whatsoever regarding expected behaviour or valid snapshot names. 
- -$ qemu-img snapshot -l image.cow -$ qemu-img snapshot -c foo image.cow -$ qemu-img snapshot -l image.cow -Snapshot list: -ID TAG VM SIZE DATE VM CLOCK -1 foo 0 2014-11-26 08:30:53 00:00:00.000 -$ qemu-img snapshot -c foo image.cow -$ qemu-img snapshot -l image.cow -Snapshot list: -ID TAG VM SIZE DATE VM CLOCK -1 foo 0 2014-11-26 08:30:53 00:00:00.000 -2 foo 0 2014-11-26 08:30:58 00:00:00.000 -$ qemu-img snapshot -c foo image.cow -$ qemu-img snapshot -l image.cow -Snapshot list: -ID TAG VM SIZE DATE VM CLOCK -1 foo 0 2014-11-26 08:30:53 00:00:00.000 -2 foo 0 2014-11-26 08:30:58 00:00:00.000 -3 foo 0 2014-11-26 08:31:00 00:00:00.000 -$ qemu-img snapshot -d foo image.cow -$ qemu-img snapshot -l image.cow -Snapshot list: -ID TAG VM SIZE DATE VM CLOCK -2 foo 0 2014-11-26 08:30:58 00:00:00.000 -3 foo 0 2014-11-26 08:31:00 00:00:00.000 -$ qemu-img snapshot -d foo image.cow -$ qemu-img snapshot -l image.cow -Snapshot list: -ID TAG VM SIZE DATE VM CLOCK -3 foo 0 2014-11-26 08:31:00 00:00:00.000 -$ qemu-img snapshot -d foo image.cow -$ qemu-img snapshot -l image.cow -$ - -Note also how snapshot deletion works in reverse order - the oldest snapshot with a given name is deleted first. 
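The transcript above can be summarised with a small model. This is only a sketch of the reported behaviour, not QEMU code: the `Image` and `Snapshot` classes and the `unique` flag are hypothetical names, and the model assumes that deleting by tag removes the first matching entry, as the listing suggests.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    snap_id: int
    tag: str


@dataclass
class Image:
    """Toy stand-in for an image's snapshot table (hypothetical, not QEMU code)."""
    snapshots: list = field(default_factory=list)
    next_id: int = 1

    def create(self, tag, unique=False):
        # With unique=False the tag is not checked, matching the report.
        # unique=True models the stricter behaviour the reporter asks for.
        if unique and any(s.tag == tag for s in self.snapshots):
            raise ValueError(f"snapshot tag already exists: {tag}")
        self.snapshots.append(Snapshot(self.next_id, tag))
        self.next_id += 1

    def delete(self, tag):
        # Remove the first (oldest) entry whose tag matches.
        for index, snap in enumerate(self.snapshots):
            if snap.tag == tag:
                del self.snapshots[index]
                return True
        return False

    def ids(self):
        return [s.snap_id for s in self.snapshots]


if __name__ == "__main__":
    image = Image()
    for _ in range(3):
        image.create("foo")
    print(image.ids())
    image.delete("foo")
    print(image.ids())
```

In this model, three `create("foo")` calls followed by one `delete("foo")` leave IDs 2 and 3 behind, which is the same order seen in the `qemu-img snapshot -l` output.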
- -ProblemType: Bug -DistroRelease: Ubuntu 15.04 -Package: qemu-utils 2.1+dfsg-4ubuntu9 -ProcVersionSignature: Ubuntu 3.16.0-25.33-generic 3.16.7 -Uname: Linux 3.16.0-25-generic x86_64 -ApportVersion: 2.14.7-0ubuntu10 -Architecture: amd64 -CurrentDesktop: Unity -Date: Wed Nov 26 08:28:16 2014 -InstallationDate: Installed on 2014-04-11 (228 days ago) -InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Daily amd64 (20140409) -KvmCmdLine: - COMMAND STAT EUID RUID PID PPID %CPU COMMAND - kvm-irqfd-clean S< 0 0 719 2 0.0 [kvm-irqfd-clean] -MachineType: LENOVO 20AQCTO1WW -ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.16.0-25-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7 -SourcePackage: qemu -UpgradeStatus: Upgraded to vivid on 2014-05-08 (201 days ago) -dmi.bios.date: 02/10/2014 -dmi.bios.vendor: LENOVO -dmi.bios.version: GJET71WW (2.21 ) -dmi.board.asset.tag: Not Available -dmi.board.name: 20AQCTO1WW -dmi.board.vendor: LENOVO -dmi.board.version: 0B98405 STD -dmi.chassis.asset.tag: No Asset Information -dmi.chassis.type: 10 -dmi.chassis.vendor: LENOVO -dmi.chassis.version: Not Available -dmi.modalias: dmi:bvnLENOVO:bvrGJET71WW(2.21):bd02/10/2014:svnLENOVO:pn20AQCTO1WW:pvrThinkPadT440s:rvnLENOVO:rn20AQCTO1WW:rvr0B98405STD:cvnLENOVO:ct10:cvrNotAvailable: -dmi.product.name: 20AQCTO1WW -dmi.product.version: ThinkPad T440s -dmi.sys.vendor: LENOVO - - - -I'd agree that at least the last part - removing the oldest snapshot first - seems like a bug. - -The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. -If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. 
Thank you and sorry for the inconvenience.
-
-[Expired for QEMU because there has been no activity for 60 days.]
-
-[Expired for qemu (Ubuntu) because there has been no activity for 60 days.]
-
diff --git a/results/classifier/105/assembly/1402755 b/results/classifier/105/assembly/1402755
deleted file mode 100644
index 6cd160e8..00000000
--- a/results/classifier/105/assembly/1402755
+++ /dev/null
@@ -1,89 +0,0 @@
-assembly: 0.901
-device: 0.883
-other: 0.879
-KVM: 0.874
-network: 0.873
-vnc: 0.871
-mistranslation: 0.870
-instruction: 0.867
-graphic: 0.854
-semantic: 0.846
-socket: 0.830
-boot: 0.803
-
-qemu-kvm: e1000 RX ring is filled with partial-pkt of size 0
-
-Hello,
-We are using CentOS 6.5 with qemu-kvm-0.12.1.2-2.415 as a host of or VMs.
-In the VM we use e1000 as the NIC emulation.
-We've modified the e1000 driver to our needs. This modification start the RX engine while the RX ring is empty (RDH == RDT)
-and at a later stage we fill the RX descriptors with buffers. This scheme works well on intel chips and VMware.
-What we observe in this setup is that from time to time the RX ring is filled with "partial packets" of size 0 (meaning, DD bit is set,
-No other status bits are set and packet size is also 0).
-
-Looking at the e1000 RX routine in qemu-kvm you can observe the following flow:
-1. A packet is avail for receive:
-2. The routine checks for RCTL_EN - it is enabled
-3. The routine checks that the RDH equal RDT (they are as the ring is empty) but also checks if rxov is on (it is still off) so it doesn’t
-Exit as it is supposed to.
-4. The routine now updates the descriptor status with the DD bit (and vlan which we don’t care)
-5. The routine checks if a buffer address is not NULL (it is as NULL since we haven’t filled it yet) – so is logs something.
-6. The routine now updates the guest memory with this value (DD is 1)
-7. The routine updates the check_rxov flag in order to allow ovf check the next time around.
-(but ovf will not occur since in the next iteration RDH != RDT)
-8. The routine loops over all the descriptors with the NULL buffer (which is all our ring) and writes the DD bit
-9. We get this endless partial packet problem we see.
-
-qemu-kvm-0.12.1.2-2.415.el6_5.3/qemu-kvm-0.12.1.2/hw/e1000.c
-static ssize_t
-e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
-{
-: : :
-if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
-return -1;
-
-: : :
-do {
-if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
-set_ics(s, 0, E1000_ICS_RXO);
-return -1;
-}
-base = ((uint64_t)s->mac_reg[RDBAH] << 32) + s->mac_reg[RDBAL] +
-sizeof(desc) * s->mac_reg[RDH];
-cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
-desc.special = vlan_special;
-desc.status |= (vlan_status | E1000_RXD_STAT_DD);
-if (desc.buffer_addr) {
-cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-(void *)(buf + vlan_offset), size);
-desc.length = cpu_to_le16(size);
-desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
-} else // as per intel docs; skip descriptors with null buf addr
-DBGOUT(RX, "Null RX descriptor!!\n");
-cpu_physical_memory_write(base, (void *)&desc, sizeof(desc));
-
-: : :
-if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
-s->mac_reg[RDH] = 0;
-s->check_rxov = 1;
-: : :
-} while (desc.buffer_addr == 0);
-}
-
-
-A workaround is to enable the RX machine only after the descriptor ring is filled for the first time.
-
-Moti
-
-On Mon, Dec 15, 2014 at 04:59:55PM -0000, Moti wrote:
-> We are using CentOS 6.5 with qemu-kvm-0.12.1.2-2.415 as a host of or VMs.
-
-Do you see the problem with qemu.git/master?
-
-Stefan
-
-
-Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays?
-
-[Expired for QEMU because there has been no activity for 60 days.]
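The nine-step flow described in this report can be traced with a small simulation. It is a simplified sketch of the quoted `do { ... } while (desc.buffer_addr == 0)` loop, not the QEMU implementation: `RxRing`, its fields, and the two status constants are illustrative names, and the model assumes an empty ring (RDH == RDT), null buffer addresses, and `check_rxov` initially off, as the reporter states.

```python
DD = 0x01   # illustrative "descriptor done" status bit
EOP = 0x02  # illustrative "end of packet" status bit


class RxRing:
    """Simplified model of the receive ring walked by the quoted loop."""

    def __init__(self, size):
        self.desc = [{"addr": 0, "status": 0, "length": 0} for _ in range(size)]
        self.rdh = 0             # head index, advanced by the device model
        self.rdt = 0             # tail index, advanced by the guest driver
        self.check_rxov = False  # overrun check is off before the first pass

    def receive(self, size):
        while True:
            # The overrun exit only fires once check_rxov has been set.
            if self.rdh == self.rdt and self.check_rxov:
                return -1
            d = self.desc[self.rdh]
            d["status"] |= DD  # DD is written even when there is no buffer
            delivered = d["addr"] != 0
            if delivered:
                d["length"] = size
                d["status"] |= EOP
            self.rdh = (self.rdh + 1) % len(self.desc)
            self.check_rxov = True
            if delivered:
                return size
            # Null buffer address: keep looping, like the do/while above.


if __name__ == "__main__":
    ring = RxRing(4)
    print(ring.receive(64))
    print(ring.desc)
```

With every buffer address still null, one call walks the whole ring, marks each descriptor with DD and length 0, and only then returns through the overrun check. That matches the zero-size "partial packets" the guest driver observes.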
- diff --git a/results/classifier/105/assembly/1490611 b/results/classifier/105/assembly/1490611 deleted file mode 100644 index d29c92a6..00000000 --- a/results/classifier/105/assembly/1490611 +++ /dev/null @@ -1,356 +0,0 @@ -assembly: 0.742 -other: 0.707 -graphic: 0.681 -device: 0.649 -network: 0.598 -instruction: 0.588 -semantic: 0.583 -vnc: 0.581 -KVM: 0.453 -socket: 0.434 -mistranslation: 0.412 -boot: 0.358 - -Using qemu >=2.2.1 to convert raw->VHD (fixed) adds extra padding to the result file, which Microsoft Azure rejects as invalid - -Starting with a raw disk image, using "qemu-img convert" to convert from raw to VHD results in the output VHD file's virtual size being aligned to the nearest 516096 bytes (16 heads x 63 sectors per head x 512 bytes per sector), instead of preserving the input file's size as the output VHD's virtual disk size. - -Microsoft Azure requires that disk images (VHDs) submitted for upload have virtual sizes aligned to a megabyte boundary. (Ex. 4096MB, 4097MB, 4098MB, etc. are OK, 4096.5MB is rejected with an error.) This is reflected in Microsoft's documentation: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-create-upload-vhd-generic/ - -This is reproducible with the following set of commands (including the Azure command line tools from https://github.com/Azure/azure-xplat-cli). 
For the following example, I used qemu version 2.2.1: - -$ dd if=/dev/zero of=source-disk.img bs=1M count=4096 - -$ stat source-disk.img - File: ‘source-disk.img’ - Size: 4294967296 Blocks: 798656 IO Block: 4096 regular file -Device: fc01h/64513d Inode: 13247963 Links: 1 -Access: (0644/-rw-r--r--) Uid: ( 1000/ smkent) Gid: ( 1000/ smkent) -Access: 2015-08-18 09:48:02.613988480 -0700 -Modify: 2015-08-18 09:48:02.825985646 -0700 -Change: 2015-08-18 09:48:02.825985646 -0700 - Birth: - - -$ qemu-img convert -f raw -o subformat=fixed -O vpc source-disk.img dest-disk.vhd - -$ stat dest-disk.vhd - File: ‘dest-disk.vhd’ - Size: 4296499712 Blocks: 535216 IO Block: 4096 regular file -Device: fc01h/64513d Inode: 13247964 Links: 1 -Access: (0644/-rw-r--r--) Uid: ( 1000/ smkent) Gid: ( 1000/ smkent) -Access: 2015-08-18 09:50:22.252077624 -0700 -Modify: 2015-08-18 09:49:24.424868868 -0700 -Change: 2015-08-18 09:49:24.424868868 -0700 - Birth: - - -$ azure vm image create testimage1 dest-disk.vhd -o linux -l "West US" -info: Executing command vm image create -+ Retrieving storage accounts -info: VHD size : 4097 MB -info: Uploading 4195800.5 KB -Requested:100.0% Completed:100.0% Running: 0 Time: 1m 0s Speed: 6744 KB/s -info: https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd was uploaded successfully -error: The VHD https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd has an unsupported virtual size of 4296499200 bytes. The size must be a whole number (in MBs). -info: Error information has been recorded to /home/smkent/.azure/azure.err -error: vm image create command failed - -I also ran the above commands using qemu 2.4.0, which resulted in the same error as the conversion behavior is the same. - -However, qemu 2.1.1 and earlier (including qemu 2.0.0 installed by Ubuntu 14.04) does not pad the virtual disk size during conversion. 
Using qemu-img convert from qemu versions <=2.1.1 results in a VHD that is exactly the size of the raw input file plus 512 bytes (for the VHD footer). Those qemu versions do not attempt to realign the disk. As a result, Azure accepts VHD files created using those versions of qemu-img convert for upload. - -Is there a reason why newer qemu realigns the converted VHD file? It would be useful if an option were added to disable this feature, as current versions of qemu cannot be used to create VHD files for Azure using Microsoft's official instructions. - -Which release contains this fix? - -Judging by their comments on bug 1399191, jan-wang1989 doesn't appear to be a QEMU developer. - -I bisected to this commit: -c70221df1f89953e85a3f1f96ceefbd6888bb55f - -Kevin Wolf, I added you since you were the author of the commit that I bisected this bug to. Can you advise at all on this bug? Thank you. - -In short: VHD is a mess and Microsoft isn't compatible with itself. That's the root cause of all the trouble we have with it. - -When the qemu VHD driver was written, it was designed to be compatible with Virtual PC, which used the disk geometry as the definite source for the image size. In order to achieve exactly the same results on qemu, we had to calculate the image size the same way, otherwise VMs would see a larger disk when run in qemu compared to Virtual PC. qemu-img always creates images so that real size and geometry match, which is the rounding you are seeing. The commit that you bisected to fixed that geometry and real size were inconsistent on fixed size disks. - -Now HyperV treats images differently. It still stores a geometry, but it doesn't actually use it to determine the size of a disk. Images created by qemu-img are generally okay because geometry and real size match, but when opening a HyperV image with qemu, the rounding we had to apply for Virtual PC is suddenly wrong. This has given us some trouble before. 
Now the requirement of Azure, which basically means we can't round any more even though we have to do it for Virtual PC compatibility, completes the mess. - -While we considered hacks for opening images correctly, like checking the creator application in the image, we can't do that automatically when creating an image. I'm afraid the best we can do here is to add an explicit subformat option that the user needs to specify manually. - -Jeff, any opinion on this? And should we finally implement the vpc_open() hack to distinguish by creator_app? - -If you create a subformat option I would humbly recommend focusing on Hyper-V and Azure compat, and then create an option that enforces the legacy behavior with your patch. This is essentially what some Linux distros have started to do. - - -First, I'd say that if you are converting an image over to use on Hyper-V, you would probably be better served using the VHDX format (completely different from VHD) - it is the newer format (and completely different from VHD), and is supported by QEMU as well. It is better defined and more consistent (at least so far) in its specification. - -That said, I think for the specific VHD problem we could look at the Creator field in the image. My only reservations on that are: - -1.) I haven't looked at the Creator field comprehensively across all revisions of Hyper-V and Virtual PC. But in my small sample size, it seems feasible. - -2.) It most likely won't be 100%, because of edge cases (e.g. I don't know what happens when Hyper-V opens a Virtual-PC produced VHD file, and under what circumstances it may or may not alter the Creator field) - -But the above two reservations can be overcome with the appropriate options that can be passed to the VHD format, to override the auto-detection method. - -I have access to both Virtual PC and Hyper-V, so I can put together a small patch series tomorrow to try that out. - -Unfortunately, VHDX is not supported in Azure. 
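The sizes quoted earlier in this report can be checked against the whole-megabyte rule from the Azure error message. A minimal sketch, assuming a fixed VHD is the raw data followed by the 512-byte footer described above; the function names are hypothetical and belong to neither qemu-img nor the Azure tools.

```python
MIB = 1024 * 1024
VHD_FOOTER_BYTES = 512          # footer size quoted in the report
CHS_UNIT = 16 * 63 * 512        # 516096 bytes: 16 heads x 63 sectors x 512 bytes


def fixed_vhd_virtual_size(file_size):
    """Virtual disk size of a fixed VHD, assuming data plus one footer."""
    return file_size - VHD_FOOTER_BYTES


def is_whole_mib(virtual_size):
    """True when the size is a whole number of MiB, as the Azure error requires."""
    return virtual_size % MIB == 0


if __name__ == "__main__":
    source_size = 4294967296    # source-disk.img from the stat output
    dest_file_size = 4296499712 # dest-disk.vhd from the stat output
    virtual = fixed_vhd_virtual_size(dest_file_size)
    print(virtual, is_whole_mib(virtual), virtual % CHS_UNIT == 0)
    print(source_size, is_whole_mib(source_size))
```

For the files in the report, the converted image's virtual size is a multiple of the 516096-byte geometry unit but not of 1 MiB, while the 4096 MiB source is MiB-aligned. This is the mismatch Azure rejects.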
- -I would be happy to help test out any patches. - -I've posted a series to qemu-devel to hopefully address this issue. Cole, or anyone else that wants to test it out: http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg05511.html - -For the specific qemu-img convert case mentioned in this bug report, I believe using the new "force-size" option should address it, e.g.: - -# qemu-img convert -f raw -o subformat=fixed,force-size -O vpc source-disk.img dest-disk.vhd - - -It looks like the option is "force_size" rather than "force-size". - -It also seems to be having the intended effect. I regenerate my nixos iso with various configurations: - -qemu-master + jeffcody patches + force_size = 2147484160 -qemu-master + jeffcody patches (no force_size) = 2147992064 -qemu-stable: 2147992064 -qemu-220: 2147484160 - -2147992064 - 512 = 2147991552 (2048.484375 MiB) -2147484160 - 512 = 2147483648 (exactly 2048 MiB) - -It appears to be working. Thanks! - -(Note, I only applied the non-test patches. I have them applied on master in a fork under my user on GitHub if someone else wants to test easily). - -Thanks for testing! It is worth noting there is a v2 on the list now, that changes the creator app string to "qem2" from "qemu" when using the force_size option. That should only matter if you try to use your converted images on QEMU (so QEMU will know on that image to rely by default on the current_size field). - -Sounds good. I don't plan to retest unless you'd like me to; that change shouldn't affect me. - -For the status change, I am really sorry. -I think at that time, I just want to check the status/history of this bug, but I maybe click certain button by mistake. - -Fix is available with QEMU 2.6: -http://git.qemu.org/?p=qemu.git;a=commitdiff;h=fb9245c2610932d33ce14 - -Status changed to 'Confirmed' because the bug affects multiple users. - -Just did a source check of 1:2.6+dfsg-3ubuntu1 in yakkety and the upstream fix is already present (as it is 2.6 based). 
Working on the backport now to 16.04. - -I have submitted a test build of 2.5+dfsg-5ubuntu10.1~ppa1 to https://launchpad.net/~nacc/+archive/ubuntu/lp1490611. Please test that version on 16.04. - -Apologies, I uploaded an incorrect version to the PPA, please test 2.5+dfsg-5ubuntu10.3~ppa1. - - - -Uploaded to Xenial, thanks. Am I right in thinking that the new option force_size is required to be used with the patch to fix the problem? It's probably worth mentioning this in the Test Case in the bug description; otherwise it will appear to testers that the fix doesn't work. - -On 04.08.2016 [10:49:47 -0000], Robie Basak wrote: -> Uploaded to Xenial, thanks. Am I right in thinking that the new option -> force_size is required to be used with the patch to fix the problem? -> It's probably worth mentioning this in the Test Case in the bug -> description; otherwise it will appear to testers that the fix doesn't -> work. - -Agreed, updated. - - -Rejected: - -Rejected by Brian Murray: There was a security update to qemu today, so the SRU will need to be redone on top of that. - - - -Uploaded. - -It looks like there was a regression caused by that security update see bug 1612089. It might be best to coordinate with the security team, if they are doing a regular SRU, regarding the the fix for this bug. - -Can you rebase your fix on 1:2.5+dfsg-5ubuntu10.4 (due to the regression fix mentioned in #25)? -Another thing about your backport is that it dropped the qem2 bits from the patch. Is there a reason for this? If so please mention it in the debian/patch file. - -On 17.08.2016 [13:12:19 -0000], Chris J Arges wrote: -> Can you rebase your fix on 1:2.5+dfsg-5ubuntu10.4 (due to the -> regression fix mentioned in #25)? - -Will do! - -> Another thing about your backport is that it dropped the qem2 bits -> from the patch. Is there a reason for this? If so please mention it in -> the debian/patch file. 
- -Ah yes, those sections of the upstream fix do not apply cleanly due to -differing context (not even present). - -That does bring up another question, though: - -@smkbot or anyone else that might know. It seems the original series was -4 patches -(http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg06037.html), -and there was a follow-on of 7 patches that fixed a regression in that -series -(http://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05424.html). -Do we need to backport all 11 patches? In the first series, at least, -there is mention that patch 1/4 fixes an issue for reading VHD images. -While I realize that this particular bug is just for creating/converting -images, would it also be appropriate to backport the full set of fixes -for VHD/VPC? - - - - -On 17.08.2016 [10:20:26 -0700], Nish Aravamudan wrote: -> On 17.08.2016 [13:12:19 -0000], Chris J Arges wrote: -> > Can you rebase your fix on 1:2.5+dfsg-5ubuntu10.4 (due to the -> > regression fix mentioned in #25)? -> -> Will do! - -Done. - -> > Another thing about your backport is that it dropped the qem2 bits -> > from the patch. Is there a reason for this? If so please mention it in -> > the debian/patch file. -> -> Ah yes, those sections of the upstream fix do not apply cleanly due to -> differing context (not even present). - -The latest update includes the relevant context... - -> That does bring up another question, though: -> -> @smkbot or anyone else that might know. It seems the original series was -> 4 patches -> (http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg06037.html), -> and there was a follow-on of 7 patches that fixed a regression in that -> series -> (http://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05424.html). -> Do we need to backport all 11 patches? In the first series, at least, -> there is mention that patch 1/4 fixes an issue for reading VHD images. 
-> While I realize that this particular bug is just for creating/converting -> images, would it also be appropriate to backport the full set of fixes -> for VHD/VPC? - -On my own review, I believe we want to backport the 1 and 3 patch from -the first series, so that qemu-img is self-consistent, in terms of -reading and writing the new 'qem2' images. The second series does -include fixes, but not related to this bug, so if we were to include -them in a SRU, I'd rather they come in from a related bug report. - -@Stephen or anyone else affected, I've submitted an updated build at -https://launchpad.net/~nacc/+archive/ubuntu/lp1490611 for qemu -1:2.5+dfsg-5ubuntu10.5~ppa1 which should include the relevant. Please -test once it has finished building. - - -Is it correct to assume that current 16.04.2 Xenial with the 2.5 QEMU package, doesn't have this patch and can't generate MiB aligned Azure images with qemu-img (no force_size support) ? - -Any recommended PPA backport of QEMU from 16.10 (2.6+) ? - -cheers. - -Hello Alexandre, - -Yes, sorry, there have been several qemu SRUs pending and this one kept getting pushed back. Note that as far as end-users are concerned wrt. qemu, 16.04.2 is not really a relevant milestone. You'd still need to `apt update; apt upgrade` to get the latest from the repositories -- and yes, that version from the repositories does not yet have this fix. I believe Christian has it on his todo for the next SRU, though; Christian, could you confirm? - -On Fri, Feb 17, 2017 at 11:29 PM, Nish Aravamudan < -