peripherals: 0.752
hypervisor: 0.738
mistranslation: 0.731
risc-v: 0.703
TCG: 0.703
ppc: 0.686
x86: 0.685
VMM: 0.684
user-level: 0.679
vnc: 0.637
i386: 0.621
KVM: 0.616
semantic: 0.598
debug: 0.574
performance: 0.574
permissions: 0.547
assembly: 0.517
PID: 0.512
graphic: 0.511
device: 0.510
arm: 0.505
register: 0.481
architecture: 0.466
virtual: 0.462
kernel: 0.454
socket: 0.449
network: 0.437
boot: 0.431
files: 0.304

qemu is very slow when adding 16,384 virtio-scsi drives

qemu runs very slowly when adding many virtio-scsi drives. I have attached a small reproducer shell script which demonstrates this.

Using perf shows the following stack traces taking all the time:

    72.42%    71.15%  qemu-system-x86  qemu-system-x86_64       [.] drive_get
            |
             --72.32%--drive_get
                       |
                        --1.24%--__irqentry_text_start
                                  |
                                   --1.22%--smp_apic_timer_interrupt
                                             |
                                              --1.00%--local_apic_timer_interrupt
                                                        |
                                                         --1.00%--hrtimer_interrupt
                                                                   |
                                                                    --0.83%--__hrtimer_run_queues
                                                                              |
                                                                               --0.64%--tick_sched_timer

    21.70%    21.34%  qemu-system-x86  qemu-system-x86_64       [.] blk_legacy_dinfo
            |
            ---blk_legacy_dinfo

     3.65%     3.59%  qemu-system-x86  qemu-system-x86_64       [.] blk_next
            |
            ---blk_next
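
That profile matches the shape of drive_get(): it does a linear walk over every existing block backend, using blk_next() to advance and blk_legacy_dinfo() to inspect each one. A rough, self-contained C model of that pattern (an illustration of the structure, not the actual QEMU source) looks something like this:

    #include <stddef.h>

    /* Toy stand-ins for QEMU's global block backend list and the legacy
     * drive information attached to each backend. */
    typedef enum { IF_NONE, IF_IDE, IF_SCSI } BlockInterfaceType;

    typedef struct DriveInfo {
        BlockInterfaceType type;
        int bus;
        int unit;
    } DriveInfo;

    typedef struct BlockBackend {
        struct BlockBackend *next;     /* global singly-linked list */
        DriveInfo *legacy_dinfo;
    } BlockBackend;

    static BlockBackend *all_backends; /* head of the global list */

    /* Like blk_next(): step through every backend created so far. */
    static BlockBackend *blk_next(BlockBackend *blk)
    {
        return blk ? blk->next : all_backends;
    }

    static DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
    {
        return blk->legacy_dinfo;
    }

    /* The linear scan that perf attributes the time to: its cost grows
     * with the number of drives already created. */
    static DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit)
    {
        for (BlockBackend *blk = blk_next(NULL); blk; blk = blk_next(blk)) {
            DriveInfo *dinfo = blk_legacy_dinfo(blk);
            if (dinfo && dinfo->type == type &&
                dinfo->bus == bus && dinfo->unit == unit) {
                return dinfo;
            }
        }
        return NULL;
    }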

The first place where it takes an insane amount of time is simply processing the -drive options. The stack trace I see is this:

(gdb) bt
#0  0x00005583b596719a in drive_get (type=type@entry=IF_NONE, bus=bus@entry=0, unit=unit@entry=2313) at blockdev.c:223
#1  0x00005583b59679bd in drive_new (all_opts=0x5583b890e080, block_default_type=<optimized out>) at blockdev.c:996
#2  0x00005583b5971641 in drive_init_func (opaque=<optimized out>, opts=<optimized out>, errp=<optimized out>)
    at vl.c:1154
#3  0x00005583b5c1149a in qemu_opts_foreach (list=<optimized out>, func=0x5583b5971630 <drive_init_func>, opaque=0x5583b9980030, errp=0x0) at util/qemu-option.c:1114
#4  0x00005583b5830d30 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4499

We're iterating over every -drive option. Now because we're using if=none, and thus start probing from unit == 0, line 996 of blockdev.c loops calling drive_get() until no matching drive is found, in order to identify a free unit number. So we have a loop over every drive, calling drive_new(), which loops over every drive calling drive_get(), which loops over every drive. So roughly O(N*N*N).
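
To make the nesting concrete, here is roughly how the unit auto-assignment fits together, continuing the toy model above (a sketch of the pattern, not the real drive_new()):

    /* Sketch: invoked once per -drive option via qemu_opts_foreach(). */
    static void drive_new_sketch(BlockInterfaceType type, int bus_id)
    {
        int unit_id = 0;

        /* Probe unit numbers until drive_get() reports one that is free.
         * For the k-th drive this makes ~k probes, and each probe is a
         * full scan of the k existing backends, so creating N drives
         * costs on the order of 1^2 + 2^2 + ... + N^2, i.e. O(N^3)
         * list steps in total. */
        while (drive_get(type, bus_id, unit_id) != NULL) {
            unit_id++;
        }

        /* ... create the backend and record the chosen unit_id ... */
    }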

I instrumented drive_new() to time how long each batch of 1000 creations took with the current code:

1000 drive_new() in 0 secs
1000 drive_new() in 2 secs
1000 drive_new() in 18 secs
1000 drive_new() in 61 secs

As a quick hack you can just disable the drive_get() calls when if=none. They're mostly just used to fill in the default unit_id, but that's not really required for if=none. That said, if no id= parameter is set, then the code does expect unit_id to be valid, so I'm not sure how to fully fix that.
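
In terms of the sketch above, the quick hack amounts to a guard around that probe loop; something along these lines (illustrative only - it sidesteps rather than solves the unit_id-as-default-id question):

    /* Quick hack: only auto-assign a unit number for interfaces that
     * actually address drives by bus/unit; for if=none the value is
     * mostly just a fallback used when no id= was given. */
    if (type != IF_NONE) {
        while (drive_get(type, bus_id, unit_id) != NULL) {
            unit_id++;
        }
    }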

Anyway, with this hack applied it is much faster, but there is still some kind of O(N*N) complexity going on, because drive_new() gets slower & slower as each drive is created - just not nearly as badly as before.

1000 drive_new() in 0 secs
1000 drive_new() in 0 secs
1000 drive_new() in 0 secs
1000 drive_new() in 1 secs
1000 drive_new() in 1 secs
1000 drive_new() in 1 secs
1000 drive_new() in 2 secs
1000 drive_new() in 2 secs
1000 drive_new() in 2 secs
1000 drive_new() in 4 secs
1000 drive_new() in 4 secs
1000 drive_new() in 6 secs
1000 drive_new() in 8 secs
1000 drive_new() in 8 secs

I added further instrumentation and got this profile of where the remaining time goes:

1000x drive_new 18.347secs
-> 1000x blockdev_init 18.328secs
   -> 1000x monitor_add_blk 4.515secs
      -> 1000x blk_by_name 1.545secs
      -> 1000x bdrv_find_node 2.968secs
   -> 1000x blk_new_open 13.786secs
      -> 1000x bdrv_open 13.783secs

These numbers are all increasing as we process more & more -drive args, so there's some O(N) factor in blk_by_name(), bdrv_find_node() and bdrv_open().
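
That would be consistent with each of those being a linear walk over a global list of named backends/nodes, comparing names: adding the N-th drive then pays an O(N) lookup, so N drives together cost O(N*N). A toy model of that kind of lookup (purely illustrative, not the actual QEMU code):

    #include <stddef.h>
    #include <string.h>

    /* Toy model of a name lookup that walks a global list: every -drive
     * added makes each later blk_by_name()/bdrv_find_node()-style lookup
     * a little slower, which would explain the residual O(N*N) growth. */
    typedef struct NamedNode {
        struct NamedNode *next;
        const char *name;
    } NamedNode;

    static NamedNode *all_named_nodes;

    static NamedNode *node_by_name(const char *name)
    {
        for (NamedNode *n = all_named_nodes; n; n = n->next) {
            if (strcmp(n->name, name) == 0) {
                return n;
            }
        }
        return NULL;
    }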

Is this faster nowadays if you use the new -blockdev parameter instead of -drive?

[Expired for QEMU because there has been no activity for 60 days.]