qemu is very slow when adding 16,384 virtio-scsi drives

qemu runs very slowly when adding many virtio-scsi drives. I have attached a small reproducer shell script which demonstrates this. Using perf shows the following stack trace taking all the time:

  72.42%  71.15%  qemu-system-x86  qemu-system-x86_64  [.] drive_get
          |
           --72.32%--drive_get
                     |
                      --1.24%--__irqentry_text_start
                                |
                                 --1.22%--smp_apic_timer_interrupt
                                           |
                                            --1.00%--local_apic_timer_interrupt
                                                      |
                                                       --1.00%--hrtimer_interrupt
                                                                 |
                                                                  --0.83%--__hrtimer_run_queues
                                                                            |
                                                                             --0.64%--tick_sched_timer

  21.70%  21.34%  qemu-system-x86  qemu-system-x86_64  [.] blk_legacy_dinfo
          |
          ---blk_legacy_dinfo

   3.65%   3.59%  qemu-system-x86  qemu-system-x86_64  [.] blk_next
          |
          ---blk_next

The first place where it spends an insane amount of time is simply processing the -drive options. The stack trace I see is this:

(gdb) bt
#0  0x00005583b596719a in drive_get (type=type@entry=IF_NONE, bus=bus@entry=0, unit=unit@entry=2313) at blockdev.c:223
#1  0x00005583b59679bd in drive_new (all_opts=0x5583b890e080, block_default_type=<optimized out>) at blockdev.c:996
#2  0x00005583b5971641 in drive_init_func (opaque=<optimized out>, opts=<optimized out>, errp=<optimized out>) at vl.c:1154
#3  0x00005583b5c1149a in qemu_opts_foreach (list=<optimized out>, func=0x5583b5971630 <drive_init_func>, opaque=0x5583b9980030, errp=0x0) at util/qemu-option.c:1114
#4  0x00005583b5830d30 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4499

We're iterating over every -drive option. Now, because we're using if=none, no unit number is specified, so line 996 of blockdev.c loops calling drive_get() until it no longer finds a matching drive, in order to pick a free default unit number. So we have a loop over every -drive option, calling drive_new(), which loops over every candidate unit number calling drive_get(), which loops over every drive created so far - roughly O(N*N*N). (A simplified sketch of this loop structure is appended at the end of this report.)

I instrumented drive_new() to time how long 1000 creations took with the current code:

1000 drive_new() in 0 secs
1000 drive_new() in 2 secs
1000 drive_new() in 18 secs
1000 drive_new() in 61 secs

As a quick hack you can just disable the drive_get() calls when if=none (see the second sketch appended below). They're mostly just used to fill in the default unit_id, but that's not really required for if=none. That said, if no id= parameter is set, the code does expect unit_id to be valid, so I'm not sure how to fully fix that.

Anyway, with this hack applied it is much faster, but there is still some kind of N*N complexity going on, because drive_new() gets slower and slower as each drive is created - just not nearly as badly as before:

1000 drive_new() in 0 secs
1000 drive_new() in 0 secs
1000 drive_new() in 0 secs
1000 drive_new() in 1 secs
1000 drive_new() in 1 secs
1000 drive_new() in 1 secs
1000 drive_new() in 2 secs
1000 drive_new() in 2 secs
1000 drive_new() in 2 secs
1000 drive_new() in 4 secs
1000 drive_new() in 4 secs
1000 drive_new() in 6 secs
1000 drive_new() in 8 secs
1000 drive_new() in 8 secs

I added further instrumentation and got this profile of where the remaining time goes:

1000x drive_new                  18.347 secs
  -> 1000x blockdev_init         18.328 secs
       -> 1000x monitor_add_blk   4.515 secs
            -> 1000x blk_by_name     1.545 secs
            -> 1000x bdrv_find_node  2.968 secs
       -> 1000x blk_new_open     13.786 secs
            -> 1000x bdrv_open   13.783 secs

These numbers all keep increasing as we process more and more -drive args, so there's some O(N) factor in blk_by_name, bdrv_find_node and bdrv_open (see the third sketch appended below).

Is this faster nowadays if you use the new -blockdev parameter instead of -drive?

[Expired for QEMU because there has been no activity for 60 days.]
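Sketch 1. To make the O(N*N*N) claim concrete, here is a minimal, self-contained paraphrase of the loop structure described above. It is not the actual QEMU source: DriveInfo, all_drives, drive_get() and drive_new() are simplified stand-ins for the real blockdev.c code, and the timing loop in main() just mirrors the per-1000 instrumentation quoted in the report.

/* Sketch (not the real QEMU code) of the loop structure behind the
 * O(N*N*N) behaviour: one drive_new() per -drive option, which probes
 * unit numbers with drive_get(), which walks every drive created so far. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct DriveInfo {
    int type, bus, unit;
    struct DriveInfo *next;          /* stand-in for QEMU's backend list */
} DriveInfo;

static DriveInfo *all_drives;        /* drives created so far */

/* O(N): linear scan over every existing drive */
static DriveInfo *drive_get(int type, int bus, int unit)
{
    for (DriveInfo *d = all_drives; d; d = d->next) {
        if (d->type == type && d->bus == bus && d->unit == unit) {
            return d;
        }
    }
    return NULL;
}

/* Called once per -drive option: with no unit given, probe 0, 1, 2, ...
 * until drive_get() reports a free unit - an O(N) loop around an O(N) scan. */
static void drive_new(int type, int bus)
{
    int unit = 0;
    while (drive_get(type, bus, unit) != NULL) {
        unit++;
    }
    DriveInfo *d = malloc(sizeof(*d));
    d->type = type;
    d->bus = bus;
    d->unit = unit;
    d->next = all_drives;
    all_drives = d;
}

int main(int argc, char **argv)
{
    int n = argc > 1 ? atoi(argv[1]) : 4096;   /* the report used 16,384 */
    clock_t start = clock();
    for (int i = 1; i <= n; i++) {
        drive_new(0, 0);
        if (i % 1000 == 0) {                   /* mirror the instrumentation above */
            printf("1000 drive_new() in %ld secs\n",
                   (long)((clock() - start) / CLOCKS_PER_SEC));
            start = clock();
        }
    }
    return 0;
}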
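Sketch 2. A sketch of the quick hack described above, reusing the definitions from sketch 1: skip the default unit-number probe entirely when the interface is if=none. The IF_NONE enum and the function name are illustrative; the real change would be made inside drive_new() in blockdev.c.

/* Quick-hack sketch: if=none drives are addressed by id=, so the default
 * unit-number probe (the inner O(N*N) part) can be skipped for them. */
enum { IF_NONE = 0, IF_SCSI = 1 /* ... */ };

static void drive_new_hacked(int type, int bus)
{
    int unit = 0;

    if (type != IF_NONE) {
        /* only interfaces that really need a default unit pay for the probe */
        while (drive_get(type, bus, unit) != NULL) {
            unit++;
        }
    }
    DriveInfo *d = malloc(sizeof(*d));
    d->type = type;
    d->bus = bus;
    d->unit = unit;                  /* stays 0 for every if=none drive */
    d->next = all_drives;
    all_drives = d;
}

Note that every if=none drive ends up with unit 0 here, which is exactly the caveat in the report: this is only acceptable when the drive is referenced by id=, so it is not a complete fix when no id= is given.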
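Sketch 3. On the remaining O(N) factor: the profile shows monitor_add_blk() spending its time in blk_by_name() and bdrv_find_node(), which is consistent with duplicate-name checks done by scanning a list of already-registered backends. The sketch below is again a simplification with invented field and list names, not the actual block-backend.c/block.c code, but it shows the shape of the per-drive cost.

/* Sketch of why blk_by_name()-style lookups still cost O(N) per -drive:
 * each lookup is a strcmp() walk over a global list of named backends,
 * and at least one such lookup happens for every new drive, giving
 * O(N*N) over the whole command line. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct BlockBackend {
    char name[128];
    struct BlockBackend *next;       /* stand-in for the global backend list */
} BlockBackend;

static BlockBackend *all_backends;

static BlockBackend *blk_by_name(const char *name)
{
    for (BlockBackend *blk = all_backends; blk; blk = blk->next) {
        if (strcmp(blk->name, name) == 0) {
            return blk;              /* name already in use */
        }
    }
    return NULL;
}

int main(void)
{
    /* register 4096 uniquely named backends, checking for clashes each time */
    for (int i = 0; i < 4096; i++) {
        char name[128];
        snprintf(name, sizeof(name), "drive%d", i);
        if (blk_by_name(name) == NULL) {         /* O(i) scan per new drive */
            BlockBackend *blk = malloc(sizeof(*blk));
            strcpy(blk->name, name);
            blk->next = all_backends;
            all_backends = blk;
        }
    }
    printf("registered 4096 backends\n");
    return 0;
}

If that is indeed where the time goes, keeping a hash table keyed on the id/node-name alongside the list (e.g. a GHashTable) would make these checks O(1) and remove the residual N*N behaviour; that is a suggestion, not necessarily what QEMU does today.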