author     bauerchen <bauerchen@tencent.com>      2020-02-11 17:10:35 +0800
committer  Paolo Bonzini <pbonzini@redhat.com>    2020-02-25 09:18:01 +0100
commit     037fb5eb3941c80a2b7c36a843e47207ddb004d4
tree       996d02a0c4e3ec1fa92b0ca3838430c48e56edcf
parent     920d557e5ae58671d335acbcfba3f9a97a02911c
mem-prealloc: optimize large guest startup
[desc]:
    Large-memory VMs start slowly when using -mem-prealloc, and the
    current method leaves room for optimization:

    1. While the page-clearing threads are being created, mmap() is
    used to allocate each thread's stack; it takes mm->mmap_sem for
    write, but the already-running clearing threads hold it for read,
    and this contention makes thread creation very slow.

    2. The pages are divided unevenly among the threads: if 160
    hugepages are split across 64 threads, 63 threads clear 2 pages
    each while one thread clears 34, so overall speed is limited by
    that one thread.

    To solve the first problem, we add a mutex in the thread function
    so that the workers start clearing only after all threads have
    been created. For the second problem, we spread the remainder
    across the threads: with 160 hugepages and 64 threads, 32 threads
    clear 3 pages each and 32 threads clear 2 pages each.
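
    Below is a minimal sketch of both fixes. The names
    (do_touch_pages, touch_all_pages, pages_for_thread) are
    illustrative only; this is an outline of the start barrier and the
    page distribution described above, not the actual QEMU code:

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t page_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t page_cond = PTHREAD_COND_INITIALIZER;
    static bool threads_created_flag;

    struct touch_args {
        char *addr;          /* first page this worker touches */
        size_t numpages;     /* pages assigned to this worker */
        size_t hpagesize;    /* hugepage size in bytes */
    };

    static void *do_touch_pages(void *arg)
    {
        struct touch_args *ta = arg;

        /*
         * Fix 1: wait until every worker has been created, so the page
         * faults below do not take mmap_sem for read while
         * pthread_create() still needs it for write to map the
         * remaining thread stacks.
         */
        pthread_mutex_lock(&page_mutex);
        while (!threads_created_flag) {
            pthread_cond_wait(&page_cond, &page_mutex);
        }
        pthread_mutex_unlock(&page_mutex);

        for (size_t i = 0; i < ta->numpages; i++) {
            /* Read and write back one byte per page to fault it in
             * without changing its contents. */
            volatile char *p = ta->addr + i * ta->hpagesize;
            *p = *p;
        }
        return NULL;
    }

    /*
     * Fix 2: spread the remainder, so e.g. 160 pages over 64 threads
     * gives 32 threads 3 pages and 32 threads 2 pages.
     */
    static size_t pages_for_thread(size_t numpages, int nthreads, int i)
    {
        return numpages / nthreads + (i < (int)(numpages % nthreads));
    }

    static void touch_all_pages(char *area, size_t hpagesize,
                                size_t numpages, int nthreads)
    {
        pthread_t threads[nthreads];
        struct touch_args args[nthreads];
        char *addr = area;

        threads_created_flag = false;
        for (int i = 0; i < nthreads; i++) {
            args[i].addr = addr;
            args[i].numpages = pages_for_thread(numpages, nthreads, i);
            args[i].hpagesize = hpagesize;
            pthread_create(&threads[i], NULL, do_touch_pages, &args[i]);
            addr += args[i].numpages * hpagesize;
        }

        /* Release all workers only after the last pthread_create(). */
        pthread_mutex_lock(&page_mutex);
        threads_created_flag = true;
        pthread_cond_broadcast(&page_cond);
        pthread_mutex_unlock(&page_mutex);

        for (int i = 0; i < nthreads; i++) {
            pthread_join(threads[i], NULL);
        }
    }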

[test]:
    320G 84c VM start time can be reduced to 10s
    680G 84c VM start time can be reduced to 18s

Signed-off-by: bauerchen <bauerchen@tencent.com>
Reviewed-by: Pan Rui <ruippan@tencent.com>
Reviewed-by: Ivan Ren <ivanren@tencent.com>
[Simplify computation of the number of pages per thread. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>