mistranslation: 0.848
KVM: 0.826
vnc: 0.810
graphic: 0.783
other: 0.778
device: 0.773
assembly: 0.764
instruction: 0.741
boot: 0.706
semantic: 0.704
socket: 0.702
network: 0.688
qemu-kvm takes 100% CPU when running redhat/centos 7.6 guest VM OS
Description
===========
When running a Red Hat or CentOS 7.6 guest OS in a VM,
CPU usage inside the VM is very low (100% idle), but on the host,
qemu-kvm shows 100% CPU busy.
After searching related bug reports,
I suspect this is due to the clock settings in the VM's domain XML.
My OpenStack cluster uses the default clock settings, as follows:
This report, https://bugs.launchpad.net/qemu/+bug/1174654,
claims a fix for the 100% CPU usage problem with a Windows guest image,
but I ran some tests and the solution does not work for me.
Steps to reproduce
==================
* create a VM using a CentOS or Red Hat 7.6 image
* use the sar tool inside the VM and on the host to check CPU usage, and compare the results
Expected result
===============
The host's CPU usage report should match the VM's CPU usage
Actual result
=============
The VM's CPU is 100% idle, while the host's CPU is 100% busy
Environment
===========
1. Exact version of OpenStack you are running.
# rpm -qa | grep nova
openstack-nova-compute-13.1.2-1.el7.noarch
python2-novaclient-3.3.2-1.el7.noarch
python-nova-13.1.2-1.el7.noarch
openstack-nova-common-13.1.2-1.el7.noarch
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
# libvirtd -V
libvirtd (libvirt) 3.9.0
# /usr/libexec/qemu-kvm --version
QEMU emulator version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.6.1), Copyright (c) 2003-2008 Fabrice Bellard
Logs & Configs
==============
The VM xml:
instance-000050227f5a66a5-****-****-****-75dec****bbb*******2019-05-20 03:08:46204812204801********2097152209715211024/machineFedora ProjectOpenStack Nova13.1.2-1.el764ab0e89-****-****-****-05312ef669837f5a66a5-****-****-****-75decaf82bbbVirtual MachinehvmIvyBridgedestroyrestartdestroy/usr/libexec/qemu-kvm+107:+107
CPU Usage Report inside VM:
# sar -u -P 0 1 5
Linux 3.10.0-957.el7.x86_64 (******) 05/20/2019 _x86_64_ (1 CPU)
11:34:40 AM CPU %user %nice %system %iowait %steal %idle
11:34:41 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:42 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:43 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:44 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:45 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
Average: 0 0.00 0.00 0.00 0.00 0.00 100.00
CPU Usage Report on host (the VM's vCPU is pinned to the host's physical CPU 27):
# sar -u -P 27 1 5
Linux 3.10.0-862.el7.x86_64 (******) 05/20/2019 _x86_64_ (48 CPU)
11:34:40 AM CPU %user %nice %system %iowait %steal %idle
11:34:41 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:42 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:43 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:44 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:45 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
Average: 27 100.00 0.00 0.00 0.00 0.00 0.00
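To make the guest/host comparison easier to repeat, the two sar reports can be reduced to a single busy figure each. This is only a minimal sketch: `busy_from_sar` is a hypothetical helper name, and it assumes the English-locale `Average:` line with `%idle` as the last column, as in the output above.

```shell
# busy_from_sar (hypothetical helper): read `sar -u` output on stdin and
# print the busy percentage, computed as 100 - %idle from the "Average:" line.
# Assumes %idle is the last column, as in the reports in this bug.
busy_from_sar() {
    awk '$1 == "Average:" { printf "%.2f\n", 100 - $NF }'
}

# Example usage (first command on the host, second inside the guest):
#   sar -u -P 27 1 5 | busy_from_sar
#   sar -u -P 0 1 5 | busy_from_sar
```

With the numbers reported here, the host line gives 100.00 and the guest line gives 0.00.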
clocksource inside VM:
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm_clock
It appears that only version 7.6 of Red Hat or CentOS is affected by this bug;
in my cluster, guest versions from 6.8 to 7.5 are fine, but 7.6 is not.
> # libvirtd -V
> libvirtd (libvirt) 3.9.0
>
> # /usr/libexec/qemu-kvm --version
> QEMU emulator version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.6.1), Copyright (c) 2003-2008 Fabrice Bellard
This looks like the host is running CentOS, with packages dating from approx 7.3. This is very outdated given current version is 7.6. So I don't think it is worth spending time to debug until the host OS is upgraded to the latest QEMU/libvirt/kernel packages at least to see if the problem still reproduces.
I tested the two newest QEMU versions over the past few days and, sadly, the problem still happens.
I tried to find out why the qemu process takes 100% of the CPU, and collected some facts about it.
I compared these facts with those of another VM's qemu process (whose CPU usage is normal) and did not find anything interesting.
Please give me some guidance on debugging this problem if you can. Thanks very much.
(The full set of facts is in the attachment)
1.
======the command line to start a VM======
# ps -ef |grep 1567284 | cat
qemu 1567284 1 99 Jun21 ? 2-18:09:33 /usr/libexec/qemu-kvm -name guest=instance-0000530a,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-0000530a/master-key.aes -machine pc-i440fx-4.0,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object memory-backend-ram,id=ram-node0,size=2147483648,host-nodes=1,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid 60d3ad85-1004-47e7-b2cb-5cf1a029ab47 -smbios type=1,manufacturer=Fedora Project,product=OpenStack Nova,version=13.1.2-1.el7,serial=c0700413-4a28-4e97-85c4-66fe3ec367dc,uuid=60d3ad85-1004-47e7-b2cb-5cf1a029ab47,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-instance-0000530a/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk,format=raw,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk.swap,format=raw,if=none,id=drive-virtio-disk1,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk.config,format=raw,if=none,id=drive-ide0-1-1,readonly=on,cache=none -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:21:2c:70,bus=pci.0,addr=0x3 -add-fd set=2,fd=31 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev 
pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
root 1567288 2 0 Jun21 ? 00:00:01 [vhost-1567284]
root 1567291 2 0 Jun21 ? 00:00:00 [kvm-pit/1567284]
2.
===version of QEMU===
# /usr/libexec/qemu-kvm --version
QEMU emulator version 4.0.0
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
===version of libvirt===
# libvirtd -V
libvirtd (libvirt) 3.9.0
===version of kernel===
# uname -a
Linux txyz_40_92 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server (3.10.0-862.el7.x86_64) 7.5 (Maipo)
# modinfo kvm
filename: /lib/modules/3.10.0-862.el7.x86_64/kernel/arch/x86/kvm/kvm.ko.xz
license: GPL
author: Qumranet
retpoline: Y
rhelversion: 7.5
srcversion: 8A3372406CDF0ACF69A0E58
depends: irqbypass
intree: Y
vermagic: 3.10.0-862.el7.x86_64 SMP mod_unload modversions
signer: Red Hat Enterprise Linux kernel signing key
sig_key: 51:73:02:3B:F8:16:37:D7:BF:3C:51:50:13:4E:EC:84:1B:96:FD:0B
sig_hashalgo: sha256
parm: ignore_msrs:bool
parm: min_timer_period_us:uint
parm: kvmclock_periodic_sync:bool
parm: tsc_tolerance_ppm:uint
parm: lapic_timer_advance_ns:uint
parm: vector_hashing:bool
parm: halt_poll_ns:uint
parm: halt_poll_ns_grow:uint
parm: halt_poll_ns_shrink:uint
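Note that modinfo lists only the parameter names, not the values currently in effect. A small sketch that prints them, assuming the usual sysfs location `/sys/module/kvm/parameters` (the function name `show_module_params` is made up for illustration, and the directory can be overridden):

```shell
# show_module_params (hypothetical helper): print name=value for every
# readable parameter file in a module's sysfs parameters directory.
# Defaults to the kvm module; pass another directory as the first argument.
show_module_params() {
    dir=${1:-/sys/module/kvm/parameters}
    for f in "$dir"/*; do
        # Skip unreadable entries (and the unexpanded glob if dir is empty).
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Example usage on the host:
#   show_module_params
```

Running this on an affected host and on a host with a normal VM would show whether any of the parameters listed above differ.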
3.
===threads===
# ps -Lp 1567284
PID LWP TTY TIME CMD
1567284 1567284 ? 00:00:12 qemu-kvm
1567284 1567286 ? 00:00:00 call_rcu
1567284 1567289 ? 00:00:00 IO mon_iothread
1567284 1567290 ? 2-19:07:09 CPU 0/KVM
1567284 1567293 ? 00:00:00 vnc_worker
1567284 1637413 ? 00:00:00 worker
===top===
# top -H -p 1567284
top - 13:02:07 up 164 days, 21:53, 2 users, load average: 1.00, 1.01, 1.05
Threads: 6 total, 1 running, 5 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 0.0 sy, 0.0 ni, 97.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26353241+total, 25289752+free, 2771140 used, 7863752 buff/cache
KiB Swap: 8388604 total, 8388604 free, 0 used. 25926534+avail Mem
scroll coordinates: y = 1/6 (tasks), x = 1/12 (fields)
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1567290 qemu 20 0 6683072 647416 8336 R 99.7 0.2 4028:04 CPU 0/KVM
1567284 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:12.93 qemu-kvm
1567286 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 call_rcu
1567289 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 IO mon_iothread
1567293 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.27 vnc_worker
1637464 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 worker
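The same per-thread view can be captured non-interactively for attaching to a report. A rough sketch, where `busiest_thread` is a hypothetical helper that expects lines of the form "LWP %CPU NAME" on stdin; note that the `pcpu` value from ps is averaged over the thread's lifetime, so it will not match top's instantaneous figure exactly.

```shell
# busiest_thread (hypothetical helper): read "LWP %CPU NAME" lines on stdin
# and print the line with the highest %CPU (second column).
busiest_thread() {
    LC_ALL=C sort -k2,2nr | head -n 1
}

# Example usage on the host, with the qemu-kvm PID from this report:
#   ps -L -o lwp=,pcpu=,comm= -p 1567284 | busiest_thread
```

For the process above, this should point at the "CPU 0/KVM" thread (LWP 1567290), i.e. the vCPU thread rather than an I/O or VNC thread.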
===htop===
....
===vmstat on the host===
# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 252897184 366768 7497768 0 0 0 0 0 0 0 0 100 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1394 367 2 0 98 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1442 387 2 0 98 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1479 470 2 0 98 0 0
1 0 0 252896752 366776 7497768 0 0 0 12 1373 371 2 0 98 0 0
===vmstat on the VM===
# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 1490564 81708 238688 0 0 1 2 14 30 0 0 99 1 0
0 0 0 1490564 81708 238688 0 0 0 0 29 55 0 0 100 0 0
0 0 0 1490564 81708 238688 0 0 0 0 26 56 0 0 100 0 0
0 0 0 1490564 81708 238688 0 0 0 0 17 31 0 0 99 0 0
0 0 0 1490564 81708 238688 0 0 0 0 19 41 0 0 100 0 0
The QEMU project is currently considering moving its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.
If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.
Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]