1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
|
mistranslation: 0.848
KVM: 0.826
vnc: 0.810
graphic: 0.783
other: 0.778
device: 0.773
assembly: 0.764
instruction: 0.741
boot: 0.706
semantic: 0.704
socket: 0.702
network: 0.688
qemu-kvm takes 100% CPU when running redhat/centos 7.6 guest VM OS
Description
===========
When running redhat or centos 7.6 guest os on vm,
the cpu usage is very low on vm(100% idle), but on host,
qemu-kvm reports 100% cpu busy usage.
After searching some related bugs report,
I suspect that it is due to the clock settings in vm's domain xml.
My Openstack cluster uses the default clock settings as follow:
<clock offset='utc'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
</clock>
And in this report, https://bugs.launchpad.net/qemu/+bug/1174654
it claims that <timer name='rtc' track='guest'/> can solve the 100% cpu usage problem when using Windows Image Guest OS,
but I makes some tests, the solusion dose not work for me.
Steps to reproduce
==================
* create a vm using centos or redhat 7.6 image
* using sar tool inside vm and host to check the cpu usage, and compare them
Expected result
===============
host's cpu usage report should be same with vm's cpu usage
Actual result
=============
vm's cpu usage is 100% idle, host's cpu usage is 100% busy
Environment
===========
1. Exact version of OpenStack you are running.
# rpm -qa | grep nova
openstack-nova-compute-13.1.2-1.el7.noarch
python2-novaclient-3.3.2-1.el7.noarch
python-nova-13.1.2-1.el7.noarch
openstack-nova-common-13.1.2-1.el7.noarch
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
# libvirtd -V
libvirtd (libvirt) 3.9.0
# /usr/libexec/qemu-kvm --version
QEMU emulator version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.6.1), Copyright (c) 2003-2008 Fabrice Bellard
Logs & Configs
==============
The VM xml:
<domain type='kvm' id='29'>
<name>instance-00005022</name>
<uuid>7f5a66a5-****-****-****-75dec****bbb</uuid>
<metadata>
<nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
<nova:package version="13.1.2-1.el7"/>
<nova:name>*******</nova:name>
<nova:creationTime>2019-05-20 03:08:46</nova:creationTime>
<nova:flavor name="2d2dab36-****-****-****-246e9****110">
<nova:memory>2048</nova:memory>
<nova:disk>12</nova:disk>
<nova:swap>2048</nova:swap>
<nova:ephemeral>0</nova:ephemeral>
<nova:vcpus>1</nova:vcpus>
</nova:flavor>
<nova:owner>
<nova:user uuid="********************">****</nova:user>
<nova:project uuid="********************">****</nova:project>
</nova:owner>
<nova:root type="image" uuid="4496a420-****-****-****-b50f****ada3"/>
</nova:instance>
</metadata>
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
<vcpu placement='static'>1</vcpu>
<cputune>
<shares>1024</shares>
<vcpupin vcpu='0' cpuset='27'/>
<emulatorpin cpuset='27'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='1'/>
<memnode cellid='0' mode='strict' nodeset='1'/>
</numatune>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>Fedora Project</entry>
<entry name='product'>OpenStack Nova</entry>
<entry name='version'>13.1.2-1.el7</entry>
<entry name='serial'>64ab0e89-****-****-****-05312ef66983</entry>
<entry name='uuid'>7f5a66a5-****-****-****-75decaf82bbb</entry>
<entry name='family'>Virtual Machine</entry>
</system>
</sysinfo>
<os>
<type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>IvyBridge</model>
<topology sockets='1' cores='1' threads='1'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='xsaveopt'/>
<numa>
<cell id='0' cpus='0' memory='2097152' unit='KiB'/>
</numa>
</cpu>
<clock offset='utc'>
<timer name='pit' tickpolicy='delay'/>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk.swap'/>
<backingStore/>
<target dev='vdb' bus='virtio'/>
<alias name='virtio-disk1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw' cache='none'/>
<source file='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/disk.config'/>
<backingStore/>
<target dev='hdd' bus='ide'/>
<readonly/>
<alias name='ide0-1-1'/>
<address type='drive' controller='0' bus='1' target='0' unit='1'/>
</disk>
<controller type='usb' index='0' model='piix3-uhci'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'>
<alias name='pci.0'/>
</controller>
<controller type='ide' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<interface type='bridge'>
<mac address='fa:16:3e:a6:ea:4f'/>
<source bridge='brq52c66dc3-64'/>
<bandwidth>
<inbound average='102400'/>
<outbound average='102400'/>
</bandwidth>
<target dev='tapa29e94e5-42'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='file'>
<source path='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/console.log'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<serial type='pty'>
<source path='/dev/pts/10'/>
<target type='isa-serial' port='1'>
<model name='isa-serial'/>
</target>
<alias name='serial1'/>
</serial>
<console type='file'>
<source path='/data/instances/7f5a66a5-****-****-****-75decaf82bbb/console.log'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<input type='tablet' bus='usb'>
<alias name='input0'/>
<address type='usb' bus='0' port='1'/>
</input>
<input type='mouse' bus='ps2'>
<alias name='input1'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input2'/>
</input>
<graphics type='vnc' port='5910' autoport='yes' listen='0.0.0.0' keymap='en-us'>
<listen type='address' address='0.0.0.0'/>
</graphics>
<video>
<model type='cirrus' vram='16384' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+107:+107</label>
<imagelabel>+107:+107</imagelabel>
</seclabel>
</domain>
CPU Usage Report inside VM:
# sar -u -P 0 1 5
Linux 3.10.0-957.el7.x86_64 (******) 05/20/2019 _x86_64_ (1 CPU)
11:34:40 AM CPU %user %nice %system %iowait %steal %idle
11:34:41 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:42 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:43 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:44 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
11:34:45 AM 0 0.00 0.00 0.00 0.00 0.00 100.00
Average: 0 0.00 0.00 0.00 0.00 0.00 100.00
CPU Usage Report ON HOST(the vm's cpu is pinned on host's no.27 physic cpu):
# sar -u -P 27 1 5
Linux 3.10.0-862.el7.x86_64 (******) 05/20/2019 _x86_64_ (48 CPU)
11:34:40 AM CPU %user %nice %system %iowait %steal %idle
11:34:41 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:42 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:43 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:44 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
11:34:45 AM 27 100.00 0.00 0.00 0.00 0.00 0.00
Average: 27 100.00 0.00 0.00 0.00 0.00 0.00
clocksource inside VM:
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm_clock
It shows that only the version 7.6 of redhat or centos affected by this bug,
in my cluster, it is OK for versions from 6.8 to 7.5, but 7.6 is not normal.
> # libvirtd -V
> libvirtd (libvirt) 3.9.0
>
> # /usr/libexec/qemu-kvm --version
> QEMU emulator version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.6.1), Copyright (c) 2003-2008 Fabrice Bellard
This looks like the host is running CentOS, with packages dating from approx 7.3. This is very outdated given current version is 7.6. So I don't think it is worth spending time to debug until the host OS is upgraded to the latest QEMU/libvirt/kernel packages at least to see if the problem still reproduces.
I tested two newest QEMU versions these days, and sadly, the problem still happens.
I tried to find the reason why the qemu process take 100% usage of cpu, and collected some facts about it.
I compared the facts with other normal vm's qemu process(who's cpu usage is normal) and didn't turn out any interesting result.
Please give me some guides to debug this problem if you could, thanks very much.
(The full content of facts is in the attachment)
1.
======the command line to start a VM======
# ps -ef |grep 1567284 | cat
qemu 1567284 1 99 Jun21 ? 2-18:09:33 /usr/libexec/qemu-kvm -name guest=instance-0000530a,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-0000530a/master-key.aes -machine pc-i440fx-4.0,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object memory-backend-ram,id=ram-node0,size=2147483648,host-nodes=1,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid 60d3ad85-1004-47e7-b2cb-5cf1a029ab47 -smbios type=1,manufacturer=Fedora Project,product=OpenStack Nova,version=13.1.2-1.el7,serial=c0700413-4a28-4e97-85c4-66fe3ec367dc,uuid=60d3ad85-1004-47e7-b2cb-5cf1a029ab47,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-instance-0000530a/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk,format=raw,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk.swap,format=raw,if=none,id=drive-virtio-disk1,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/data/instances/60d3ad85-1004-47e7-b2cb-5cf1a029ab47/disk.config,format=raw,if=none,id=drive-ide0-1-1,readonly=on,cache=none -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:21:2c:70,bus=pci.0,addr=0x3 -add-fd set=2,fd=31 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
root 1567288 2 0 Jun21 ? 00:00:01 [vhost-1567284]
root 1567291 2 0 Jun21 ? 00:00:00 [kvm-pit/1567284]
2.
===version of QEMU===
# /usr/libexec/qemu-kvm --version
QEMU emulator version 4.0.0
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
===version of libvirt===
# libvirtd -V
libvirtd (libvirt) 3.9.0
===version of kernal===
# uname -a
Linux txyz_40_92 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server (3.10.0-862.el7.x86_64) 7.5 (Maipo)
# modinfo kvm
filename: /lib/modules/3.10.0-862.el7.x86_64/kernel/arch/x86/kvm/kvm.ko.xz
license: GPL
author: Qumranet
retpoline: Y
rhelversion: 7.5
srcversion: 8A3372406CDF0ACF69A0E58
depends: irqbypass
intree: Y
vermagic: 3.10.0-862.el7.x86_64 SMP mod_unload modversions
signer: Red Hat Enterprise Linux kernel signing key
sig_key: 51:73:02:3B:F8:16:37:D7:BF:3C:51:50:13:4E:EC:84:1B:96:FD:0B
sig_hashalgo: sha256
parm: ignore_msrs:bool
parm: min_timer_period_us:uint
parm: kvmclock_periodic_sync:bool
parm: tsc_tolerance_ppm:uint
parm: lapic_timer_advance_ns:uint
parm: vector_hashing:bool
parm: halt_poll_ns:uint
parm: halt_poll_ns_grow:uint
parm: halt_poll_ns_shrink:uint
3.
===threads===
# ps -Lp 1567284
PID LWP TTY TIME CMD
1567284 1567284 ? 00:00:12 qemu-kvm
1567284 1567286 ? 00:00:00 call_rcu
1567284 1567289 ? 00:00:00 IO mon_iothread
1567284 1567290 ? 2-19:07:09 CPU 0/KVM
1567284 1567293 ? 00:00:00 vnc_worker
1567284 1637413 ? 00:00:00 worker
===top===
# top -H -p 1567284
top - 13:02:07 up 164 days, 21:53, 2 users, load average: 1.00, 1.01, 1.05
Threads: 6 total, 1 running, 5 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 0.0 sy, 0.0 ni, 97.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26353241+total, 25289752+free, 2771140 used, 7863752 buff/cache
KiB Swap: 8388604 total, 8388604 free, 0 used. 25926534+avail Mem
scroll coordinates: y = 1/6 (tasks), x = 1/12 (fields)
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1567290 qemu 20 0 6683072 647416 8336 R 99.7 0.2 4028:04 CPU 0/KVM
1567284 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:12.93 qemu-kvm
1567286 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 call_rcu
1567289 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 IO mon_iothread
1567293 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.27 vnc_worker
1637464 qemu 20 0 6683072 647416 8336 S 0.0 0.2 0:00.00 worker
===htop===
....
===vmstat on the host===
# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 252897184 366768 7497768 0 0 0 0 0 0 0 0 100 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1394 367 2 0 98 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1442 387 2 0 98 0 0
1 0 0 252896752 366768 7497768 0 0 0 0 1479 470 2 0 98 0 0
1 0 0 252896752 366776 7497768 0 0 0 12 1373 371 2 0 98 0 0
===vmstat on the VM===
# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 1490564 81708 238688 0 0 1 2 14 30 0 0 99 1 0
0 0 0 1490564 81708 238688 0 0 0 0 29 55 0 0 100 0 0
0 0 0 1490564 81708 238688 0 0 0 0 26 56 0 0 100 0 0
0 0 0 1490564 81708 238688 0 0 0 0 17 31 0 0 99 0 0
0 0 0 1490564 81708 238688 0 0 0 0 19 41 0 0 100 0 0
The QEMU project is currently considering to move its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.
If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.
Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]
|