summary refs log tree commit diff stats
path: root/results/classifier/deepseek-r1:14b/output/hypervisor/1879425
blob: 01ae23f1a3a10858eca4b9f6758f04ee58951483 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
The thread of "CPU 0 /KVM" keeping 99.9%CPU

Hi Expert:

The VM is hung here after (2, or 3, or 5 and the longest time is 10 hours) by qemu-kvm.
Notes: 
for VM:
  OS: RHEL 7.6
  CPU: 1
  MEM:4G
For qemu-kvm:
  1) version:
     /usr/libexec/qemu-kvm -version
     QEMU emulator version 2.10.0(qemu-kvm-ev-2.10.0-21.el7_5.4.1)
  2) once the issue is occurred, the CPU of "CPU0 /KVM" is more than 99% by com "top -p VM_pro_ID"
    PID  UDER   PR NI RES   S  % CPU %MEM  TIME+    COMMAND
872067   qemu   20 0  1.6g  R   99.9  0.6  37:08.87 CPU 0/KVM
  3) use "pstack 493307" and below is function trace
Thread 1 (Thread 0x7f2572e73040 (LWP 872067)):
#0  0x00007f256cad8fcf in ppoll () from /lib64/libc.so.6
#1  0x000055ff34bdf4a9 in qemu_poll_ns ()
#2  0x000055ff34be02a8 in main_loop_wait ()
#3  0x000055ff348bfb1a in main ()
  4) use strace "strace -tt -ff -p 872067 -o cfx" and below log keep printing
21:24:02.977833 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=80, events=POLLIN}, {fd=82, events=POLLIN}, {fd=84, events=POLLIN}, {fd=115, events=POLLIN}, {fd=121, events=POLLIN}], 9, {0, 0}, NULL, 8) = 0 (Timeout)
21:24:02.977918 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=80, events=POLLIN}, {fd=82, events=POLLIN}, {fd=84, events=POLLIN}, {fd=115, events=POLLIN}, {fd=121, events=POLLIN}], 9, {0, 911447}, NULL, 8) = 0 (Timeout)
21:24:02.978945 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=80, events=POLLIN}, {fd=82, events=POLLIN}, {fd=84, events=POLLIN}, {fd=115, events=POLLIN}, {fd=121, events=POLLIN}], 9, {0, 0}, NULL, 8) = 0 (Timeout)
Therefore, I think the thread "CPU 0/KVM" is in tight loop.
  5) use reset can recover this issue. however, it will reoccurred again.
Current work around is increase one CPU for this VM, then issue is gone.

thanks
Cliff