QEMU-KVM / detect_zeroes causes KVM to start an unbounded number of threads on guest-side high I/O with large block sizes
QEMU-KVM in combination with "detect_zeroes=on" allows a guest to DoS the host. This is possible if the host has "detect_zeroes" enabled and the guest writes a large chunk of data with a huge block size to the drive.
E.g.: dd if=/dev/zero of=/tmp/DoS bs=1G count=1 oflag=direct
All QEMU versions since the introduction of detect_zeroes are affected; earlier versions are unaffected. This is absolutely critical, please fix it ASAP!
#####
Provided by Dominik Csapak:
source , bs , count , O_DIRECT : behaviour
detect_zeroes on:
urandom , bs 1M, count 1024, O_DIRECT: OK
file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1M, count 1024, O_DIRECT: OK
zero file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1G, count 1, O_DIRECT: NOT OK
zero file , bs 1G, count 1, O_DIRECT: NOT OK
zero file , bs 1G, count 1, no O_DIRECT: NOT OK
rand file , bs 1G, count 1, O_DIRECT: OK
rand file , bs 1G, count 1, no O_DIRECT: OK
discard on:
urandom , bs 1M, count 1024, O_DIRECT: OK
rand file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1M, count 1024, O_DIRECT: OK
zero file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1G, count 1, O_DIRECT: NOT OK
zero file , bs 1G, count 1, O_DIRECT: NOT OK
zero file , bs 1G, count 1, no O_DIRECT: NOT OK
rand file , bs 1G, count 1, O_DIRECT: OK
rand file , bs 1G, count 1, no O_DIRECT: OK
detect_zeroes off:
urandom , bs 1M, count 1024, O_DIRECT: OK
rand file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1M, count 1024, O_DIRECT: OK
zero file , bs 1M, count 1024, O_DIRECT: OK
/dev/zero , bs 1G, count 1, O_DIRECT: OK
zero file , bs 1G, count 1, O_DIRECT: OK
zero file , bs 1G, count 1, no O_DIRECT: OK
rand file , bs 1G, count 1, O_DIRECT: OK
rand file , bs 1G, count 1, no O_DIRECT: OK
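For completeness, the cases above can be driven with a short run inside the guest. This is only a minimal sketch, assuming a dedicated scratch disk at /dev/sdb and a 1 GiB random source file; all paths are placeholders:
# run inside the guest against a dedicated scratch disk (/dev/sdb assumed)
dd if=/dev/urandom of=/dev/sdb bs=1M count=1024 oflag=direct    # OK
dd if=/dev/zero    of=/dev/sdb bs=1M count=1024 oflag=direct    # OK
dd if=/dev/zero    of=/dev/sdb bs=1G count=1 oflag=direct       # NOT OK with detect_zeroes=on
dd if=/dev/zero    of=/dev/sdb bs=1G count=1                    # NOT OK, even without O_DIRECT
dd if=/dev/urandom of=/tmp/rand.img bs=1M count=1024            # prepare a 1 GiB random source file
dd if=/tmp/rand.img of=/dev/sdb bs=1G count=1 oflag=direct      # OK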
#####
Provided by Florian Strankowski:
bs - count - io-threads
512K - 2048 - 2
1M - 1024 - 2
2M - 512 - 4
4M - 256 - 6
8M - 128 - 10
16M - 64 - 18
32M - 32 - uncountable
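One way to watch the worker threads pile up while such a run is in progress is to poll the QEMU process on the host. A minimal sketch; the pgrep pattern is an assumption, adjust it to the actual process name:
# on the host, while the guest runs dd
QEMU_PID=$(pgrep -f qemu-system-x86_64 | head -n1)
while sleep 1; do
    echo "$(date +%T)  threads: $(ls /proc/$QEMU_PID/task | wc -l)"
done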
Please refer to further information here:
https://bugzilla.proxmox.com/show_bug.cgi?id=1368
Sorry about the visibility settings, this bug tracker drives me nuts.
Just to make this clear: this bug affects only LVM-backed storage (both LVM-thin and LVM-thick); file-based storage is not affected.
Status changed to 'Confirmed' because the bug affects multiple users.
I'm unable to reproduce this issue. The host stays responsive and the dd command completes in a reasonable amount of time. QEMU does not exceed the 64-thread pool size.
Please post steps to reproduce the issue using a minimal command-line without libvirt.
Here is information on my attempt to reproduce the problem:
Guest: Kernel 4.10.8-200.fc25.x86_64
Host: 4.10.11-200.fc25.x86_64
QEMU: qemu.git/master (e619b14746e5d8c0e53061661fd0e1da01fd4d60)
The LV is 1 GB on top of LUKS on a Samsung MZNLN256HCHP SATA SSD drive.
mpstat -P ALL 5 output:
11:02:02 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
11:02:07 AM all 3.36 0.00 6.22 34.54 0.25 0.50 0.00 3.11 0.00 52.03
11:02:07 AM 0 2.82 0.00 5.63 32.39 0.80 1.21 0.00 3.22 0.00 53.92
11:02:07 AM 1 3.02 0.00 6.04 28.77 0.20 0.20 0.00 3.02 0.00 58.75
11:02:07 AM 2 3.56 0.00 7.71 44.27 0.20 0.40 0.00 2.37 0.00 41.50
11:02:07 AM 3 3.81 0.00 5.61 32.46 0.00 0.40 0.00 4.01 0.00 53.71
vmstat 5 output:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 1617404 6484 3541468 0 0 2145 84794 1976 8814 8 8 64 20 0
0 0 0 1619492 6484 3538592 0 0 613 69340 1518 7430 6 7 70 17 0
0 0 0 1618920 6484 3538680 0 0 280 75199 1421 6811 6 7 52 35 0
pidstat -v -p $PID_OF_QEMU 5 output:
11:01:08 AM UID PID threads fd-nr Command
11:02:03 AM 0 13043 67 37 qemu-system-x86
11:02:08 AM 0 13043 67 37 qemu-system-x86
11:02:13 AM 0 13043 67 37 qemu-system-x86
$ sudo x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 1024 -cpu host \
-device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 \
-drive file=test.img,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on \
-device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100 \
-drive file=/dev/path/to/testlv,if=none,id=drive-scsi1,format=raw,cache=none,aio=native,detect-zeroes=on \
-device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1,bootindex=101 \
-nographic
guest# dd if=/dev/zero of=/dev/sdb bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 15.0681 s, 71.3 MB/s
Please be so kind as to use a 6G LVM volume and run "dd if=/dev/zero of=/dev/sdb bs=3G count=2 oflag=direct". Please keep an eye on your processor usage in comparison to the threads created. It's harder to knock down an SSD-backed system than one with spinners.
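For reference, a test volume of that size could be created roughly like this; the volume group and LV names (vg0, testlv) are placeholders:
# on the host, assuming a volume group named vg0
lvcreate -L 6G -n testlv vg0
# attach /dev/vg0/testlv to the guest as in the command line above, with detect-zeroes=on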
After further investigation on IRC the following points were raised:
1. Non-vcpu threads in QEMU weren't being isolated. Libvirt can do this
using the <cputune> domain XML element (see the pinning sketch after
this list). The guest can create a high load if some QEMU threads are
unconstrained.
2. The wait% CPU stat was causing confusion. It's the idle time during
which synchronous I/O is pending. High wait% does not mean that the
system is under high CPU load. detect-zeroes=on can take a
synchronous I/O path even when aio=native is used, and this results
in wait% instead of idle%.
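As a rough illustration of point 1, the vCPUs and the emulator/worker threads can be pinned to separate host CPUs with virsh; this corresponds to the <cputune> element in the domain XML. The domain name (testvm) and the CPU lists are assumptions:
# pin the vCPUs to dedicated host CPUs ...
virsh vcpupin testvm --vcpu 0 --cpulist 2
virsh vcpupin testvm --vcpu 1 --cpulist 3
# ... and keep the emulator/worker threads on other CPUs
virsh emulatorpin testvm --cpulist 0-1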
I'm closing the bug.