Diffstat (limited to 'results/classifier/zero-shot/108/other/1701449')
-rw-r--r--  results/classifier/zero-shot/108/other/1701449  94
1 file changed, 94 insertions, 0 deletions
diff --git a/results/classifier/zero-shot/108/other/1701449 b/results/classifier/zero-shot/108/other/1701449
new file mode 100644
index 000000000..b939eaf5b
--- /dev/null
+++ b/results/classifier/zero-shot/108/other/1701449
@@ -0,0 +1,94 @@
+debug: 0.901
+graphic: 0.849
+device: 0.824
+performance: 0.821
+semantic: 0.788
+PID: 0.772
+other: 0.721
+files: 0.720
+KVM: 0.710
+socket: 0.604
+permissions: 0.604
+boot: 0.593
+vnc: 0.484
+network: 0.469
+
+high memory usage when using rbd with client caching
+
+Hi,
+we are experiencing quite high memory usage of a single qemu (used with KVM) process when using RBD with client caching as a disk backend. We are testing with 3GB-memory qemu virtual machines and a 128MB RBD client cache. When running 'fio' in the virtual machine, you can see that after some time the machine uses a lot more memory (RSS) on the hypervisor than it should. We have seen values of 250% memory overhead (on real production machines, not artificial fio tests). I reproduced this with qemu version 2.9 as well.
+
+Here are the contents of our ceph.conf on the hypervisor:
+"""
+[client]
+rbd cache writethrough until flush = False
+rbd cache max dirty = 100663296
+rbd cache size = 134217728
+rbd cache target dirty = 50331648
+"""
+
+How to reproduce:
+* create a virtual machine with an RBD-backed disk (100GB or so)
+* install a linux distribution on it (we are using Ubuntu)
+* install fio (apt-get install fio)
+* run fio multiple times with (e.g.) the following test file:
+"""
+# This job file tries to mimic the Intel IOMeter File Server Access Pattern
+[global]
+description=Emulation of Intel IOmeter File Server Access Pattern
+randrepeat=0
+filename=/root/test.dat
+# IOMeter defines the server loads as the following:
+# iodepth=1     Linear
+# iodepth=4     Very Light
+# iodepth=8     Light
+# iodepth=64    Moderate
+# iodepth=256   Heavy
+iodepth=8
+size=80g
+direct=0
+ioengine=libaio
+
+[iometer]
+stonewall
+bs=4M
+rw=randrw
+
+[iometer_just_write]
+stonewall
+bs=4M
+rw=write
+
+[iometer_just_read]
+stonewall
+bs=4M
+rw=read
+"""
+
+You can measure the virtual machine RSS usage on the hypervisor with:
+  virsh dommemstat <machine name> | grep rss
+or if you are not using libvirt:
+  grep RSS /proc/<PID of qemu process>/status
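+To watch the growth over time you could also run, for example:
+  watch -n 30 'grep VmRSS /proc/<PID of qemu process>/status'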
+
+When switching off the RBD client cache, everything is fine again: the process no longer uses that much memory.
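+
+For reference, one way to switch the cache off is to set the standard librbd option in the [client] section of ceph.conf on the hypervisor:
+"""
+rbd cache = False
+"""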
+
+There is already a ticket on the ceph bug tracker for this ([1]). However, I can reproduce that memory behaviour only when using qemu (maybe it uses librbd in a special way?). Running 'fio' directly with the rbd engine does not result in such high memory usage.
+
+[1] http://tracker.ceph.com/issues/20054
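+
+For comparison, a direct librbd run without qemu can be done with fio's rbd engine; a minimal job would look something like this (clientname, pool and rbdname are placeholders for whatever test image you use):
+"""
+[global]
+ioengine=rbd
+clientname=admin
+pool=rbd
+rbdname=fio-test
+bs=4M
+iodepth=8
+
+[rbd_iometer]
+rw=randrw
+"""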
+
+We are seeing pretty much the same issue, with even small (1G mem) virtual instances using 2-3GB of RSS after running I/O-intensive applications. Live migrating the instance to another machine pushes the memory usage back down, but it grows again once there is I/O load again.
+
+Any update on this?
+
+Linking back to bug 1674481, which I think is the same issue seen in Ubuntu.
+
+Is there any progress on solving this, or does anyone have an idea how to debug this further? I think we are kinda stuck in the ceph bug tracker issue as well [1].
+
+[1] http://tracker.ceph.com/issues/20054
+
+Any reason we are keeping this bug and #1674481 separate? We are not sure there is one.
+
+@Nick: if you can recreate the librbd memory growth, any chance you can help test a potential fix [1]?
+
+[1] https://github.com/ceph/ceph/pull/24297
+