Diffstat (limited to 'results/classifier/deepseek-1/output/hypervisor/1470720')
| -rw-r--r-- | results/classifier/deepseek-1/output/hypervisor/1470720 | 54 |
1 files changed, 54 insertions, 0 deletions
diff --git a/results/classifier/deepseek-1/output/hypervisor/1470720 b/results/classifier/deepseek-1/output/hypervisor/1470720
new file mode 100644
index 000000000..c48e7976d
--- /dev/null
+++ b/results/classifier/deepseek-1/output/hypervisor/1470720
@@ -0,0 +1,54 @@
+
+high IRQ-TLB generates network interruptions
+
+ we are having a problem in our hosts, all the vm running on them suddenly, and for some seconds, lost network connectivity.
+
+the root cause appears to be the increase of irb-tlb from low values (less than 20) to more than >100k, that spike only last for some seconds then everything goes back to normal
+
+i've upload an screenshot of collectd for one hypervisor here
+http://zumbi.com.ar/tmp/irq-tlb.png
+
+
+we have hosts running precise (qemu 1.5, ovs 2.0.2, libvirt 1.2.2 and kernel 3.13) where the issue is frequent. also we have an small % of our fleet running trusty (qemu 2.0.0 ovs 2.0.2 libvirt 1.2.2 and kernel 3.16) where the problem seemed to be nonexistent until today
+
+issue seems to be isolated to < 10% of our hypervisors, some hypervisors had this problem every few days, others only once or twice. our vm are a black box to us we don't know what users run on them, but mostly cpu and network bound workload.
+most of our guests run centos 6.5 (kernel 2.6.32)
+
+vm are bridged to a linuxbridge then veth wired to an ovs switch (neutron openvswitch agent setup)
+
+
+
+maybe first part is not clear, here it goes again
+
+ this happens on some hypervisors at random times, not all hypervisors at the same time, and affects all vm on the hypervisor
+
+overcommit ratio on latest server i had the problem is 3.6 (3.6 vcpu for each cpu), would that be part of the problem? i see other servers that never had the problem with over commit ratios as high as 4.1
+
+Seeing the same here, also happens on overbooked hypervisors.
+
+Just one or two hosts have this behaviour.
+
+We are using:
+qemu-kvm 2.0.0+dfsg-2ubuntu1.25
+libvirt-bin 1.2.9
+kernel 3.13.0-92-generic
+
+We are using contrail as a SDN.
+
+It looks like it started after upgrading a bunch of packages including kernel (we came from 3.13.0-83-generic)
+
+
+Disabling huge pages seem to help.
+Strangely this should theoretically increase the issue but it so far we have not seen issues after disabling THP.
+(have not seen high load spikes in a week but this might also be holiday related)
+
+So other people can try it out:
+echo never >/sys/kernel/mm/transparent_hugepage/defrag
+echo never > /sys/kernel/mm/transparent_hugepage/enabled
+
+
+
+Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays?
+
+[Expired for QEMU because there has been no activity for 60 days.]
+
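Note on the THP workaround quoted in the report: before and after running the two `echo never` commands, it may help to confirm which setting is actually active. The kernel marks the active value in these sysfs files with square brackets (for example `always madvise [never]`). The sketch below is not part of the original report; the function names `thp_active` and `thp_report` are hypothetical helpers introduced only for illustration, and it assumes a POSIX shell with `sed` available.

```shell
#!/bin/sh
# Print the active value from a THP sysfs setting line, i.e. the word
# inside square brackets: "always madvise [never]" -> "never".
# (thp_active is a hypothetical helper name, not an existing tool.)
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active value of the two knobs touched by the workaround.
# Knobs whose sysfs file is missing or unreadable are silently skipped.
thp_report() {
    for knob in enabled defrag; do
        f="/sys/kernel/mm/transparent_hugepage/$knob"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$knob" "$(thp_active "$(cat "$f")")"
        fi
    done
    return 0
}
```

Calling `thp_report` on a host prints one line per knob, so the state can be recorded before applying the workaround and checked again afterwards. Settings written with `echo` do not persist across reboots, so a check like this can also show whether a host has silently reverted.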