graphic: 0.921 hypervisor: 0.872 performance: 0.792 network: 0.704 architecture: 0.606 virtual: 0.585 KVM: 0.531 kernel: 0.517 ppc: 0.507 semantic: 0.500 device: 0.498 assembly: 0.480 peripherals: 0.477 debug: 0.462 i386: 0.387 x86: 0.380 files: 0.366 vnc: 0.354 PID: 0.347 register: 0.333 user-level: 0.312 mistranslation: 0.269 socket: 0.267 arm: 0.267 risc-v: 0.238 permissions: 0.209 VMM: 0.195 TCG: 0.162 boot: 0.140 high IRQ-TLB generates network interruptions we are having a problem in our hosts, all the vm running on them suddenly, and for some seconds, lost network connectivity. the root cause appears to be the increase of irb-tlb from low values (less than 20) to more than >100k, that spike only last for some seconds then everything goes back to normal i've upload an screenshot of collectd for one hypervisor here http://zumbi.com.ar/tmp/irq-tlb.png we have hosts running precise (qemu 1.5, ovs 2.0.2, libvirt 1.2.2 and kernel 3.13) where the issue is frequent. also we have an small % of our fleet running trusty (qemu 2.0.0 ovs 2.0.2 libvirt 1.2.2 and kernel 3.16) where the problem seemed to be nonexistent until today issue seems to be isolated to < 10% of our hypervisors, some hypervisors had this problem every few days, others only once or twice. our vm are a black box to us we don't know what users run on them, but mostly cpu and network bound workload. most of our guests run centos 6.5 (kernel 2.6.32) vm are bridged to a linuxbridge then veth wired to an ovs switch (neutron openvswitch agent setup) maybe first part is not clear, here it goes again this happens on some hypervisors at random times, not all hypervisors at the same time, and affects all vm on the hypervisor overcommit ratio on latest server i had the problem is 3.6 (3.6 vcpu for each cpu), would that be part of the problem? i see other servers that never had the problem with over commit ratios as high as 4.1 Seeing the same here, also happens on overbooked hypervisors. Just one or two hosts have this behaviour. We are using: qemu-kvm 2.0.0+dfsg-2ubuntu1.25 libvirt-bin 1.2.9 kernel 3.13.0-92-generic We are using contrail as a SDN. It looks like it started after upgrading a bunch of packages including kernel (we came from 3.13.0-83-generic) Disabling huge pages seem to help. Strangely this should theoretically increase the issue but it so far we have not seen issues after disabling THP. (have not seen high load spikes in a week but this might also be holiday related) So other people can try it out: echo never >/sys/kernel/mm/transparent_hugepage/defrag echo never > /sys/kernel/mm/transparent_hugepage/enabled Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? [Expired for QEMU because there has been no activity for 60 days.]