1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
|
other: 0.852
KVM: 0.836
vnc: 0.820
graphic: 0.816
permissions: 0.787
device: 0.785
network: 0.780
performance: 0.776
debug: 0.772
semantic: 0.768
socket: 0.736
PID: 0.712
files: 0.698
boot: 0.676
User mode networking SLIRP rapid memory leak
QEMU compiled from git HEAD at 2d03b49c3f225994c4b0b46146437d8c887d6774 and reproducible at tag v2.0.0. I first noticed this bug using Ubuntu Trusty's QEMU 2.0.0~rc1. I used to run QEMU 1.7 without this problem.
This is the command I ran:
qemu-system-x86_64 -enable-kvm -smp 2 -m 1G -usbdevice tablet -net nic,model=e1000 -net user -vnc localhost:99 -drive if=ide,file=test.img,cache=none -net nic -net user,tftp=/tmp/tftpdata -no-reboot
The guest is Windows 7 64-bit. The VM starts off normally, but after a couple of minutes, the memory usage starts to swell. If let running, it eventually consumes all host memory and grinds the host to a halt due to heavy swapping.
When running under gdb, I set a breakpoint on mmap, and this is the stack trace I obtained.
Breakpoint 1, mmap64 () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb) where
#0 mmap64 () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff0e65091 in new_heap (size=135168, size@entry=1728,
top_pad=<optimized out>) at arena.c:554
#2 0x00007ffff0e687b2 in sysmalloc (av=0x7fffd0000020, nb=1664)
at malloc.c:2386
#3 _int_malloc (av=0x7fffd0000020, bytes=1650) at malloc.c:3740
#4 0x00007ffff0e69f50 in __GI___libc_malloc (bytes=1650) at malloc.c:2855
#5 0x00005555557a091a in m_get (slirp=0x5555561fe960)
at /src/qemu/slirp/mbuf.c:73
#6 0x00005555557a3151 in slirp_input (slirp=0x5555561fe960,
pkt=0x7ffff7e94b20 "RU\n", pkt_len=<optimized out>)
at /src/qemu/slirp/slirp.c:747
#7 0x0000555555758b24 in net_slirp_receive (nc=<optimized out>,
buf=<optimized out>, size=54) at /src/qemu/net/slirp.c:113
#8 0x00005555557567d1 in qemu_deliver_packet (sender=<optimized out>,
flags=<optimized out>, data=<optimized out>, size=<optimized out>,
opaque=<optimized out>) at /src/qemu/net/net.c:471
#9 0x00005555557588d3 in qemu_net_queue_deliver (size=54,
data=0x7ffff7e94b20 "RU\n", flags=0, sender=0x5555561fe5e0,
queue=0x5555561fe1d0) at /src/qemu/net/queue.c:157
#10 qemu_net_queue_send (queue=0x5555561fe1d0, sender=0x5555561fe5e0, flags=0,
data=0x7ffff7e94b20 "RU\n", size=54, sent_cb=<optimized out>)
at /src/qemu/net/queue.c:192
---Type <return> to continue, or q <return> to quit---
#11 0x000055555575536b in net_hub_receive (len=54, buf=0x7ffff7e94b20 "RU\n",
source_port=0x5555561fe310, hub=<optimized out>) at /src/qemu/net/hub.c:55
#12 net_hub_port_receive (nc=0x5555561fe310, buf=0x7ffff7e94b20 "RU\n", len=54)
at /src/qemu/net/hub.c:114
#13 0x00005555557567d1 in qemu_deliver_packet (sender=<optimized out>,
flags=<optimized out>, data=<optimized out>, size=<optimized out>,
opaque=<optimized out>) at /src/qemu/net/net.c:471
#14 0x00005555557588d3 in qemu_net_queue_deliver (size=54,
data=0x7ffff7e94b20 "RU\n", flags=0, sender=0x555556531920,
queue=0x5555561fe090) at /src/qemu/net/queue.c:157
#15 qemu_net_queue_send (queue=0x5555561fe090, sender=0x555556531920, flags=0,
data=0x7ffff7e94b20 "RU\n", size=54, sent_cb=<optimized out>)
at /src/qemu/net/queue.c:192
#16 0x00005555556db95d in xmit_seg (s=0x7ffff7e72010)
at /src/qemu/hw/net/e1000.c:628
#17 0x00005555556dbd38 in process_tx_desc (dp=0x7fffdf7fda30, s=0x7ffff7e72010)
at /src/qemu/hw/net/e1000.c:723
#18 start_xmit (s=0x7ffff7e72010) at /src/qemu/hw/net/e1000.c:778
#19 set_tctl (s=0x7ffff7e72010, index=<optimized out>, val=<optimized out>)
at /src/qemu/hw/net/e1000.c:1142
#20 0x0000555555840fb0 in access_with_adjusted_size (addr=14360,
value=0x7fffdf7fdb10, size=4, access_size_min=<optimized out>,
access_size_max=<optimized out>,
---Type <return> to continue, or q <return> to quit---
access=0x555555841160 <memory_region_write_accessor>, mr=0x7ffff7e747c0)
at /src/qemu/memory.c:478
#21 0x00005555558462fe in memory_region_dispatch_write (size=4, data=454,
addr=14360, mr=0x7ffff7e747c0) at /src/qemu/memory.c:990
#22 io_mem_write (mr=0x7ffff7e747c0, addr=14360, val=<optimized out>, size=4)
at /src/qemu/memory.c:1744
#23 0x00005555557e8717 in address_space_rw (
as=0x555556159c80 <address_space_memory>, addr=4273485848,
buf=0x7ffff7fed028 "\306\001", len=4, is_write=true)
at /src/qemu/exec.c:2034
#24 0x000055555583ff65 in kvm_cpu_exec (cpu=<optimized out>)
at /src/qemu/kvm-all.c:1704
#25 0x00005555557ddb6c in qemu_kvm_cpu_thread_fn (arg=0x55555651b730)
at /src/qemu/cpus.c:873
#26 0x00007ffff11b6182 in start_thread (arg=0x7fffdf7fe700)
at pthread_create.c:312
#27 0x00007ffff0ee1b2d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Let me know if you have any questions. Thanks.
liulk
I investigated further and found that a program in guest (jusched.exe Java Updater) is simultaneously sending and receiving network packets rapidly. This is what exacerbates the memory leak.
When the mmap breakpoint triggers, I now set additional breakpoints in m_get() and m_free() and found that the number of calls to these functions do not balance, hence making the leak evident.
Breakpoint 1, mmap64 () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb) info break
Num Type Disp Enb Address What
1 breakpoint keep y 0x00007ffff0edbfb0 ../sysdeps/unix/syscall-template.S:81
breakpoint already hit 6 times
2 breakpoint keep y 0x0000555555848dfa in m_get
at /src/qemu/slirp/mbuf.c:66
breakpoint already hit 645487 times
ignore next 354513 hits
3 breakpoint keep y 0x0000555555848eff in m_free
at /src/qemu/slirp/mbuf.c:103
breakpoint already hit 484477 times
ignore next 515523 hits
About 25% of the m_get() do not get m_free()'d.
liulk
I also noticed that the command I ran is causing this bug to happen. I had accidentally repeated -net nic twice, so there are two -net user network interfaces. Removing one of them makes the problem go away.
liulk
On Mon, Apr 21, 2014 at 08:53:43PM -0000, Likai Liu wrote:
> I also noticed that the command I ran is causing this bug to happen. I
> had accidentally repeated -net nic twice, so there are two -net user
> network interfaces. Removing one of them makes the problem go away.
This is still a bug, the packets should be freed even with 2 -net user.
Stefan
Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays?
I believe this is the issue i am seeing as well. QEMU continues to consume memory (>6GB) until the host machine runs out of memory (I am also using slirp for networking). I am able to reproduce it from qemu 2.5 (version found in apt-get for ubuntu 16.04.3 LTS) and I verified it exists after I compiled from source both 2.8.1.1 and 2.10.2 as well.
$ uname -a
Linux siemens 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ cat qemu.cfg
# qemu config file
[drive]
format = "raw"
file = "qemu_rootfs_512.img"
[drive]
format = "raw"
file = "hmi_sl_oa.img"
[drive]
format = "raw"
file = "swap_512m.img"
[net]
type = "nic"
[net]
type = "user"
[machine]
kernel = "vmlinuz-2.6.11.12-vanilla.bz"
initrd = "initrd-2.6.11.12.img.gz"
append = "root=/dev/hda"
[memory]
size = "2048"
[vnc "default"]
vnc = ":1"
$ qemu-system-x86_64 -readconfig qemu.cfg
Let me know what other information you need.
The QEMU project is currently considering to move its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.
If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.
Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]
|