COLO's guest VNC client hang after failover
Hello,
After setting up COLO's primary and secondary VMs,
I installed vncserver and xrdp (apt install tightvncserver xrdp) inside the VM.
I access the VM from another PC via a VNC/RDP client, and everything works.
Then I kill the primary VM and issue the failover commands.
The expected result is that the VNC/RDP client can resume after the failover.
But in my test, the VNC client's screen hangs and cannot be recovered.
BTW, it works well after killing the SVM.
Thanks.
Regards,
Derek
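For reference, the guest-side setup described above roughly corresponds to the following commands. This is only a sketch: the display number, listening ports, and the use of systemd for xrdp are assumptions, not details taken from the report.
```
# Inside the guest (sketch): install and start the VNC and RDP servers
# mentioned above. Display :1 and the xrdp service name are assumptions.
sudo apt install tightvncserver xrdp
vncserver :1                        # TightVNC on display :1 (TCP 5901)
sudo systemctl enable --now xrdp    # RDP on TCP 3389
```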
On Tue, 08 Sep 2020 10:25:52 -0000
Launchpad Bug Tracker <email address hidden> wrote:
> You have been subscribed to a public bug by Derek Su (dereksu):
>
> Hello,
>
> After setting up COLO's primary and secondary VMs,
> I installed the vncserver and xrdp (apt install tightvncserver xrdp) inside the VM.
>
> I access the VM from another PC via VNC/RDP client, and everything is OK.
> Then, kill the primary VM and issue the failover commands.
>
> The expected result is that the VNC/RDP client can reconnect and resume
> automatically after failover. (I've confirmed the VNC/RDP client can
> reconnect automatically.)
>
> But in my test, the VNC client's screen hangs and cannot be recovered.
> (I need to restart the VNC client myself.)
>
> BTW, it works well after killing SVM.
>
> Here is my QEMU networking device
> ```
> -device virtio-net-pci,id=e0,netdev=hn0 \
> -netdev tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
> ```
>
> Thanks.
>
> Regards,
> Derek
>
> ** Affects: qemu
> Importance: Undecided
> Status: New
>
Hello,
Can you show the full qemu command line?
Regards,
Lukas Straub
Hi, Lukas
It is caused by the advanced watchdog (AWD) feature rather than by COLO itself.
I will check whether it is a misuse on my side, thanks.
Best regards,
Derek
Lukas Straub <email address hidden> wrote on Tue, 8 Sep 2020 at 20:30:
> Hello,
> Can you show the full qemu command line?
>
> Regards,
> Lukas Straub
Hi, Lukas
After fixing the misuse of AWD, there is still a high probability that the VNC/RDP client hangs after the PVM dies and the SVM takes over.
Here are the steps to reproduce:
1. Start PVM script
```
imagefolder="/mnt/nfs2/vms"
qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 4096 -smp 2 -qmp stdio \
-device piix3-usb-uhci -device usb-tablet -name primary \
-device virtio-net-pci,id=e0,netdev=hn0 \
-netdev tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
-chardev socket,id=mirror0,host=0.0.0.0,port=9003,server,nowait \
-chardev socket,id=compare1,host=0.0.0.0,port=9004,server,wait \
-chardev socket,id=compare0,host=127.0.0.1,port=9001,server,nowait \
-chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
-chardev socket,id=compare_out,host=127.0.0.1,port=9005,server,nowait \
-chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
-object iothread,id=iothread1 \
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
outdev=compare_out0,iothread=iothread1 \
-drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -vnc :0 -S
```
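Before bringing up the secondary side, it can help to confirm that the server chardev sockets created by the script above are actually listening. A minimal sketch, assuming ss is available on the primary host; the ports come from the chardev definitions above:
```
# Sketch: the PVM script listens on 9003 (mirror0), 9004 (compare1),
# 9001 (compare0) and 9005 (compare_out).
ss -ltn | grep -E ':900[1345]'
```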
2. Start SVM script
```
#!/bin/bash
imagefolder="/mnt/nfs2/vms"
primary_ip=127.0.0.1
qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 100G
qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 100G
qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 4096 -smp 2 -qmp stdio \
-device piix3-usb-uhci -device usb-tablet -name secondary \
-device virtio-net-pci,id=e0,netdev=hn0 \
-netdev tap,id=hn0,br=br0,vhost=off,helper=/usr/local/libexec/qemu-bridge-helper \
-chardev socket,id=red0,host=$primary_ip,port=9003,reconnect=1 \
-chardev socket,id=red1,host=$primary_ip,port=9004,reconnect=1 \
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
-object filter-rewriter,id=rew0,netdev=hn0,queue=all \
-drive if=none,id=parent0,file.filename=$imagefolder/secondary.qcow2,driver=qcow2 \
-drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
file.backing.backing=parent0 \
-drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0=childs0 \
-vnc :1 \
-incoming tcp:0.0.0.0:9998
```
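Note that this script only creates the active and hidden images; it assumes $imagefolder/secondary.qcow2 (attached as parent0) already exists. A sketch of one way to prepare it, assuming the secondary disk starts as a copy of the primary image (this preparation step is an assumption, it is not spelled out in the thread):
```
# Sketch (assumption): provide the secondary base image referenced as parent0,
# e.g. by copying the primary disk before the secondary is started.
cp $imagefolder/primary.qcow2 $imagefolder/secondary.qcow2
```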
3. On the secondary VM's QEMU monitor, issue the following commands:
```
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': '0.0.0.0', 'port': '9999'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': true } }
```
4. On the primary VM's QEMU monitor, issue the following commands:
```
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 'replication0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }
```
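After the migrate command, both sides should be running in COLO mode. A sketch of how this can be checked from the same QMP monitors; query-colo-status exists in recent QEMU versions, but treat its exact output fields as version-dependent:
```
{'execute': 'query-status'}
{'execute': 'query-colo-status'}
```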
5. Kill the PVM.
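One way to kill the PVM is to hard-kill its QEMU process on the primary host; the name match below is an assumption based on "-name primary" in the PVM script above:
```
# Sketch: simulate a primary crash by killing the PVM's QEMU process.
pkill -9 -f 'qemu-system-x86_64.*-name primary'
```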
6. On the SVM, issue:
```
{'execute': 'nbd-server-stop'}
{'execute': 'x-colo-lost-heartbeat'}
{'execute': 'object-del', 'arguments':{ 'id': 'f2' } }
{'execute': 'object-del', 'arguments':{ 'id': 'f1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red0' } }
```
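For comparison, the failover for the case that does work (killing the SVM) is issued on the PVM instead. The following is a sketch based on the object and chardev ids defined in the PVM script above and on the usual COLO primary-failover sequence; the exact ordering may differ between QEMU versions:
```
{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
{'execute': 'x-colo-lost-heartbeat'}
```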
I use "-device virtio-net-pci" here, but after replacing with "-device rtl8139",
the behavior seems normal.
Is "-device virtio-net-pci" available in COLO?
Thanks.
Regards,
Derek
Hi,
I also tested several emulated NIC devices and virtio network devices (results in the attachment).
The VNC client's screen cannot be recovered with any of the virtio network devices, nor with the emulated e1000e NIC.
Thanks.
Regards,
Derek
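A sketch of how such a sweep can be scripted; the model list is illustrative (only rtl8139, e1000e and virtio-net-pci are named in this thread), and start-pvm.sh is a hypothetical wrapper around the PVM command line above with its -device line replaced by "-device $NIC_MODEL,id=e0,netdev=hn0":
```
# Sketch: run one COLO test per NIC model via a hypothetical wrapper script.
for NIC_MODEL in rtl8139 e1000e virtio-net-pci; do
    echo "=== testing NIC model: $NIC_MODEL ==="
    NIC_MODEL="$NIC_MODEL" ./start-pvm.sh
done
```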
The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.
If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".
If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:
1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:
https://gitlab.com/qemu-project/qemu/-/issues
and then close this ticket here on Launchpad (or let it expire automatically
after 60 days). Please mention the URL of this bug ticket on Launchpad in the
new ticket on GitLab.
2) If you don't have an account on gitlab.com and don't intend to get one,
but still would like to keep this ticket open, then please switch the state
back to "New" or "Confirmed" within the next 60 days (otherwise it will get
closed as "Expired"). We will then eventually migrate the ticket automatically
to the new system (but you won't be the reporter of the bug in the new system
and thus you won't get notified about changes anymore).
Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]