1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
|
KVM: 0.867
mistranslation: 0.843
vnc: 0.798
graphic: 0.730
other: 0.709
instruction: 0.708
device: 0.684
boot: 0.653
semantic: 0.633
assembly: 0.612
network: 0.598
socket: 0.582
KVM on 16.04.3 throws an error
Problem Description
====================
KVM on Ubuntu 16.04.3 throws an error when used
---uname output---
Linux bastion-1 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:37:08 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = 8348-21C Habanero
---Steps to Reproduce---
Install 16.04.3
install KVM like:
apt-get install libvirt-bin qemu qemu-slof qemu-system qemu-utils
then exit and log back in so virsh will work without sudo
then run my spawn script
$ cat spawn.sh
#!/bin/bash
img=$1
qemu-system-ppc64 \
-machine pseries,accel=kvm,usb=off -cpu host -m 512 \
-display none -nographic \
-net nic -net user \
-drive "file=$img"
with a freshly downloaded ubuntu cloud image
sudo ./spawn.sh xenial-server-cloudimg-ppc64el-disk1.img
And I get nothing on the output.
and errors in dmesg
ubuntu@bastion-1:~$ [ 340.180295] Facility 'TM' unavailable, exception at 0xd0000000148b7f10, MSR=9000000000009033
[ 340.180399] Oops: Unexpected facility unavailable exception, sig: 6 [#1]
[ 340.180513] SMP NR_CPUS=2048 NUMA PowerNV
[ 340.180547] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm binfmt_misc joydev input_leds mac_hid opal_prd ofpart cmdlinepart powernv_flash ipmi_powernv ipmi_msghandler mtd at24 uio_pdrv_genirq uio ibmpowernv powernv_rng vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear mlx4_en hid_generic usbhid hid uas usb_storage ast i2c_algo_bit bnx2x ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx4_core drm ahci vxlan libahci ip6_udp_tunnel udp_tunnel mdio libcrc32c
[ 340.181331] CPU: 46 PID: 5252 Comm: qemu-system-ppc Not tainted 4.4.0-89-generic #112-Ubuntu
[ 340.181382] task: c000001e34c30b50 ti: c000001e34ce4000 task.ti: c000001e34ce4000
[ 340.181432] NIP: d0000000148b7f10 LR: d000000014822a14 CTR: d0000000148b7e40
[ 340.181475] REGS: c000001e34ce77b0 TRAP: 0f60 Not tainted (4.4.0-89-generic)
[ 340.181519] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22024848 XER: 00000000
[ 340.181629] CFAR: d0000000148b7ea4 SOFTE: 1
GPR00: d000000014822a14 c000001e34ce7a30 d0000000148cc018 c000001e37bc0000
GPR04: c000001db9ac0000 c000001e34ce7bc0 0000000000000000 0000000000000000
GPR08: 0000000000000001 c000001e34c30b50 0000000000000001 d0000000148278f8
GPR12: d0000000148b7e40 c00000000fb5b500 0000000000000000 000000000000001f
GPR16: 00003fff91c30000 0000000000800000 00003fffa8e34390 00003fff9242f200
GPR20: 00003fff92430010 000001001de5c030 00003fff9242eb60 00000000100c1ff0
GPR24: 00003fffc91fe990 00003fff91c10028 0000000000000000 c000001e37bc0000
GPR28: 0000000000000000 c000001db9ac0000 c000001e37bc0000 c000001db9ac0000
[ 340.182315] NIP [d0000000148b7f10] kvmppc_vcpu_run_hv+0xd0/0xff0 [kvm_hv]
[ 340.182357] LR [d000000014822a14] kvmppc_vcpu_run+0x44/0x60 [kvm]
[ 340.182394] Call Trace:
[ 340.182413] [c000001e34ce7a30] [c000001e34ce7ab0] 0xc000001e34ce7ab0 (unreliable)
[ 340.182468] [c000001e34ce7b70] [d000000014822a14] kvmppc_vcpu_run+0x44/0x60 [kvm]
[ 340.182522] [c000001e34ce7ba0] [d00000001481f674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm]
[ 340.182581] [c000001e34ce7be0] [d000000014813918] kvm_vcpu_ioctl+0x528/0x7b0 [kvm]
[ 340.182634] [c000001e34ce7d40] [c0000000002fffa0] do_vfs_ioctl+0x480/0x7d0
[ 340.182678] [c000001e34ce7de0] [c0000000003003c4] SyS_ioctl+0xd4/0xf0
[ 340.182723] [c000001e34ce7e30] [c000000000009204] system_call+0x38/0xb4
[ 340.182766] Instruction dump:
[ 340.182788] e92d02a0 e9290a50 e9290108 792a07e3 41820058 e92d02a0 e9290a50 e9290108
[ 340.182863] 7927e8a4 78e71f87 40820ed8 e92d02a0 <7d4022a6> f9490ee8 e92d02a0 7d4122a6
[ 340.182938] ---[ end trace bc5080cb7d18f102 ]---
[ 340.276202]
This was with the latest ubuntu cloud image. I get the same thing when trying to use virt-install with an ISO image.
I have no way of loading a KVM on 16.04.3
== Comment: #2 - Jason M. Furmanek <email address hidden> - 2017-08-09 17:42:34 ==
I reinstalled with the HWE kernel (4.10).
I can install VM and see the console and eveything seems fine.
== Comment: #3 - Jason M. Furmanek <email address hidden> - 2017-08-09 17:44:03 ==
I had another system at 16.04.2 (4.4) and updated that one to the latest and it hit the same issue as above.
No qemu or libvirt updates were applied. Just kernel updates and a handful of other stuff.
Seems this issue is specific to the latest kernel
old version worked:
Linux fs7 4.4.0-83-generic #106
new version did not:
Linux fs7 4.4.0-89-generic #112-Ubuntu
== Comment: #4 - Gustavo Bueno Romero <email address hidden> - 2017-08-09 20:26:42 ==
Looks like 46a704f8409f79fd66567ad3f8a7304830a84293 was backported on 88 but e47057151422a67ce08747176fa21cb3b526a2c9 was not:
[gromero@localhost ubuntu-xenial]$ git remote -vv
origin git://kernel.ubuntu.com/ubuntu/ubuntu-xenial.git (fetch)
origin git://kernel.ubuntu.com/ubuntu/ubuntu-xenial.git (push)
[gromero@localhost ubuntu-xenial]$ git log Ubuntu-4.4.0-83.106..Ubuntu-4.4.0-89.112 --oneline | fgrep "Preserve userspace HTM state properly"
a97e978 KVM: PPC: Book3S HV: Preserve userspace HTM state properly
[gromero@localhost ubuntu-xenial]$ git log Ubuntu-4.4.0-83.106..Ubuntu-4.4.0-89.112 --oneline | fgrep "Enable TM before accessing TM registers"
[gromero@localhost ubuntu-xenial]$ git tag --contains a97e978574f41ffcf1813c180aba2772d46fbb5b
Ubuntu-4.4.0-88.111
Ubuntu-4.4.0-89.112
Ubuntu-raspi2-4.4.0-1066.74
Ubuntu-raspi2-4.4.0-1067.75
Ubuntu-snapdragon-4.4.0-1068.73
Ubuntu-snapdragon-4.4.0-1069.74
So
https://github.com/torvalds/linux/commit/46a704f8409f79fd66567ad3f8a7304830a84293 is present in ISO but https://github.com/torvalds/linux/commit/e47057151422a67ce08747176fa21cb3b526a2c9 is not.
Hi,
there was a related case in bug 1664622.
The TL;DR of this was:
1. even with all qemu patches identified by IBM and upstream while working on this the issue still showed up.
2. newer kernels (>=4.8, but also newer 4.4 kernels than those on the release itself worked - that matches the report here)
3. the conclusion was that nothing can/has-to be done for qemu [1]
4. there was no way to fix the initial 16.04.0 iso, but that was considered ok since 16.04.2 worked
Mapping to this bug here it seems that a further update to 4.4 might have broken it "again" as identified by Gustavo in in this report.
I added all this to complete the context for everybody else, but for now qemu stays at won't fix for the reasons and assumptions haven't changed since our last check on this issue.
On the kernel surely if the identified two patches really need to go together a fix might be needed, but that is for the kernel Team to evaluate.
[1]: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1664622/comments/27
Thanks Christian. Reassigning to kernel team.
I built test kernels with a pick of commit e47057151422. The test kernels can be downloaded from:
Xenial: http://kernel.ubuntu.com/~jsalisbury/lp1709784/xenial/
Zesty: http://kernel.ubuntu.com/~jsalisbury/lp1709784/zesty/
Can you test these kernels and see if they resolve this bug?
Thanks in advance!
I just tested zesty kernel and it worked fine:
# uname -a
Linux sid 4.10.0-30-generic #34~lp1709784 SMP Thu Aug 10 20:01:17 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
# sudo virsh list
Id Name State
----------------------------------------------------
1 unstable running
2 debian running
4 1604 running
# sudo virsh console 1604
Connected to domain 1604
Escape character is ^]
Ubuntu 16.04.3 LTS 1604 hvc0
1604 login:
I also tested xenial GA kernel:
sid ➜ ~ sudo uname -a
Linux sid 4.4.0-89-generic #112~lp1709784 SMP Thu Aug 10 19:39:43 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
sid ➜ ~ sudo virsh list
Id Name State
----------------------------------------------------
2 debian running
3 1604 running
4 unstable running
sid ➜ ~ sudo virsh console 1604
Connected to domain 1604
Escape character is ^]
Ubuntu 16.04.3 LTS 1604 hvc0
1604 login:
Thanks for testing. I'll submit SRU requests for both Zesty and Xenial.
There is commit 17d381054b1d6f4adc3db623b2066fff41b4dc1a ("KVM: PPC: Book3S HV: Reload HTM registers explicitly") from
v4.4.80 series that is in Xenial master-next. I built a master-next test kernel, which can be downloaded from:
Xenial: http://kernel.ubuntu.com/~jsalisbury/lp1709784/xenial/
Can you test these kernels and see if they resolve this bug?
Thanks in advance!
I ran into this issue as well, and confirmed with the kernel built above (http://kernel.ubuntu.com/~jsalisbury/lp1709784/xenial/) that things now work again. Thanks!
------- Comment From <email address hidden> 2017-09-01 00:40 EDT-------
I tested the test kernel made available by jsalisbury and it worked for me as well.
This is a pretty big blocker - very much hoping this makes it into a new kernel soon.
@furmanek, The fix is in the Ubuntu-4.4.0-94 kernel, which is the current -proposed kernel.
Would it be possible for you to test the proposed kernel and post back if it resolves this bug?
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.
------- Comment From <email address hidden> 2017-09-04 19:23 EDT-------
Hi jsalisbury,
linux-image-4.4.0-94-generic from -proposed looks fine to me. VMs boot ok and as expected no TRAP 0xF60 (Facility Unavailable) in dmesg:
root@sid:~# apt-get -t xenial-proposed install linux-image-4.4.0-94-generic
<REBOOT>
root@sid:~# uname -a
Linux sid 4.4.0-94-generic #117-Ubuntu SMP Tue Aug 29 08:13:18 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
root@sid:/# dmesg | fgrep 0f60
root@sid:/# virsh list
Id Name State
----------------------------------------------------
1 debian running
2 1604 running
4 unstable running
root@sid:/# virsh console 2
Connected to domain 1604
Escape character is ^]
Ubuntu 16.04.3 LTS 1604 hvc0
1604 login: ubuntu
Password:
Last login: Mon Aug 14 15:41:29 EDT 2017 from 10.1.0.1 on pts/0
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.10.0-30-generic ppc64le)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
You have new mail.
1604 ? ~
I understand that no further action from IBM is necessary now. Please, let me know if it's not true.
BTW, do you know when this will likely reach the SRU?
Thanks a lot and best regards,
Gustavo
------- Comment From <email address hidden> 2017-09-04 20:46 EDT-------
Only to let you know that I just tested this version successfully as well.
Thanks!
Jose Ziviani
There seems to be an update-regression in Artful.
Booting a recent Artful guest now again also throws:
[485006.367929] Facility 'TM' unavailable, exception at 0xd000000021617f10, MSR=9000000000009033
[485006.367943] Oops: Unexpected facility unavailable exception, sig: 6 [#14]
[485006.367945] SMP NR_CPUS=2048 NUMA PowerNV
Booting a Zesty guest triggers the same.
Guest kernel(s): 4.12.0.13.14 / 4.10.0.33.33
Host kernel: 4.4.0-93-generic
I thought this only triggers if the guest kernel versions is a bad one.
It seems the host kernel (-93 as shown above) also needing to upgrade to avoid the issue?
Ok that makes it worse than I thought - can't reboot to test from proposed atm.
------- Comment From <email address hidden> 2017-09-12 04:23 EDT-------
*** Bug 158561 has been marked as a duplicate of this bug. ***
Affects OpenStack on ppc64el. Marking other bug as duplicate (it has logs/attachments fyi).
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1716469
Confirmed workaround for OpenStack Ocata on Xenial:
### Nova compute host [hwe-edge kernel via MAAS]
ubuntu@node-mawhile:~$ uname -a
Linux node-mawhile 4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
### Nova guest [stock xenial ppc64el cloud image]
[ 0.000000] Linux version 4.4.0-65-generic (buildd@bos01-ppc64el-028) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:48:50 UTC 2017 (Ubuntu 4.4.0-65.86-gene
ric 4.4.49)
Christian,
I am looking at 4.4 git repository and version -93 contains exactly the problem described here, it contains the commit# 46a704f8409f79fd66567ad3f8a7304830a84293 (as a97e978574f41ffcf1813c180aba2772d46fbb5b), but it does not include commit# e47057151422a67ce08747176fa21cb3b526a2c9.
That said, we probably want to include e47057151422a67ce08747176fa21cb3b526a2c9 into 4.4 series as soon as possible.
------- Comment From <email address hidden> 2017-09-19 16:53 EDT-------
I've tested the new kernel 4.4.0-96 and it fixes this issue.
Thank you!
@furmanek, Thanks for testing.
@Breno, Commit e47057151 was not applied to Xenial. It did not apply cleanly because the following commit is already in Xenial as of 4.4.0-96(As Xenial SHA1 25e0720):
17d381054b1d6f4adc3db623b2066fff41b4dc1a
("KVM: PPC: Book3S HV: Reload HTM registers explicitly")
Commit 17d381054 came in from the 4.4.80 updates and does what commit e47057151 does and more. Per comment #18, it sounds like it fixes this bug. It would be great if others affected by this bug could also test -proposed.
------- Comment From <email address hidden> 2017-09-20 08:53 EDT-------
> @Breno, Commit e47057151 was not applied to Xenial. It did not apply
> cleanly because the following commit is already in Xenial as of 4.4.0-96(As
> Xenial SHA1 25e0720):
>
> 17d381054b1d6f4adc3db623b2066fff41b4dc1a
> ("KVM: PPC: Book3S HV: Reload HTM registers explicitly")
>
> Commit 17d381054 came in from the 4.4.80 updates and does what commit
> e47057151 does and more. Per comment #18, it sounds like it fixes this bug.
> It would be great if others affected by this bug could also test -proposed.
Sure. We will let you know if we hit this issue on kernel 4.4.0-96+.
Thanks
Based on comments above, moving to "Fix Released". If this issue re-occurs, please update with a new comment.
As this ticket only relates to Xenial and 4.4 kernel, marking as "Fix Released".
|