summary refs log tree commit diff stats
path: root/results/classifier/accel-gemma3:12b/kvm/2722
blob: cdca55c212a207e727ffbd45dac714587488d6b4 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
TLB Invalidation time out on i915 SR-IOV passthrough
Description of problem:
Hello,

I tried to use SR-IOV on i915 driver freshly available on the [LTS intel kernel](https://github.com/intel/linux-intel-lts) with this [kernel version ](https://github.com/intel/linux-intel-lts/tree/lts-v6.6.34-linux-240626T131354Z) for pci passthrough purpose.
After setting up SR-IOV (kernel compilation, kernel cmdline, vfio-pci driver attribution to the new pci..)
 I've got my two new pci.

```
00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)
DeviceName: Onboard IGD

Subsystem: Hewlett-Packard Company Alder Lake-P Integrated Graphics Controller
Kernel driver in use: i915

00:02.1 VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)
Subsystem: Hewlett-Packard Company Alder Lake-P Integrated Graphics Controller
Kernel driver in use: vfio-pci

00:02.2 VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)
Subsystem: Hewlett-Packard Company Alder Lake-P Integrated Graphics Controller
Kernel driver in use: vfio-pci
```
I gave one of those pci to my VM with this qemu cmdline:
```
-cpu host,migratable=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-passthrough,hv-vendor-id=IrisXE
...
-device vfio-pci-nohotplug,host=0000:00:02.1,id=hostdev0,bus=pci.4,addr=0x0
```
Sometimes it working properly when I start the qemu cmdline but most of the time I've got those kernel errors and a GPU hang:
```
    kernel [ 2252.208134] i915 0000:00:02.0: [drm] ERROR GT0: GUC: TLB invalidation response timed out for seqno 9679
    kernel [ 2252.208134] i915 0000:00:02.0: [drm] ERROR GT0: GUC: TLB invalidation response timed out for seqno 9679
    kernel i915 0000:00:02.0: [drm] ERROR GT0: GUC: TLB invalidation response timed out for seqno 9679
    kernel i915 0000:00:02.0: [drm] ERROR GT0: GUC: TLB invalidation response timed out for seqno 9679
    ....
    kernel Fence expiration time out i915-0000:00:02.0:renderThread22381:6e0!
    kernel i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.13.1
    kernel i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
    kernel i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
    kernel i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
    kernel i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
    kernel [ 2730.991019] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dfbfff, in renderThread [22381]
    kernel [ 2730.991084] i915 0000:00:02.0: [drm] renderThread22381 context reset due to GPU hang
```
It mostly appears when Qemu is starting..

Any help would be appreciate, thanks a lot