diff options
| author | Christian Krinitsin <mail@krinitsin.com> | 2025-06-03 12:04:13 +0000 |
|---|---|---|
| committer | Christian Krinitsin <mail@krinitsin.com> | 2025-06-03 12:04:13 +0000 |
| commit | 256709d2eb3fd80d768a99964be5caa61effa2a0 (patch) | |
| tree | 05b2352fba70923126836a64b6a0de43902e976a /results/classifier/105/device/1596579 | |
| parent | 2ab14fa96a6c5484b5e4ba8337551bb8dcc79cc5 (diff) | |
| download | emulator-bug-study-256709d2eb3fd80d768a99964be5caa61effa2a0.tar.gz emulator-bug-study-256709d2eb3fd80d768a99964be5caa61effa2a0.zip | |
add new classifier result
Diffstat (limited to 'results/classifier/105/device/1596579')
| -rw-r--r-- | results/classifier/105/device/1596579 | 113 |
1 files changed, 113 insertions, 0 deletions
diff --git a/results/classifier/105/device/1596579 b/results/classifier/105/device/1596579 new file mode 100644 index 00000000..3b9642e8 --- /dev/null +++ b/results/classifier/105/device/1596579 @@ -0,0 +1,113 @@ +assembly: 0.971 +semantic: 0.965 +device: 0.964 +other: 0.963 +instruction: 0.963 +graphic: 0.959 +boot: 0.957 +socket: 0.955 +vnc: 0.943 +KVM: 0.942 +network: 0.930 +mistranslation: 0.923 + +segfault upon reboot + +[ 31.167946] VFIO - User Level meta-driver version: 0.3 +[ 34.969182] kvm: zapping shadow pages for mmio generation wraparound +[ 43.095077] vfio-pci 0000:1a:00.0: irq 50 for MSI/MSI-X +[166493.891331] perf interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 50000 +[315765.858431] qemu-kvm[1385]: segfault at 0 ip (null) sp 00007ffe5430db18 error 14 +[315782.002077] vfio-pci 0000:1a:00.0: transaction is not cleared; proceeding with reset anyway +[315782.910854] mptsas 0000:1a:00.0: Refused to change power state, currently in D3 +[315782.911236] mptbase: ioc1: Initiating bringup +[315782.911238] mptbase: ioc1: WARNING - Unexpected doorbell active! +[315842.957613] mptbase: ioc1: ERROR - Failed to come READY after reset! IocState=f0000000 +[315842.957670] mptbase: ioc1: WARNING - ResetHistory bit failed to clear! +[315842.957675] mptbase: ioc1: ERROR - Diagnostic reset FAILED! (ffffffffh) +[315842.957717] mptbase: ioc1: WARNING - NOT READY WARNING! +[315842.957720] mptbase: ioc1: ERROR - didn't initialize properly! (-1) +[315842.957890] mptsas: probe of 0000:1a:00.0 failed with error -1 + +The qemu-kvm segfault happens when I issue a reboot on the Windows VM. The card I have is: +1a:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev ff) + +I have two of these cards (bought with many years difference), exact same model, and they fail the same way. I'm using PCI passthrough on this card for access to the tape drive. +This is very easy to reproduce, so feel free to let me know what to try. +Kernel 3.10.0-327.18.2.el7.x86_64 (Centos 7.2.1511). +qemu-kvm-1.5.3-105.el7_2.4.x86_64 +Reporting it here because of the segfault, but I guess I might have to open a bug report with mptbase as well? + +Running the debuginfo qemu-kvm rpm and attaching with gdb might be interesting to get a backtrace from that segfault. Otherwise devices getting stuck in D3 often mean they didn't return from a reset correctly. Are we to assume that the VM died between the vfio-pci line and the mptbase line and the device was set to managed='yes' and therefore libvirt returned the device to the host driver? Please document your VM config and provide lspci -vvv info for the assigned device. + +The VM is fine until I issue a reboot on the guest OS, in this case it happened right at [315765.858431]. That is, once I boot the host, the guest starts fine, I am able to use the tape drive fine, but when I reboot for whatever reason, I guess the segfault. + +Is this enough or do you want the full dumpxml ? + + <hostdev mode='subsystem' type='pci' managed='yes'> + <driver name='vfio'/> + <source> + <address domain='0x0000' bus='0x1a' slot='0x00' function='0x0'/> + </source> + <alias name='hostdev0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> + </hostdev> + +Makes sense to me what you've explained about libvirt returning to the host driver, but unfortunately I don't have enough knowledge to comment, sorry. + +lspci -vvv: + +1a:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08) + Subsystem: Hewlett-Packard Company SC44Ge Host Bus Adapter + Physical Slot: 3 + Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+ + Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- + Latency: 0, Cache Line Size: 64 bytes + Interrupt: pin A routed to IRQ 27 + Region 0: I/O ports at 2000 [disabled] [size=256] + Region 1: Memory at 97a10000 (64-bit, non-prefetchable) [size=16K] + Region 3: Memory at 97a00000 (64-bit, non-prefetchable) [size=64K] + Expansion ROM at <ignored> [disabled] + Capabilities: [50] Power Management version 2 + Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) + Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- + Capabilities: [68] Express (v1) Endpoint, MSI 00 + DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us + ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset- + DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported- + RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ + MaxPayload 256 bytes, MaxReadReq 4096 bytes + DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- + LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us + ClockPM- Surprise- LLActRep- BwNot- + LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- + ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- + LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt- + Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+ + Address: 0000000000000000 Data: 0000 + Capabilities: [b0] MSI-X: Enable+ Count=1 Masked- + Vector table: BAR=1 offset=00002000 + PBA: BAR=1 offset=00003000 + Capabilities: [100 v1] Advanced Error Reporting + UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- + UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- + UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol- + CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- + CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr- + AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- + Kernel driver in use: vfio-pci + +Installed debuginfo packages and abrt caught it. The coredump is 11G but tar.bz2 everything is at 367M. Anything in specific you'd like from all the debug info generated? + + + + + +By all means, feel free to provide me instructions on how to debug this myself, so I can help others in the future, although I understand that can be more time consuming. If anyone would rather prefer talking on IRC, just let me know the network and channel. Thanks + +Can you reproduce this problem with the latest upstream version of QEMU (currently version 4.1)? Or is it only reproducible in the qemu-kvm from your distribution? (In the latter case, please report this bug to your distro instead) + +I have no way of reproducing now, should be good by now, I wouldn't worry about it, just close this, thanks. + +Ok, thanks for the update + |