1 files changed, 101 insertions, 0 deletions
diff --git a/results/classifier/118/all/1887854 b/results/classifier/118/all/1887854
new file mode 100644
index 000000000..ad885ab63
--- /dev/null
+++ b/results/classifier/118/all/1887854
@@ -0,0 +1,101 @@
+peripherals: 0.968
+register: 0.966
+debug: 0.959
+semantic: 0.953
+permissions: 0.952
+arm: 0.952
+mistranslation: 0.949
+architecture: 0.949
+device: 0.948
+ppc: 0.944
+socket: 0.944
+PID: 0.941
+network: 0.939
+virtual: 0.939
+graphic: 0.939
+kernel: 0.939
+risc-v: 0.935
+assembly: 0.934
+vnc: 0.933
+performance: 0.927
+user-level: 0.922
+files: 0.920
+boot: 0.919
+VMM: 0.918
+i386: 0.914
+TCG: 0.900
+hypervisor: 0.882
+x86: 0.879
+KVM: 0.864
+
+Spurious Data Abort on qemu-system-aarch64
+
+When running the RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publicly available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 0 and 1 of SCTLR_EL1). The abort information is as follows:
+Taking exception 4 [Data Abort]
+...from EL1 to EL1
+...with ESR 0x25/0x96000010
+...with FAR 0x104010ca28
+...with ELR 0x400195d8
+...to EL1 PC 0x40018200 PSTATE 0x3c5
+
+The ESR indicates that a synchronous external abort has occurred.
+ESR EC field: 0b100101
+
+From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions.
+
+ESR ISS field: 0b10000
+
+From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table.
+
+The following command line is used to invoke qemu:
+qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors
+
+This occurs on QEMU 3.1.0 as distributed via Debian and on QEMU 4.1 as built by the RTEMS source builder (4.1 plus minor patches).
+
+
+
+Writing to SCTLR can cause QEMU to flush its TLB (as an internal implementation detail), so if adding SCTLR writes is sufficient to cause the problem to go away, I would be suspicious that your guest code is missing necessary TLB maintenance instructions.
+
+QEMU 3.1 and 4.1 are quite old -- can you reproduce with 5.0 or (ideally) head-of-git?
+
+
+I would have thought that TLB considerations would not apply when the MMU is disabled (RTEMS runs in a completely flat memory space). I'll try to reproduce on more modern QEMU today. Thanks for taking a look at this.
+
+It does still crash on current QEMU. The proximate cause of the crash is that you are trying to read from an address which is way outside RAM:
+
+Trace 0: 0x7f8d50054340 [0000000000000000/00000000400195d8/0x82104000] strcmp
+ PC=00000000400195d8 X00=000000104010ca28 X01=000000004001ec28
+X02=0000000000000fe8 X03=00000000401098c8 X04=000000004010ba40
+X05=5641526f44654b00 X06=1f276f6c62717372 X07=0000000000000000
+X08=00000000ffffffda X09=00000000401097d0 X10=0101010101010101
+X11=0000000000000000 X12=0000000000000000 X13=0000000000000000
+X14=0000000000000000 X15=0000000000000000 X16=0000000040014610
+X17=0000000000000008 X18=0000000000000000 X19=000000004010b9f0
+X20=000000004001ec28 X21=000000084001ec20 X22=000000004001ec60
+X23=000000004001ec40 X24=000000004001f548 X25=000000104001ec28
+X26=000000104001ec40 X27=000000034001ec38 X28=0000000000000000
+X29=00000000401098d0 X30=0000000040008a38  SP=00000000401098d0
+PSTATE=40000005 -Z-- EL1h
+Taking exception 4 [Data Abort]
+...from EL1 to EL1
+...with ESR 0x25/0x96000010
+...with FAR 0x104010ca28
+...with ELR 0x400195d8
+...to EL1 PC 0x40018200 PSTATE 0x3c5
+
+where the insn at 0x400195d8 is (inside strcmp)
+0x400195d8:  f8408402  ldr      x2, [x0], #8
+
+You can see that x0 is 000000104010ca28, so QEMU is correct to give the data abort here. Further diagnosis would require working back through the log to find out where that address came from, which will be easier for you to do since you have the source.
+
+NB: I recommend these options for producing the logfile:
+ -D /tmp/q.log -d in_asm,int,cpu_reset,exec,cpu,guest_errors,nochain -singlestep
+Execution will be slower, but the crash here is pretty quick so that's not a problem, and these options mean that every insn executed will produce a "Trace" line and a CPU register dump. That's easier to understand and read (especially reading backwards) than logs produced when QEMU is doing its normal optimisations of chaining TBs together and putting multiple guest insns in each TB.
+
+
+Ok, thanks for rooting this out. I could swear that I checked that address several times and I clearly remember 0x4010ca28, but I don't remember ever seeing 0x10 ahead of it. I'll dig into it a bit and hopefully find the root cause in my code.
+
+An update for anyone interested: I didn't remember seeing the leading 0x10 because the values are correct when retrieved from memory. They get packed into a structure that gets returned in a single register, so the 0x10 second element ends up in the upper 4 bytes of x0 which is provided as the first argument to strcmp. strcmp doesn't appear to clear the upper bytes of x0 in ilp32 mode before using it to access memory. This issue is actually either a GCC codegen problem or a multilib selection problem in the build environment.
+
+Also of note, GDB prints the full 64-bit address when printing $w0 instead of the lower 4 bytes, but I don't think that's a QEMU bug either.
+
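The ESR decoding quoted in the report and the upper-word contamination described in the final update can both be checked by hand. The following is a minimal sketch, not part of the original thread; the helper names `decode_esr` and `split_register` are hypothetical, and the field positions assume the standard AArch64 ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with the data fault status code in ISS bits [5:0]).

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit AArch64 ESR_ELx value into its top-level fields."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length bit, bit [25]
        "iss": esr & 0x1FFFFFF,     # instruction-specific syndrome, bits [24:0]
        "dfsc": esr & 0x3F,         # data fault status code, ISS bits [5:0]
    }


def split_register(value: int) -> tuple:
    """Return (upper 32 bits, lower 32 bits) of a 64-bit register value."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


# ESR value taken from the "Taking exception 4 [Data Abort]" log lines.
fields = decode_esr(0x96000010)
print(f"EC={fields['ec']:#b} DFSC={fields['dfsc']:#b}")

# FAR / x0 value from the register dump at the faulting strcmp instruction.
upper, lower = split_register(0x104010CA28)
print(f"upper={upper:#x} lower={lower:#x}")
```

The EC and ISS values match the 0b100101 and 0b10000 fields quoted in the report, and splitting the faulting address shows the expected 32-bit pointer 0x4010ca28 in the low word with the stray 0x10 in the high word, which is what an ilp32 callee would see if it used the full x0 without clearing the upper half.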